When does it make sense to use loop fission (also called loop distribution) if I am compiling for a single-core processor?
I got wonderful answers at comp.compilers.
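
For anyone unfamiliar with the transformation being asked about, here is a minimal sketch in C of what loop fission does. It is only an illustration, not taken from the answers mentioned above. The function names `fused` and `fissioned` and the array parameters are hypothetical, and the sketch assumes the arrays do not overlap, so the two statements in the loop body are independent of each other.

```c
#include <stddef.h>

/* Original (fused) form: one loop body holds two independent statements.
 * Assumption for this sketch: a, b, x and y do not alias. */
void fused(size_t n, const double *a, const double *b,
           double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        x[i] = a[i] * 2.0;  /* S1: reads a, writes x */
        y[i] = b[i] + 1.0;  /* S2: reads b, writes y; does not depend on S1 */
    }
}

/* After loop fission: each statement gets its own loop.
 * This is legal here because there is no dependence between S1 and S2. */
void fissioned(size_t n, const double *a, const double *b,
               double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        x[i] = a[i] * 2.0;  /* S1 only: this loop touches just a and x */
    }
    for (size_t i = 0; i < n; i++) {
        y[i] = b[i] + 1.0;  /* S2 only: this loop touches just b and y */
    }
}
```

Both functions produce the same contents in `x` and `y`. The difference is that each loop in the fissioned version works on fewer arrays at a time, at the cost of running the loop control twice.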