CPU count and parallelism

Last week we had a weird issue: all of a sudden a batch job that runs monthly took far too long to finish. In fact it never finished, because we decided to stop it. A first investigation showed that the execution plan of the query had changed. That is normal and happens from time to time; it is what the CBO does, for good and sometimes, as in this case, for bad. The new plan used parallelism, which in our case wasn't beneficial at all.
The problem was that manually re-executing the query didn't show the messed-up plan; it took the right plan!
Sometimes a good plan, sometimes a bad plan... so I couldn't set the 10053 event, because the issue wasn't reproducible. For two days I tried to reproduce it but couldn't. Then I tried again with some random values while the load was higher, and now I could reproduce it consistently.
By comparing the CBO traces I saw that cpu_count was different. A couple of weeks earlier the sysadmin had implemented Solaris CPU pools to assign CPU resources dynamically, sometimes more, sometimes less...
This caused the optimizer to pick the wrong plan when more than 11 CPUs were assigned... Case closed ;-)
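
For anyone who wants to do the same kind of comparison, here is a rough sketch of diffing the parameter lines of two CBO trace files. It is not the script I used. It assumes the parameters appear as simple `name = value` lines, and the names `parse_params` and `diff_params` as well as the sample values are made up for illustration (the only number taken from the story is that the bad plan showed up above 11 CPUs).

```python
import re

# Assumption: optimizer parameters show up in the trace as "name = value" lines,
# for example "cpu_count = 8". Lines with any other shape are ignored.
PARAM_LINE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(\S.*?)\s*$")


def parse_params(trace_text):
    """Collect name -> value pairs from lines that look like 'name = value'."""
    params = {}
    for line in trace_text.splitlines():
        match = PARAM_LINE.match(line)
        if match:
            params[match.group(1).lower()] = match.group(2)
    return params


def diff_params(good_text, bad_text):
    """Return {name: (good_value, bad_value)} for parameters that differ."""
    good = parse_params(good_text)
    bad = parse_params(bad_text)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(set(good) | set(bad))
        if good.get(name) != bad.get(name)
    }


if __name__ == "__main__":
    # Illustrative snippets only, not real trace output.
    good_trace = """
    optimizer_mode = all_rows
    cpu_count = 8
    """
    bad_trace = """
    optimizer_mode = all_rows
    cpu_count = 12
    """
    for name, (good_value, bad_value) in diff_params(good_trace, bad_trace).items():
        print(f"{name}: good={good_value} bad={bad_value}")
```

In practice you would read the two trace files from disk and pass their contents to `diff_params`. A parameter that appears in only one of the traces is reported with `None` on the other side.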