I knew that despite saying 'let's put aside wait events...' some of you wouldn’t be able to resist ;-) I bought the book ages ago and have got the T-shirt, thanks.
It’s a badly written job for sure, but please, let's not focus on that, as it's not the question here:
Q1. Is there any way of measuring the impact of this on the job?
Q2. I am after experiences of changing the default priority for an AIX 5L database with (or without) this kind of mixed workload. Good or bad decision?
My stats tell me
- that I'm waiting for very little other than CPU (60%) and disk (40%)
- that my production disk reads are 15% slower.
- that disk wait numbers are similar on test vs prod
- that I am burning far more CPU on the production system - I'm talking 30x more. Note this is cumulative - the job does millions of executes of each SQL (please don’t focus on this, I am aware). tkprof shows a query that uses 100s of CPU on test but 3000s on prod.
- overhead due to undo resolution is similar on both systems
The CPU difference is perplexing me, and I was specifically interested in whether this finding on lowered priority was a clue. My concern is that the 10gR2 CPU statistics on AIX may include time spent waiting for CPU, and I wanted to know how to discern that.
Now it turns out that Solaris 10gR2 has OS_CPU_WAIT_TIME in v$osstat, and 'OS Wait-cpu (latency) time' in v$sesstat.
I'm offsite and have only captured v$mystat output for the job - that statistic is not available in that view on AIX.
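Where no OS-level CPU-wait statistic is exposed, one rough host-side check is to compare wall-clock time against CPU time for a section that does nothing but compute: the gap is time the process spent runnable but not on a CPU. The following is only a minimal sketch of that idea in Python. The names `cpu_wait_estimate` and `spin` are made up for illustration, and it says nothing about how Oracle itself accounts CPU on AIX.

```python
import time


def cpu_wait_estimate(work, *args):
    """Run a CPU-bound callable; return (wall, cpu, wall - cpu) in seconds.

    Hypothetical helper: for purely CPU-bound work, the wall/cpu gap
    approximates time spent waiting for a CPU (run queue, preemption).
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    # Clamp at zero: timer resolution can make cpu marginally exceed wall.
    return wall, cpu, max(wall - cpu, 0.0)


def spin(n):
    """Illustrative CPU-bound loop with no I/O and no sleeps."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    wall, cpu, waited = cpu_wait_estimate(spin, 2_000_000)
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s not-on-cpu={waited:.3f}s")
```

Run at the same time as the batch job on each box, a much larger gap on production than on test would point towards CPU starvation rather than extra work being done.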
I was just after info as to whether anyone else has played with this setting, and it appears the answer is generally that it is not required.
Turning off SMT is an idea that is on the list.
From: Anand Rao
Sent: 06 December 2006 03:49
Subject: Re: CPU priority for AIX 5.3 with mixed workload
I get a feeling that you might be looking in the wrong direction. If the batch job is not running fast, then you need to look at what the job is doing in the first place. as you mentioned, tracing is a good idea. so, you've got the answer yourself!
we use a couple of p590 as well as p595 machines with SMT enabled and they are blazing fast. but, i only have 64 cores on the p595 :)
i have tried bumping the priority of Oracle shadow as well as background processes (LGWR, DBWR, etc.) with mixed luck, depending also on whether RAC is used or not.
If you are CPU starved, then you might be better off looking at who is chewing up the CPU and why, instead of raising the priority, etc. look at profiling your batch job. You mentioned that the batch job completed in 2.5 hours on an identical test system?
was everything identical? the disk setup, Oracle, etc.? you need some detailed data from the test system to compare with your problematic run. so, statistics collection and tracing are where you might find some clues.
AIX and Tru64 do a fabulous job with process scheduling and priority handling. I never had to touch anything even on the busiest of machines.
hope that helps.
On 06/12/06, Dennis Williams <email@example.com> wrote:
Also look at adjusting the nice value.
You should also consider what you will be taking CPU time away from. Usually modern Unix systems do a great job of balancing the requirements of all the users simultaneously. When you start forcing the system to your will, you may improve the performance of your process, but response could get VERY bad for other users. You may also make things bad for the other users and only help your process slightly.
Consider reading Optimizing Oracle Performance by Cary Millsap and Jeff Holt.
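
As an illustration of the nice-value suggestion above, here is a minimal sketch of launching a child process at a lower priority from Python on a POSIX system. The helper name `run_with_nice` is hypothetical, and nothing here is specific to Oracle or AIX. Raising the nice value (lowering priority) needs no special privilege, whereas lowering it normally requires root.

```python
import os
import subprocess
import sys


def run_with_nice(cmd, increment):
    """Run cmd as a child whose nice value is raised by `increment`.

    Hypothetical helper; POSIX only. A positive increment lowers the
    child's scheduling priority relative to the parent.
    """
    return subprocess.run(
        cmd,
        preexec_fn=lambda: os.nice(increment),  # applied in the child before exec
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # The child simply reports its own nice value; os.nice(0) reads it unchanged.
    report = [sys.executable, "-c", "import os; print(os.nice(0))"]
    print("parent nice:", os.nice(0))
    print("child nice: ", run_with_nice(report, 5).stdout.strip())
```

The same trade-off applies as described above: whatever priority one process gains or gives up comes from, or goes to, everything else on the box.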