I have 3 AIX 6.1 machines running the INFORMIX 11.7 database engine.
One of these servers is the database server and the other 2 servers connect to it.
I am running a test to compare query execution time between these servers, and I see that at specific times one of the servers has a much longer response time.
Here is an example of the test:
You can see that the execution time on server3 is much longer than on the other 2, and I noticed that this happens in the first 5-6 minutes of each hour.
I am trying to figure out what causes this extra delay at these times but I cannot find anything. I checked the crons to see if something automatic runs at these times and causes the delay, but I don't think the answer is there.
What makes you think the answer is not in the crontabs? (The question is meant seriously: what have you done to come to that conclusion?)
You might want to start with the basic tool for all things performance: vmstat. Use
vmstat -tw 1 | tee -a /some/log/file
and analyse the log. See if there is any significant difference between the first 5-6 minutes of an hour and the rest of the time. See if there is a difference between the first 5-6 minutes of the hour on server3 and the other two servers.
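One way to make that comparison less tedious is to let awk average the samples for you. This is only a rough sketch under assumptions: it expects the timestamp (hh:mm:ss) from -t to be the last field of each sample line and us/sy/id/wa to be fields 14-17, which is the usual AIX layout but should be checked against the header in your own log. The function name summarize_vmstat is made up for this example.

```shell
#!/bin/sh
# Summarise a "vmstat -tw 1" log read from stdin: average run queue, blocked
# threads and CPU columns for samples taken in the first minutes of each hour
# ("window") versus all other samples ("rest").
# Assumption: timestamp hh:mm:ss is the last field, us/sy/id/wa are fields 14-17.
summarize_vmstat() {
    # $1 = number of minutes at the top of the hour to treat as the window
    awk -v win="${1:-6}" '
        # only sample lines end in hh:mm:ss; header lines are skipped
        $NF ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            split($NF, t, ":")
            k = (t[2] + 0 < win + 0) ? "window" : "rest"
            n[k]++
            r[k] += $1;   b[k] += $2
            us[k] += $14; sy[k] += $15; id[k] += $16; wa[k] += $17
        }
        END {
            split("window rest", keys, " ")
            for (i = 1; i <= 2; i++) {
                k = keys[i]
                if (n[k] > 0)
                    printf "%s samples=%d r=%.1f b=%.1f us=%.1f sy=%.1f id=%.1f wa=%.1f\n",
                        k, n[k], r[k]/n[k], b[k]/n[k],
                        us[k]/n[k], sy[k]/n[k], id[k]/n[k], wa[k]/n[k]
            }
        }'
}
```

Run it as `summarize_vmstat 6 < /some/log/file` on each server and compare the "window" and "rest" lines. Averages can hide short spikes, so it is still worth looking at the raw lines around the top of the hour.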
Thank you for your responses.
Well, to answer your question about the crons: the crons on server3 are scheduled for the early morning hours, and for the ones that run during working hours I did some tests.
For the crons that run in the last or first minutes of an hour, I disabled one cron at a time to see if the execution time would improve. When I saw no difference, I enabled that cron again and disabled a different one for the next hour. I ended up disabling all the crons one at a time and got no improvement.
That's why I said the answer is probably not there.
Unless you can suggest a different way to test it.
By the way, I followed your suggestion to create a vmstat report. I will let it run for a few hours and then examine it.
I let vmstat run for 2 hours and looked at the reports.
The CPU usage of the server never reaches 100%, and it does not seem to vary much between the time the delay is experienced and the time before and after it.
That was not the question. If you want to find out for yourself what information you can glean from a vmstat output, you might want to read a little treatise about the topic.
Thank you, bakunin.
The link is indeed very useful!
However, when I read the reports, server3 (the one with the delay problem) seems healthy: no paging in/out, no blocked threads, and low wait.
One of the other servers shows some blocked threads from time to time, but that server has no problem with SQL delay.
I searched your first post (but it's Friday, so forgive me if I missed it...) looking everywhere for the configurations of the 3 servers and found nothing...
Saying they are 3 servers running AIX 6.1 only tells us they have the same OS... they can be quite different in size, resources, processor speed etc... and they can also be on different networks...
We could even imagine 2 of them being LPARs on the same physical machine (doing nothing...) and the 3rd an LPAR on a machine that periodically works hard...
...
Last but not least, some apps can run periodically without using crontabs, haha.
I am thinking of something like ControlM or even TSM, since you are on AIX and chances are the backups are done by Tivoli...
my 2 cents in addition to Bakunin's
With what you have given us so far it can only be free speculation; you need to gather far more information before any serious interpretation can start.
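To catch something that runs periodically outside of cron, one simple approach is to compare process lists: take one snapshot during the slow minutes and one during a normal period, then see what only shows up in the first. A minimal sketch, assuming the snapshots were saved with something like `ps -eo args > /tmp/ps_slow.txt` and `ps -eo args > /tmp/ps_normal.txt` (the file names and the function name new_commands are just examples):

```shell
#!/bin/sh
# Print the command lines that appear in the first snapshot file
# but not in the second one.
# $1 = snapshot taken during the slow window, $2 = snapshot from a normal period
new_commands() {
    a="${TMPDIR:-/tmp}/newcmd.$$.a"
    b="${TMPDIR:-/tmp}/newcmd.$$.b"
    # comm needs sorted input; -u also collapses duplicate command lines
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    # -23 suppresses lines unique to file 2 and lines common to both
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Called as `new_commands /tmp/ps_slow.txt /tmp/ps_normal.txt`, it lists candidates to look at. Short-lived processes can slip between two snapshots, so taking several during the slow window gives a better picture.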
Note also that another difference between the 3 servers is that the first 2 are physical machines while server3 is virtual (with a VIO server in the middle).
As for the comment about the backup, no backup is running at the times when I collect the info. Tivoli is disabled, as another backup application is used for backups.
I hope this helps more. If you need additional info please feel free to ask.
omonoiatis9, nobody is able to answer that. dukessd has voiced a conjecture (read: educated guess) and maybe he is right, maybe not. As long as you do not publish any data (like the data I asked for a week ago) nobody will be able to really help you. This is not because dukessd (or I, for that matter) is unable or unwilling to help you, but because we do not know your systems and there are literally hundreds of possible causes.
dukessd himself has called what he said a guess - a good one and certainly based on his considerable experience, but a guess nevertheless. The situation is like you calling a doctor via telephone and asking him: "my left side hurts, tell me what it is."
Show us some data, then we can (maybe) tell you what the problem is. But so far you have only told us some generalities and therefore you get only generalities back.
Hello bakunin,
What data are you referring to? I asked whether you want any more info so that you can tell me exactly what you want, and I will give it to you.
If you are referring to the vmstat reports, I attached 2 reports for comparison.
Note that the file vmstat_report_rea.txt refers to server3 and the file vmstat_report_zeus.txt refers to server1.
If you need more data, please be specific about what you need.