This sounds like the reason benchmarks are usually run three or more times, with the first run thrown out.
The first run has to do the initial loading of data (reading files off disk into cache). The next run will not have as much disk I/O and, if the test is small enough, could actually be reading straight from memory, which is significantly faster than disk.
After a reboot, or being powered off, the cache is clear, so all data needs to be read into memory again.
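To make the "throw out the first run" idea concrete, here is a minimal sketch in Python of how such a measurement could be scripted. It is not taken from the original poster's setup: the function names `time_runs` and `summarize` and the placeholder command are my own assumptions, and the real benchmark binary would have to be substituted on the command line.

```python
import statistics
import subprocess
import sys
import time


def time_runs(command, runs=4):
    """Run `command` `runs` times; return each wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so terminal I/O does not distort the timing.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Report the first (presumably cold-cache) run separately from the rest."""
    if len(timings) < 2:
        raise ValueError("need at least two runs to discard the first")
    warm = timings[1:]
    return {
        "cold": timings[0],
        "warm_mean": statistics.mean(warm),
        "warm_min": min(warm),
        "warm_max": max(warm),
    }


if __name__ == "__main__":
    # Hypothetical default command; pass the real benchmark as arguments.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    for name, seconds in summarize(time_runs(command)).items():
        print(f"{name}: {seconds:.3f} s")
```

If the first figure is much larger than the others only right after a boot, that points to cache warm-up. If the later runs also vary widely among themselves, something else is probably going on.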
-----Original Message-----
From: fabrice piccini [mailto:firstname.lastname@example.org]
Sent: Thursday, June 30, 2005 12:15 PM
To: email@example.com
Subject: Re: [suse-amd64] Puzzling wild variation of execution time
Sheo Shanker Prasad wrote:
I will greatly appreciate help in understanding the puzzling wild
the execution time of small test/benchmark program, as explained
(1) The nature of the benchmark/test program:
A floating point arithmetic intensive atmospheric chemistry modeling
used to take, in a consistent manner, around 2m 37s on my machine when
running under SuSe 9.1 Pro. When I run the benchmark program, I
other programs. Thus, during its run, it is the only active program.
are belonging to root are sleeping. There is no other user.
(2) The machine:
Dual Opteron 250 with 4 GB of PC3200 (DDR) memory in 4 dimms that fill
dimms on cpu0 and two dimms on cpu1 installed in Tyan Thunder K8
that has American MegaTrend AMIBIOS V2-04. It now runs under SuSE 9.3
the Linux version 220.127.116.11-21.7-smp (geeko@buildhost) (gcc version
20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Jun 2 14:23:14 UTC 2005
(3) Background of the problem:
After only 4 months of the delivery of the machine by the vendor, the
board failed. The machine was sent to the vendor. He then replaced the
board with a new one and installed SuSE 9.3 Pro to bring the machine up-to-date with the Linux version 18.104.22.168-21.7-smp (geeko@buildhost)
version 3.3.5 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Jun 2
(4) The problem:
After the repair and upgrade to 9.3 Pro, the execution times of the
test/benchmark program vary widely from run to run. They are never consistent, ranging from 2m 37s to more than 6m. A pattern is seen
variation. Usually, the longest execution time is when the machine is
after remaining shut-off for a few hours. The shortest execution time
obtained after the machine has been constantly on for about 7 to 10
(5) My puzzle:
While a computer is well known to be reproducible, the wild variation
factor of almost 3 in the execution is very disturbing.
What could be the cause? Does it mean that some vital component
execution of the numerical modeling (RAM memory or the memory on the
are loose or erratic and may be about to go bad? I am just puzzled.
from you all will be really very very much appreciated.
Well, I'm not an expert on this kind of problem, but I'm still interested in problem solving... Everything you said makes me think of some kind of memory leak (the execution time stabilizing after a long period of uptime), supposing your algorithm consumes a lot of memory...
If this is true (a memory leak), the next step is to determine whether the culprit is the kernel (9.1 was 2.6.5, 9.3 is 2.6.11; kernel experts, please: are there a lot of changes in memory management between these two versions, especially involving SMP systems?) or maybe the compiler (did you compile your algorithm yourself? Did you recompile it after installing SuSE 9.3? Is it written in C, C++, Fortran, or something else?).
Maybe another possibility is a hardware/BIOS-related problem (or incorrect BIOS settings), but I don't have any idea how to diagnose a
faulty CPU... Tyan hardware experts, please, it's up to you...
Ah, I've just reread your post. You said the longest time is when the machine is started after a long shut-off... What happens if, after a long period of uptime, you reboot the machine and immediately run your algorithm?
- If you get the same good results (2'37"), then you certainly have a hardware problem (a thermal stabilization problem on some component?).
- If you get bad results (the longest time), then it's certainly a software-related problem (BIOS/kernel/compiler).
Sorry not to be able to help you more... that was only to start the discussion... now it's time for the true experts.