Hi,
Using the attached program (mcopy_time.c) we noticed a considerable
performance difference between SLES 8 and SLES 9. A 2.2 GHz Opteron
would give the following results (both times executed on the same machine):
*) Compiled against SLES 8 glibc:
$ ./mcopy_time 2200 1000 1048576
Memory to memory copy rate = 2098.644531 MB / sec. Block size = 1048576
*) Compiled against SLES 9 glibc:
$ ./mcopy_time 2200 1000 1048576
Memory to memory copy rate = 1179.235596 MB / sec. Block size = 1048576
I found that the problem is caused by the AMD-specific patches contained
in the source RPMs. If I take x86-64-opt-mem.diff from glibc-2.2.5-233
(SLES 8), apply it against glibc-2.3.3-98.38 (SLES 9), and at the same
time do *NOT* apply glibc-2.3.3-amd64-string.diff (originally contained
in glibc-2.3.3-98.38), performance under SLES 9 is identical to SLES 8
(mcopy_time yields 2100 MB/sec. under SLES 9 with my patched glibc).
I had to modify x86-64-opt-mem.diff a little so it would apply correctly
against glibc-2.3.3.
I also found that simply not applying glibc-2.3.3-amd64-string.diff
increases performance slightly (without applying any older SLES 8
patches), to about 1550 MB/sec.
Has anybody else witnessed this performance drop with SLES 9? Can anybody
see a problem with my solution? Everything seems to be working fine, but
I'd like to make sure I didn't miss anything.
Attached you can find the patches. I split the patch into 2 files for
now, because it was easier to create them that way. They are both based
on x86-64-opt-mem.diff (from SLES 8), modified to apply against glibc
2.3.3. x86_64-string-new.diff adds new files that do not exist in glibc
2.3.3, x86_64-string-modified.diff modifies already existing files.
Thanks and best regards,
-Markus
// Measure how fast we can copy memory
#include
On Mon, Mar 21, 2005 at 09:25:13AM -0800, Markus Mayer wrote:
> Hi,
> Using the attached program (mcopy_time.c) we noticed a considerable
> performance difference between SLES 8 and SLES 9. A 2.2 GHz Opteron
> would give the following results (both times executed on the same machine):
> [...]
Thanks for the report. It's a known problem caused by a regression in
the new amd string diff.
-Andi
On Tue, Mar 22, 2005 at 09:52:13AM -0800, Markus Mayer wrote:
> Hi Andi,
> > Thanks for the report. It's a known problem caused by a regression in
> > the new amd string diff.
> Thanks for your reply.
> Out of curiosity: is SuSE working on an updated glibc for SLES 9 that
> has this problem fixed?
The next service pack of SLES 9 should have it fixed. The root cause was a
mismatched access in an assembler file, btw: it accessed a 32-bit variable
as 64-bit, which caused random performance changes depending on the state
of the variable after it in memory.
-Andi