Interesting. Daniel Kidger sent me the statically linked STREAM executable that he used on his machine. With that, I get 2.8 GB/s, which is believable considering the slower CPUs on my box. However, I cannot get the same numbers when I compile STREAM myself. What compiler and compiler switches did you use?

Alberto

--On Tuesday, April 13, 2004 4:14 PM -0500 Kevin_Gassiot@veritasdgc.com wrote:
I just got a quad-CPU, 32 GB system in, and am running STREAM to confirm bandwidth. I only got this machine loaded this afternoon, but I am seeing numbers similar to what Daniel Kidger is seeing. After turning off node interleaving in the BIOS, I get the following numbers:
single cpu - 3170 MB/sec
4 cpus     - 12790 MB/sec
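A quick sanity check on the scaling, using only the two figures above. This is a minimal sketch; the variable names are hypothetical and chosen purely for illustration.

```python
# STREAM results quoted above, in MB/s.
single_cpu_mb_s = 3170
four_cpu_mb_s = 12790
n_cpus = 4

# Aggregate speedup relative to a single CPU.
speedup = four_cpu_mb_s / single_cpu_mb_s

# Parallel efficiency: 1.0 means bandwidth scales linearly with CPU count.
efficiency = speedup / n_cpus

# Average bandwidth each CPU sees during the 4-CPU run.
per_cpu_mb_s = four_cpu_mb_s / n_cpus

print(f"speedup:    {speedup:.2f}x")
print(f"efficiency: {efficiency:.1%}")
print(f"per CPU:    {per_cpu_mb_s:.1f} MB/s")
```

The aggregate comes out at roughly four times the single-CPU figure, which is consistent with each CPU streaming from its own local memory when node interleaving is off.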
I am running the 2.4.21-193-smp SuSE 9.0 professional kernel.
Hardware -
Quartet Motherboard
Phoenix BIOS - 09/26/2003
4 Opteron 846 cpus
32 GB memory - 16 x 2 GB DIMMs
BIOS -
Dram Bank Interleave [AUTO]
Node Memory Interleave [Disabled]
ECC [Enabled]
I will probably take the covers off tomorrow to check out the innards :)
Kevin Gassiot
Advanced Systems Group
Visualization Systems Support

Veritas DGC
10300 Town Park Dr.
Houston, Texas 77072
832-351-8978
kevin_gassiot@veritasdgc.com
ascotti@email.unc.edu
04/10/2004 01:14 PM
Please respond to ascotti@email.unc.edu

To: suse-amd64@suse.com
cc:
Subject: [suse-amd64] Bandwidth on Quad Opteron
Hi all,
I thought I had a pretty good system in terms of bandwidth (2 GB/s on a single node, using the STREAM benchmark), until Daniel Kidger, with a considerable amount of understatement, pointed out that on their system (with faster CPUs but otherwise the same memory type and motherboard), they get a "somewhat" higher value, 3.5 GB/s. A 75% increase in bandwidth is too large not to investigate the matter further. Plus, all of the codes we run on this machine are bandwidth limited, so any improvement translates into days shaved from runs. The system is as follows:
Quartet motherboard
BIOS upgraded to version PQTDX0-B (9/26/2003). The original BIOS gave about 700 MB/s on 1 CPU!
4 Opteron 840 CPUs (1.4 GHz)
SLES8 SP3, kernel 2.4.21-207-numa
8 x 1 GB 333 MHz PC2700 DIMMs (2 DIMMs per node, Infineon brand)
STREAM compiled with pgf77 -fastsse -Mvect=prefetch
The BIOS settings are
Dram Bank Interleave [AUTO]
Node Memory Interleave [Disabled]
ECC [Enabled]
Enabling Node Memory Interleave does not change the measured bandwidth significantly. I have also taken out 1 DIMM per node (which should ensure that the memory works in standard single channel mode). The bandwidth drops to 1.5 GB/s. In other words, dual channel seems to give a 35% boost. I have also tried a number of kernels (2.4.19, 2.4.24, 2.4.25, 2.6.4) with similar results. Note that the system scales up well with NUMA kernels (up to about 7.5 GB/s with 4 CPUs).
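For reference, the ratios behind these figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the measured values are the rounded numbers from this message, the names are hypothetical, and the theoretical peak assumes the nominal PC2700 rating of about 2.7 GB/s per channel, which is not something measured here.

```python
# Rounded STREAM figures quoted in this message, in GB/s.
dual_channel = 2.0     # 1 CPU, 2 DIMMs per node
single_channel = 1.5   # 1 CPU, 1 DIMM per node
reference = 3.5        # Daniel Kidger's single-CPU figure
four_cpus = 7.5        # 4 CPUs with a NUMA kernel
n_cpus = 4

# Assumption: nominal PC2700 rating per channel, two channels per node.
peak_per_channel = 2.7
peak_dual_channel = 2 * peak_per_channel


def pct_gain(new, old):
    """Percentage increase going from old to new."""
    return 100.0 * (new - old) / old


# Gap between this system and the reference system.
print(f"reference vs. this box: {pct_gain(reference, dual_channel):.0f}%")

# Gain from dual channel; the rounded inputs give about a third,
# close to the 35% quoted in the message.
print(f"dual vs. single channel: {pct_gain(dual_channel, single_channel):.0f}%")

# How close the 4-CPU run is to 4x the single-CPU figure.
print(f"scaling efficiency: {four_cpus / (n_cpus * dual_channel):.0%}")

# Fraction of the assumed nominal peak reached by one CPU.
print(f"fraction of nominal peak: {dual_channel / peak_dual_channel:.0%}")
```

Under that assumption, a single CPU reaches well under half of the nominal dual-channel peak, while the 4-CPU run stays close to linear scaling.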
My questions are:
1) Can/should I do better?
2) If there is a problem, where should I look for a solution? Is it likely to be a hardware problem (e.g. lousy DIMMs), a BIOS problem (as I mentioned earlier, the original BIOS gave me 700 MB/s), an incorrect BIOS setting (what else is out there, perhaps the SRAT table?), a kernel problem, or a compiler issue?
I'd rather have your input before pestering the vendor and Celestica.
Thank you for your time
Alberto Scotti