On Thu, 8 Apr 2004 12:05:10 +0100
Dan Kidger
This is because in most cases there is an application running on all CPUs (certainly for us scientific users). Each application can typically saturate the local memory bus. If applications on the other CPUs are also memory-bandwidth hungry, then all applications slow each other down. This is of course the bane of all those dual-Xeon HPC servers, where one copy of a finite element program takes say X minutes, but with two running simultaneously (one per CPU) each now takes 1.4 X minutes to run. Under a NUMA kernel on Opteron they do not slow each other down at all.
The local policy used by the NUMA kernel is not always optimal though - that is why it is sometimes faster to configure node cache line interleaving in the BIOS. This happens for example when the working set of all CPUs exceeds a single node and the workload prefers bandwidth over latency. Or when you only have a program running on a single CPU, but it needs all the bandwidth it can get; then interleaving is the best policy, because you combine the bandwidth of all available memory controllers. For most workloads local affinity seems to be pretty good though, so it's a good default.

The 9.1/SLES9 kernel will have a new NUMA API that allows NUMA policies [local affinity, binding to a specific node, page interleaving] to be configured fine-grained per process and per memory mapping, without rebooting.

-Andi
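For illustration, here is a minimal sketch of what per-mapping policy control looks like through the mbind() call of that API; the node mask and buffer size are made-up values, and you need the libnuma headers (numaif.h) and -lnuma to build it:

    /* Sketch: interleave one mapping across nodes 0 and 1, leave the
     * rest of the process on the default local-affinity policy. */
    #define _GNU_SOURCE
    #include <numaif.h>      /* mbind(), MPOL_* constants */
    #include <sys/mman.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        size_t len = 64 * 1024 * 1024;          /* 64 MB scratch buffer */
        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        /* Bits 0 and 1 set = nodes 0 and 1.  Interleaving trades some
         * latency for the combined bandwidth of both memory controllers. */
        unsigned long nodemask = 0x3;
        if (mbind(buf, len, MPOL_INTERLEAVE, &nodemask,
                  sizeof(nodemask) * 8, 0) != 0)
            perror("mbind");

        /* Pages are actually placed when they are first touched, so the
         * policy must be set before the buffer is used. */
        memset(buf, 0, len);
        munmap(buf, len);
        return 0;
    }

A process-wide policy can be set the same way with set_mempolicy(), and the numactl command wraps the same calls for unmodified programs.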