Re: [suse-amd64] Speed comparisons... amd64 vs em64t?
Kees Hoekzema wrote:
The problem with this test, and Anand's test too, is that for the MySQL benchmark they use MySQL's own benchmark. That benchmark does a shitload of inserts and selects, but not the kind you would use in a web page, for instance. I'm going to run them too, and will probably get very different results with my own benchmark based on our website.
Last time I ran my own benchmark I tested a 32-bit vs. a 64-bit mysql binary. MySQL's own benchmark gave the 64-bit version quite a lead, but in my testing the 32-bit binary was roughly 20% faster than the 64-bit one. I've tested it a couple of times, and every time the 32-bit binary was in the lead by up to 20%. Since then I've lost my belief in MySQL's benchmarks a bit ;).
The only benchmarks that matter to you are your own. Aside from that, my benchmarks between 32 and 64 bit systems have shown that I get the best results on each system using code built for that specific architecture. I have 32 bit apps that were simply dragged and dropped onto the 64 bit systems with some performance improvements. But the big improvements came when we recompiled our apps using new versions of GCC that supported the 64 bit architecture of the Opterons. For our numerically intensive applications, we saw over 70% improvement over 32 bit systems running at the same clock speed.
But there was another cost that affected the throughput. The 64 bit compiled apps use more memory, so memory that was just adequate for the 32 bit apps was not enough for the 64 bit apps, and the resulting swapping made the apps run very slowly, as expected.
I think that if you can rebuild your mysql binaries from source on the 64 bit system using the latest GCC compilers, you can gain huge increases in performance as well.
eyc
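The extra memory use described above comes largely from pointers (and C longs) growing from 4 to 8 bytes on x86-64. A minimal sketch of the effect, assuming a hypothetical doubly linked list node that holds two pointers and a 4-byte integer. The `node_size` helper and the node layout are illustrative only and are not taken from any application in this thread:

```python
import struct


def node_size(pointer_bytes, payload_bytes=4):
    """Size of a hypothetical C struct {node *prev; node *next; int value;}.

    Assumes the struct is padded up to a multiple of the pointer size,
    which is how common C ABIs lay out such a struct.
    """
    raw = 2 * pointer_bytes + payload_bytes
    align = pointer_bytes
    return (raw + align - 1) // align * align


# Pointer width of the interpreter running this script (4 or 8 bytes).
native_pointer = struct.calcsize("P")

size32 = node_size(4)  # 32-bit build: 4-byte pointers
size64 = node_size(8)  # 64-bit build: 8-byte pointers

print(f"native pointer size here: {native_pointer} bytes")
print(f"node size with 4-byte pointers: {size32} bytes")
print(f"node size with 8-byte pointers: {size64} bytes")
print(f"growth: {100.0 * (size64 / size32 - 1):.0f}%")
```

Pointer-heavy data structures grow the most; arrays of plain floats or ints do not grow at all, so how much more memory a given application needs depends on its mix of data.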
I think that if you can rebuild your mysql binaries from source on the 64 bit system using the latest GCC compilers, you can gain huge increases in performance as well.
The binaries shipped in x86-64 Linux distributions should be already pretty well optimized for K8. It's unlikely you can get much improvement from rebuilding. The 3.3-hammer compiler in SUSE and gcc 3.4 are also not that different in terms of generated code for K8. A lot of the K8 improvements in 3.4 were first done for 3.3-hammer and then forward ported to 3.4.
On EM64T Xeon a rebuild with -mcpu=nocona may help, but probably not very much. gcc is not very good at generating code for Intel's P4 core, even with appropriate flags.
If you really want to optimize for your workload, I would suggest rebuilding with profile feedback for your workload. This should generate even better code.
-Andi
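A rebuild with profile feedback is a three-step cycle: build an instrumented binary, run it on a representative workload, then rebuild using the collected profile. A rough sketch that only prints the command lines, assuming the `-fprofile-generate` / `-fprofile-use` spellings used by current GCC releases (older compilers may use different flags, so check the manual for the compiler you actually have). The `pgo_commands` helper, `app.c`, and `./app --typical-input` are hypothetical placeholders:

```python
import shlex


def pgo_commands(sources, output, arch_flags, workload):
    """Return the three command lines of a profile-feedback build.

    sources    -- list of C source files (placeholder names)
    output     -- name of the binary to produce
    arch_flags -- CPU-specific flags, e.g. ["-march=k8"]
    workload   -- command that exercises the instrumented binary
    """
    common = ["gcc", "-O2", *arch_flags]
    return [
        # 1. Build an instrumented binary that records profile data.
        common + ["-fprofile-generate", "-o", output, *sources],
        # 2. Run it on a workload that resembles production use.
        list(workload),
        # 3. Rebuild, letting the compiler use the recorded profile.
        common + ["-fprofile-use", "-o", output, *sources],
    ]


commands = pgo_commands(
    sources=["app.c"],
    output="app",
    arch_flags=["-march=k8"],
    workload=["./app", "--typical-input"],
)
for command in commands:
    print(shlex.join(command))
```

For a larger project the same flags would normally go into CFLAGS for the project's build system rather than onto a single gcc command line. The profile is only as good as the workload in step 2, so it should be the one you actually care about.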
On 30 Sep, Andi Kleen wrote:
The binaries shipped in x86-64 Linux distributions should be already pretty well optimized for K8. It's unlikely you can get much improvement
They're not... just look at my simple openssl comparisons: http://www.miguelito.org/openssl.
The good news (for AMD and fans.. :) ) is that amd chips get a huge boost at recompile. Em64t gets only a little.
My guess is the stuff was compiled with the most generic x86_64 support, much like many distros will build some things for i386 (or i586 mostly these days I guess) instead of optimized more.. in order to support the most people OOTB.
--
Mike Marion-Unix SysAdmin/Staff Engineer-http://www.qualcomm.com
A polar bear is a rectangular bear after a coordinate transform.
On Thu, Sep 30, 2004 at 06:39:28PM -0700, mmarion@qualcomm.com wrote:
On 30 Sep, Andi Kleen wrote:
The binaries shipped in x86-64 Linux distributions should be already pretty well optimized for K8. It's unlikely you can get much improvement
They're not... just look at my simple openssl comparisons: http://www.miguelito.org/openssl.
openssl may be a special case because it uses custom assembler optimizations. I haven't checked, but I remember at some point there were quite a few subtle bugs in the assembler functions, so it's possible that the SUSE rpm takes a conservative but safe approach.
The good news (for AMD and fans.. :) ) is that amd chips get a huge boost at recompile. Em64t gets only a little.
My guess is the stuff was compiled with the most generic x86_64 support, much like many distros will build some things for i386 (or i586 mostly these days I guess) instead of optimized more.. in order to support the most people OOTB.
The K8 is the i386 equivalent on x86-64. Most generic support is currently K8 with 3dnow! off.
-Andi
On 1 Oct, Andi Kleen wrote:
openssl may be a special case because it uses custom assembler optimizations. I haven't checked, but I remember at some point
heh.. and that's why I really wish we had the manpower to do benchmarks with the EDA apps our engineers use. We've tried numerous times either to get them to run tests (they don't have time) or to get them to set up a test tree for us (we had one once.. but it relied on project data that eventually moved, which broke it). I'm one of those that hates synthetic benchmarks.. real world programs that we use are what I want to see. I should do some testing with lame and other things I use too.
--
Mike Marion-Unix SysAdmin/Staff Engineer-http://www.qualcomm.com
"The Tuxomatic 2200(TM) with patented Gates-Be-Gone(TM) gets rid of blue screens in a flash! It forks! It blits! Look at those fantastic pixels! It surfs the web! You could even host an ISP with it!" -- Matthew Sachs on Slashdot
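Timing real programs instead of synthetic benchmarks does not need much tooling. A minimal sketch, assuming the program can be driven from the command line with a fixed input. `time_command` and `percent_faster` are hypothetical helpers, and the demo command is only a stand-in for whatever application you actually use:

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def percent_faster(baseline_seconds, candidate_seconds):
    """Throughput gain of the candidate over the baseline, in percent."""
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0


if __name__ == "__main__":
    # Stand-in workload: replace with the real 32-bit and 64-bit binaries
    # (or distro build vs. rebuilt binary) and a realistic input.
    demo = [sys.executable, "-c", "sum(range(200000))"]
    baseline = time_command(demo)
    candidate = time_command(demo)
    print(f"baseline:  {baseline:.4f} s")
    print(f"candidate: {candidate:.4f} s")
    print(f"candidate is {percent_faster(baseline, candidate):+.1f}% faster")
```

The median is used rather than the mean so that one slow run (cold cache, swapping) does not dominate the result. Runs that differ by only a few percent are within noise for a harness this simple.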
participants (3)
- Andi Kleen
- chu@tes-mail.jpl.nasa.gov
- mmarion@qualcomm.com