David Haller wrote:
> Hello,
>
> On Tue, 12 Jan 2010, Per Jessen wrote:
>> Just a follow-up on my earlier question about which assembler to
>> choose. I've opted for nasm. I still don't quite like the simplicity
>> idea, but I think I can work around the most annoying "features" with
>> a set of my own macros.
>>
>> However, I've just completed some initial testing, and here is why
>> you need assembler:
>>
>> I'm processing a dataset of 262176 bytes. Not a lot, but still about
>> 2 million bits. The computation is 99.9% CPU-bound.
>>
>> In my first version, I used a div instruction in the code, and the
>> computation took 151 minutes (wall time). In the second, I removed
>> the div, and probably one or two other instructions as well. The new
>> computation took 62 minutes.
>
> ... and replaced it with what? And have you compared to C code
> compiled with gcc or icc? ;)
The div was a divide by 10, followed by examination of the remainder and quotient. Under the circumstances, the number being divided would always be in the range 0-19, which meant I could easily substitute a subtract-10 and a check of the result. The div was effectively replaced by three other instructions, and I think I managed to do away with one more instruction along the way (I really only looked at the timing).

As for gcc code, I did start out with C code for a quick prototype. I also had a quick look at what was generated - it would have been horrendously slow in comparison. After all, neither gcc nor icc could have been aware of the special circumstances that I knew about. Hand-optimized assembler will _always_ beat the compiler, but the trade-off is speed vs. maintainability. The latter is not a concern for me in this project.

/Per Jessen, Zürich

-- 
To unsubscribe, e-mail: opensuse-programming+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-programming+help@opensuse.org