Re: [suse-programming-e] Programming standards!

18 Apr 2005

      On Monday 18 April 2005 5:48 am, Colin Carter wrote:
> ...
> Thus, copying the string equivalent was much faster than copying an
> array of integers.
> I have done no Assembler since M$ Windows 95.
> Does anybody know enough about modern cpu's to answer my question?

There are many different CPUs today, but let's look at some of the RISC and 
64-bit CPUs.
Most of these CPUs are much more efficient when copying data aligned on a 
natural boundary:
For instance, a 32-bit quantity aligns on a 32-bit boundary, a 64-bit 
quantity on a 64-bit boundary.
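
As a rough illustration of what "natural boundary" means, here is a minimal 
C sketch that tests whether an address is a multiple of a given size. The 
function name is_aligned is my own invention for this example, not 
something from a library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p sits on a multiple of `alignment` bytes, else 0.
   `alignment` must be a power of two: 4 for a 32-bit quantity,
   8 for a 64-bit quantity. (Hypothetical helper for illustration.) */
int is_aligned(const void *p, size_t alignment)
{
    /* A power of two has exactly one bit set. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* An aligned address has all of its low-order bits clear. */
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}
```

An address that passes is_aligned(p, 8) also passes is_aligned(p, 4), which 
is why data aligned for 64-bit access is also fine for 32-bit access.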
Additionally, modern CPUs use techniques such as pipelining and caching. 
Nearly all CPUs today do have byte and word (16-bit) instructions, but 
copying byte by byte is slow:
   1. load byte into register
   2. test value, branch on 0
   3. store byte
   4. goto 1.
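
The four steps above can be sketched in C roughly as follows. This is only 
an illustration of the loop, not how any particular library implements 
strcpy; the name copy_bytes is made up for the example, and the caller is 
assumed to supply a destination large enough for the string and its 
terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Copies the NUL-terminated string at src into dst one byte at a time.
   Assumes dst has room for the string plus its terminator.
   Returns the number of bytes copied, not counting the terminator.
   (Hypothetical helper for illustration.) */
size_t copy_bytes(char *dst, const char *src)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);
    for (;;) {
        char c = src[i];   /* 1. load byte into register */
        if (c == '\0')     /* 2. test value, branch on 0 */
            break;
        dst[i] = c;        /* 3. store byte              */
        i++;               /* 4. go back to step 1       */
    }
    dst[i] = '\0';         /* store the terminator as well */
    return i;
}
```

Every byte costs one load, one test-and-branch, and one store, so a 64-byte 
string goes around the loop 64 times.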
Another technique is speculation. Some operations will be performed before 
the outcome of a branch is known, and invalidated later if the guess turns 
out to be wrong.

One of the ways a string copy is made more efficient is to load a register 
full of bytes (32-bit or 64-bit) and handle them in a single operation 
instead of one byte at a time. Well-written, highly optimized libraries 
will take advantage of the CPU in this way. 
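
Here is a minimal sketch of the register-full-of-bytes idea in C, assuming 
64-bit words. It is not taken from any real library. The names 
has_zero_byte and copy_words are made up, and the src_cap parameter is an 
assumption of mine: the caller promises that src_cap bytes of src are 
readable, that the terminator lies within them, and that dst is at least 
as large. The zero-byte test is a well-known bit trick. memcpy is used for 
the 8-byte loads and stores so the code stays valid C even when the 
pointers are not aligned; compilers typically turn each call into a single 
load or store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the 8 bytes in w is zero. */
int has_zero_byte(uint64_t w)
{
    return ((w - UINT64_C(0x0101010101010101)) & ~w &
            UINT64_C(0x8080808080808080)) != 0;
}

/* Copies the NUL-terminated string at src into dst, 8 bytes per step
   where possible. Assumed contract (hypothetical): src_cap bytes of src
   are readable and contain the terminator, and dst holds at least
   src_cap bytes. Returns the string length. */
size_t copy_words(char *dst, const char *src, size_t src_cap)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);

    /* Whole words: stop at the first word that holds the terminator. */
    while (i + sizeof(uint64_t) <= src_cap) {
        uint64_t w;
        memcpy(&w, src + i, sizeof w);    /* one 64-bit load  */
        if (has_zero_byte(w))
            break;
        memcpy(dst + i, &w, sizeof w);    /* one 64-bit store */
        i += sizeof w;
    }

    /* Tail: finish the last few bytes one at a time. */
    while (i < src_cap && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    assert(i < src_cap);                  /* contract: terminator inside src */
    dst[i] = '\0';
    return i;
}
```

A 64-byte string now needs about 8 trips around the main loop rather than 
64. Real library routines usually go further, for example by first copying 
single bytes until the source pointer is aligned.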
I happen to be most familiar with Digital's (now HP's) Alpha chip. The early 
Alphas did not have byte and word instructions. 

The most popular chips today are the x86 series. The newer x86-64 chips 
have 16 64-bit general-purpose registers, but 32-bit (legacy) code will 
only use 8. In addition, there are 6 segment registers (3 for 64-bit code). 

In any case, accessing properly aligned data is much faster than accessing 
unaligned data. 
-- 
Jerry Feldman 
Partner Technology Access Center (contractor) (PTAC-MA)
Hewlett-Packard Co.
550 King Street LKG2a-X2
Littleton, Ma. 01460
 (978)506-5243