On Monday 18 April 2005 5:48 am, Colin Carter wrote:
> Thus, copying the string equivalent was much faster than copying an array of integers.
> I have done no Assembler since M$ Windows 95. Does anybody know enough about modern cpu's to answer my question?

There are many different CPUs today, but let's look at some of the RISC and
64-bit CPUs. Most of these CPUs are much more efficient when copying data
aligned on a natural boundary: for instance, a 32-bit quantity aligns on a
32-bit boundary, and a 64-bit quantity on a 64-bit boundary. Additionally,
virtually all modern CPUs use techniques such as pipelining and caching.

Nearly all CPUs today do have byte and word (16-bit) instructions, but
copying byte by byte is slow:
  1. load byte into register
  2. test value, branch on 0
  3. store byte
  4. goto 1
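
The four steps listed above correspond roughly to the C loop below. This is
only a sketch: byte_copy is a made-up name, and a compiler is free to
generate something quite different from the step-by-step comments.

```c
/* Hypothetical helper: copy a NUL-terminated string one byte at a time.
 * dst must have room for the string and its terminator. */
static char *byte_copy(char *dst, const char *src)
{
    char *d = dst;
    for (;;) {
        char c = *src++;      /* 1. load byte into register */
        if (c == '\0')        /* 2. test value, branch on 0 */
            break;
        *d++ = c;             /* 3. store byte */
    }                         /* 4. goto 1 */
    *d = '\0';                /* terminate the copy */
    return dst;
}
```

Every byte costs a load, a compare, a branch, and a store, which is why
this loop is the slow baseline.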
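Another technique is speculation. The CPU guesses which way a branch will go
and starts executing the instructions past it before the outcome is known.
If the guess turns out to be wrong, those results are invalidated later.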
One of the ways a string copy is made more efficient is to load a register
full of bytes (32 or 64 bits) at a time instead of a single byte. Well-written,
highly optimized libraries will take advantage of the CPU in this way.
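
As a rough illustration of the register-at-a-time idea, here is a sketch in
C. It is not how any particular library does it. The names word_copy and
has_zero_byte and the src_cap parameter are assumptions made for the
example; src_cap is the size of the buffer holding the source, so the loop
never reads past it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the eight bytes in w is zero (a well-known bit trick). */
static int has_zero_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/*
 * Hypothetical helper: copy a NUL-terminated string eight bytes at a time.
 * src must be NUL-terminated within its src_cap bytes, and dst must have
 * room for the string and its terminator.
 */
static char *word_copy(char *dst, const char *src, size_t src_cap)
{
    size_t i = 0;
    while (i + sizeof(uint64_t) <= src_cap) {
        uint64_t w;
        memcpy(&w, src + i, sizeof w);   /* typically one 64-bit load */
        if (has_zero_byte(w))
            break;                       /* the terminator is in this word */
        memcpy(dst + i, &w, sizeof w);   /* typically one 64-bit store */
        i += sizeof w;
    }
    /* Finish the last few bytes, including the terminator, one at a time. */
    while ((dst[i] = src[i]) != '\0')
        i++;
    return dst;
}
```

The fixed-size memcpy calls are there to keep the C well defined; an
optimizing compiler will usually turn each one into a single load or store.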
I happen to be most familiar with Digital's (now HP's) Alpha chip. The early
Alphas did not have byte and word instructions.
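
Without byte instructions, a byte has to be picked out of (or merged into)
a full register with shifts and masks. The sketch below shows the idea in
C rather than Alpha assembler; extract_byte and insert_byte are made-up
names, and byte 0 is taken to be the least significant byte.

```c
#include <stdint.h>

/* Hypothetical helper: extract byte number idx (0 = least significant)
 * from a 64-bit word using only a shift and a mask. */
static uint8_t extract_byte(uint64_t word, unsigned idx)
{
    return (uint8_t)((word >> (8 * (idx & 7))) & 0xFF);
}

/* Hypothetical helper: return word with byte number idx replaced by value. */
static uint64_t insert_byte(uint64_t word, unsigned idx, uint8_t value)
{
    unsigned shift = 8 * (idx & 7);
    word &= ~((uint64_t)0xFF << shift);        /* clear the old byte */
    return word | ((uint64_t)value << shift);  /* merge in the new one */
}
```

Storing a single byte this way means loading the whole word, merging the
byte, and storing the word back.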
The most popular chips today are the x86 series. The newer x86-64 chips
have 16 64-bit general-purpose registers, but 32-bit (legacy) code will
only use 8. In addition, there are 6 segment registers (3 for 64-bit code).
In any case, accessing properly aligned data is much faster than accessing
unaligned data.
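
To make "properly aligned" concrete, here is a small sketch in C. The names
is_aligned and load_u32 are made up for the example, and the alignment test
assumes that converting a pointer to uintptr_t yields its numeric address,
which holds on common platforms.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p lies on an align-byte boundary.
 * align must be a power of two. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: read a 32-bit value from any address. memcpy lets
 * the compiler pick a safe instruction sequence even when p is not
 * naturally aligned. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

A copy routine can use a test like is_aligned to decide whether it may use
full-width loads and stores directly or must first copy a few single bytes
to reach a natural boundary.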
--
Jerry Feldman