On Monday 18 April 2005 00:09, Jerry Feldman wrote:
On Sat, 16 Apr 2005 13:17:22 +1000 Colin Carter wrote:

Yes. In particular FORTRAN had a nice standard of first BYTE holding size.
Now it "appears" that C++ does not need to have this number because the string ends with a NULL (or, to be strictly correct, a NUL. I hate Bill Gates and mates just changing definitions.)
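To make the NULL versus NUL question raised here concrete, the following is a minimal C sketch. The function names classify and scan_length are made up for illustration and are not part of any library: the first tells a null pointer apart from an empty string, and the second finds a string's length by scanning for the zero byte, which is roughly what strlen has to do because no length is stored.

```c
#include <stddef.h>   /* NULL, size_t */

/* Hypothetical helper: 0 = null pointer (no string at all),
   1 = empty string, 2 = non-empty string. */
int classify(const char *s)
{
    if (s == NULL)       /* NULL: a pointer value that refers to no object */
        return 0;
    if (s[0] == '\0')    /* NUL: the zero byte that terminates a C string */
        return 1;
    return 2;
}

/* Hypothetical helper: count characters up to, but not including, the NUL. */
size_t scan_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Note that the string literal "Hello" occupies six bytes, five characters plus the NUL, which is the same layout the FORTRAN char(0) trick produces by hand.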
NULL is a macro in C referring to the null pointer. The string termination is simply a null byte, not NULL.

A macro you say! This I didn't know. I think of a NUL as the ASCII character/byte that is a full house of zero bits. The same as used at the end of a C string. You know, like ACK, BEL, HT, ETX, et cetera. To make a FORTRAN character string acceptable to a C routine one might use:

    StringFred(1:6) = "Hello" // char(0)

where // is the FORTRAN symbol for concatenate. How does the C macro work?

Different languages have different standards. The length byte in a FORTRAN string is hidden from the programmer. Languages like BASIC have some very sophisticated string manipulation routines built-in.

Yes.

But, it is only hidden from the C++ programmer. Do C++ programmers think that the OS has no idea where the allocated string space ends? The OS is now burdened with keeping track of how much space it has allocated for the string; and it will probably have to abandon that piece of memory and allocate a new chunk of memory and shift the rubbish over when the (blind) C++ programmer inserts too many characters for the allocated space.

The OS has nothing to do with this. The C++ string is a class. It uses the C-style string as its basis, so if there is any changing of the size of a C++ string, no buffer overflow will occur if the implementation coded the underlying class correctly. In C, you can implement a similar thing:

    #include <stdlib.h>
    #include <string.h>

    typedef struct _String {
        size_t slen;  /* Length of string - eg. strlen */
        size_t alen;  /* Amount allocated for the string */
        char *s;      /* pointer to string */
    } STRING;

slen is a bit redundant but used for efficiency. alen is used because we might allocate more than slen. Here is a possible string copy function:

    int STRING_Copy(STRING *dstr, const STRING *sstr)
    {
        if (dstr->alen < (sstr->slen + 1)) {
            if (dstr->alen > 0)   /* if we have an allocated string, free it */
                free(dstr->s);
            dstr->s = malloc(sstr->alen);
            if (dstr->s == NULL) {
                dstr->alen = 0;
                return -1;        /* Return failure */
            }
            dstr->alen = sstr->alen;
        }
        strcpy(dstr->s, sstr->s);
        dstr->slen = sstr->slen;  /* keep the cached length in step */
        return 0;
    }

If the programmer always uses the supplied functions, then this method will work fine. Note that I did not code for cases where dstr or sstr are NULL since most implementations don't do this for strcpy.

This is interesting. But it appears to me that the cpu overhead must be reasonably high cf FORTRAN fixed length character strings (which the programmer must manage tightly).
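On the overhead question, the alen field in the STRING struct above is what can keep the cost of growing a string modest. The following is only a sketch under assumptions: STRING_Append is a hypothetical helper that does not appear in the original post, the starting size of 16 bytes and the doubling policy are arbitrary choices, and the struct is repeated (without the tag name) so the block compiles on its own. It expects a STRING initialised to {0, 0, NULL}.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t slen;   /* length of the string, as strlen would report */
    size_t alen;   /* bytes allocated for s, including room for the NUL */
    char *s;       /* pointer to the string data */
} STRING;

/* Hypothetical helper: append a C string, growing the buffer geometrically.
   Returns 0 on success, -1 if memory could not be allocated. */
int STRING_Append(STRING *dstr, const char *text)
{
    size_t add = strlen(text);
    size_t need = dstr->slen + add + 1;      /* +1 for the terminating NUL */

    if (need > dstr->alen) {
        size_t newlen = dstr->alen ? dstr->alen : 16;
        while (newlen < need)
            newlen *= 2;                     /* doubling keeps reallocations rare */
        char *p = realloc(dstr->s, newlen);  /* realloc(NULL, n) acts like malloc(n) */
        if (p == NULL)
            return -1;                       /* the old buffer is left untouched */
        dstr->s = p;
        dstr->alen = newlen;
    }
    memcpy(dstr->s + dstr->slen, text, add + 1);  /* copies the NUL as well */
    dstr->slen += add;
    return 0;
}
```

Because slen is cached, the append does not rescan the destination, and because the allocation doubles, most appends touch no allocator at all. A fixed-length FORTRAN string still avoids even that bookkeeping, at the price of the programmer choosing the size up front.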
All must agree that a bad programmer will over-run string space in any language!
Not entirely true. If the string structure is hidden from the programmer, then the implementation will protect the strings. Of course, a poorly written implementation could cause the same thing.

Agreed!

I recently discovered a serious error in a compiler at the highest optimization level, where a loop:

    for (i = x; i > min && i < max; i += step)

The compiler generated the following:

    for (i = x; i < max; i += step)

This worked fine when step was > 0, but failed when step was < 0. The bug has since been corrected.
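The effect of dropping the i > min test can be seen by counting iterations of both forms. This is an illustrative sketch only: the function names and the cap parameter are assumptions added here (the cap exists purely so that the faulty form terminates), and nothing below reproduces the actual compiler or its generated code.

```c
/* Iterations of the loop as written in the source. */
long count_as_written(long x, long min, long max, long step, long cap)
{
    long n = 0;
    for (long i = x; i > min && i < max && n < cap; i += step)
        n++;
    return n;
}

/* Iterations of the loop as the faulty compiler generated it:
   the i > min test has been dropped. */
long count_as_generated(long x, long min, long max, long step, long cap)
{
    long n = 0;
    for (long i = x; i < max && n < cap; i += step)
        n++;
    return n;
}
```

With x = 5, min = 0, max = 10 and step = 1, both forms stop when i reaches max. With step = -1 the written loop stops when i reaches min, while the generated one keeps going until the cap, since a decreasing i never fails i < max.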
-- Jerry Feldman
Boston Linux and Unix user group http://www.blu.org
PGP key id:C5061EA9
PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9

Interesting info, Thanks Jerry.

Colin