Hi, I don't see much activity on this mailing list, so perhaps I am talking to myself. Perhaps I can stimulate a debate:

Programming standards have deteriorated significantly.

My argument: Some years ago academics developed design systems (Jordan etc) to stop young programmers writing spaghetti code (with lots of 'goto' statements). Then Knuth developed PASCAL to force young programmers to write properly structured code (goto statement not included). Well, nobody could get a job if they hadn't done one of these 'design' courses and didn't know PASCAL.

Turns out that we old guys (especially FORTRAN scientific types) had been writing structured code for years, and we controlled our goto statements. And 'they' had to add a goto to PASCAL (we grinned) because without it the code can become very inefficient (especially with PUSH/POP overheads).

Now the latest is "Object Oriented" (shouldn't that actually be orientated?) code with lots of over-heads. I know: there's plenty of cpu power and plenty of memory now. So now we are forced to buy hundreds of MB of RAM because so many young programmers say "there's plenty of memory". Nothing runs in 4 MB any more - too much over-head. While scientists write A(1) = ... A(2) = ... A(3) = ... to cut out a nano-second or two, the new generation are writing functions (let alone for loops) to do the same thing.

And then there are the programs which ask dumb questions.... And the ones which send out demands for payments of $0.00. And the ones which start with annoying default behaviour and the damned switch is buried deep in the menus under font colour. Et cetera.

Now I ask you, why does a modern O.S. take so long to start up? My PC runs 64 bit at 3200 MHz and it takes longer to start than my car. I know: it is preparing the 42,000 fonts that I will never use! (Don't bother with "M$ takes longer" - everybody knows that.)

My latest bitch is Updates. I just wanted to read the new document on rpms. But of course it couldn't be written in ASCII: it had to be written in the latest, smartest version of Adobe (you know, the one with the proprietary lock). That means that we all have to upgrade Adobe Reader. My reader is only a couple of months old. That's easy, you young blokes say. But read the SLE mailing list and see what trouble it has caused me! It has cost me hours of messing around and left my system stuffed. Now I will never get to read the document (nor anything else).

Anyway, why use Adobe - it takes five times as long to open a document as anything else I've seen, and the reader has no control - half the internet sites don't even let you copy from it. And they keep changing it so that everybody MUST upgrade. And anyway, although it is massive, it is only a cheap version of TeX.

So you young guys, start writing faster/smarter code and forget the fancy, 'full of options', 42,000 font code.

Colin
Quoting Colin Carter
Hi, I don't see much activity on this mailing list, so perhaps I am talking to myself. Perhaps I can stimulate a debate:
Programming standards have deteriorated significantly.
My argument: Some years ago academics developed design systems (Jordan etc) to stop young programmers writing spaghetti code (with lots of 'goto' statements). Then Knuth developed PASCAL to force young programmers to write properly structured code (goto statement not included).
You are right, programming education has deteriorated. Niklaus Wirth developed Pascal. Donald Knuth wrote "The Art of Computer Programming". And no language design can force young or old programmers to write properly structured code. The best thing it can do is make it easy to write structured code. Good code is even tougher. Jeffrey
On Friday 15 April 2005 08:23, Colin Carter wrote:
Now I ask you, why does a modern O.S. take so long to start up?
All the services and features that need to be set up - all the stuff people expect from a modern OS. If you want something that boots fast, there is always old MS-DOS. You might miss a couple of features you have come to like, though. ;-)
I know: it is preparing the 42,000 fonts that I will never use!
Did you really check what consumes all the time at boot?
Dozens of services are initialized and started.
Dozens of different kinds of hardware are probed - which might or might not be
available, thus requiring some (frequently heuristic) hardware probing - and
waiting for answers for some time. Some hardware subsystems have looong
possible timeouts - SCSI (an early 80s era technology) is a notorious
example, USB (connectivity for all kinds of possibly very dumb and possibly
very slow devices) is another.
If you know for sure you don't need some of that stuff, you can easily remove
it from the boot process - simply use the YaST2 runlevel editor. But then
please don't complain that that cheapo printer you bought for $69.- or that
geeky USB stick doesn't work, or that you can't connect to a network printer.
All that stuff requires some initialization and/or some services to run, and
all that takes some time at boot time.
HTH
CU
--
Stefan Hundhammer
On Friday 15 April 2005 20:01, Stefan Hundhammer wrote:
On Friday 15 April 2005 08:23, Colin Carter wrote:
Now I ask you, why does a modern O.S. take so long to start up?
All the services and features that need to be set up - all the stuff people expect from a modern OS.
If you want something that boots fast, there is always old MS-DOS. You might miss a couple of features you have come to like, though. ;-)
Dozens of different kinds of hardware are probed - which might or might not be available snip -- Stefan Hundhammer
Penguin by conviction. YaST2 Development SUSE Linux Products GmbH Nuernberg, Germany
True Stefan. I don't mean to Hammer SuSE; not at all - I have the greatest respect for SuSE. :-) I mean Adobe/Word and the like: code that takes hundreds of MB on the disk. In general the Linux software is smaller than the M$ software. I mean things like the flexibility of Word, which can 'undo' so much work. Very useful - except that the stuff you want to undo is generally buried deep under stuff which you don't want to undo.

I was sent a Word document which was a note of about ten lines; but the file was about half a MB. So, out of curiosity I opened it with a binary editor and discovered thousands of zeros, and all the details of a confidential contract which Word had obviously 'remembered' from a previous document!

Exercise: in Word, type "This is a sentence." Check the size of the saved file. Re-open the file and delete a word of the sentence, then undo, redo, undo, about ten times. Check the size of the file again. How can a programmer program so badly?

And what do you think about my comments on 'upgrade'? So time consuming... I wasted today, and only went backwards. Now I will have to start my old XP laptop to read the rpm documentation.

Cheers, Colin
On Friday 15 April 2005 02:23, Colin Carter wrote:
Hi, I don't see much activity on this mailing list, so perhaps I am talking to myself. Perhaps I can stimulate a debate:
Programming standards have deteriorated significantly.
My argument: Some years ago academics developed design systems (Jordan etc) to stop young programmers writing spaghetti code (with lots of 'goto' statements). Then Knuth developed PASCAL to force young programmers to write properly structured code (goto statement not included).
Well, nobody could get a job if they hadn't done one of these 'design' courses and didn't know PASCAL.
Turns out that we old guys (especially FORTRAN scientific types) had been writing structured code for years, and we controlled our goto statements.
And 'they' had to add a goto to PASCAL (we grinned) because without it the code can become very inefficient (especially with PUSH/POP overheads).
Now the latest is "Object Oriented" (shouldn't that actually be orientated?) code with lots of over-heads. I know: there's plenty of cpu power and plenty of memory now. So now we are forced to buy hundreds of MB of RAM because so many young programmers say "there's plenty of memory". Nothing runs in 4 MB any more - too much over-head.
How very odd. I just ranted on this general subject on another list. The problem with the software development world is that software developers assume their program is the beneficiary of all system resources. No matter what the technological advance, software advances as fast, or even faster, to consume and overtake it. Today, developers code for their own convenience without regard or respect for system resources. While hard drives, physical RAM and CPU caches have become faster and larger, code has bloated even faster, negating the performance increase. From what I see of the entry-level people we hire and fire at work, schools aren't teaching programmers how computers work at even a fundamental level, so they have no real idea what "efficient" means. It is quite common to find some Object Oriented "expert" (that's someone with a degree and one job on their resume) who can't build a simple C char array. I've seen quite a few who have no clue at all that a C pointer corresponds to an instrumental, functional part of the CPU and is not merely a language syntax construct.
While scientists write A(1) = ... A(2) = ... A(3) = ... to cut out a nano-second or two, the new generation are writing functions (let alone for loops) to do the same thing.
Actually, I was reading recently in the Art of Unix Programming that unrolling loops used to be a good optimization, but with CPU caches now it's better to keep the loop, since there is a better chance that the entire code in the loop fits in the cache.
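For anyone who hasn't seen the technique, a minimal sketch of both forms (my own illustration, not from the book):

long sum_loop(const int *a, int n)          /* plain loop - tiny body,
                                               easily fits in the cache */
{
    long sum = 0;
    int i;

    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

long sum_unrolled(const int *a, int n)      /* unrolled by four - fewer
                                               loop tests, more code bytes */
{
    long sum = 0;
    int i;

    for (i = 0; i + 4 <= n; i += 4)
        sum += a[i] + a[i+1] + a[i+2] + a[i+3];
    for (; i < n; i++)                      /* leftover elements */
        sum += a[i];
    return sum;
}

The unrolled body saves three compare-and-branch pairs per four elements, but if the bigger body pushes the loop out of the instruction cache the saving is gone.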
Anyway, why use Adobe - it takes five times as long to open a document as anything else I've seen, and the reader has no control - half the internet sites don't even let you copy from it. And they keep changing it so that everybody MUST upgrade. And anyway, although it is massive, it is only a cheap version of TeX.
So you young guys, start writing faster/smarter code and forget the fancy, 'full of options', 42000 font code.
I still have an Amiga 3000 kicking around -- 25MHz 68030 and 16M RAM. (For that system 16M is HUGE -- especially back in 1990.) I'm amazed that in 90% of practical situations there is no appreciable difference between using the Amiga and using my 1.8GHz, 1G RAM Linux system (or even "faster" Windows systems at work). The GUI is light and fast -- it even makes icewm feel bloated. The apps are a tiny fraction of the size of similar programs on the contemporary platforms and seem to load instantly. It still works as well as it does because it was designed by a bunch of uber-geeks who had to care that the 68000 address range was only 16M.
On Friday 15 April 2005 02:23, Colin Carter wrote: snip
Programming standards have deteriorated significantly.
My argument:
How very odd. I just ranted on this general subject on another list.
On Friday 15 April 2005 21:31, Synthetic Cartoonz wrote:
The problem with the software development world is that software developers assume their program is the beneficiary of all system resources. No matter what the technological advance, software advances as fast, or even faster, to consume and overtake it.
Yes, I have noticed how most modern programs waste time polling the myriad of open windows ....
Today, developers code for their own convenience without regard or respect for system resources. While hard drives, physical RAM and CPU caches have become faster and larger, code has bloated even faster, negating the performance increase.
Yes, this negates ANY argument that young programmers can write efficient code. These days modern PCs are 1000 times faster, and have 1000 times as much memory, so, given that modern programmers have learnt something, their code ought to be at the very least 1000 times faster than ours, but it is not!
From what I see of the entry-level people we hire and fire at work, schools aren't teaching programmers how computers work at even a fundamental level,
Yes. And they are not interested in learning how memory is accessed et cetera. I asked if anybody knew how memory was loaded in an AMD64 and was 'told' "Don't worry, it is cached anyway." Well, I do worry, because I want to know the trade-off between using 32 bit and 64 bit integer arrays. For example, is a 32 bit value loaded via a 64 bit bus and then masked/shifted? If so, then 64 bit would be faster, otherwise 32 bit would be faster. But the answer, even from AMD, is "Don't worry, it is cached." But a register is not cached; the value must still be loaded from the cache. Of course this doesn't matter with pathetic little programs which interface with the user, but it does matter when the code takes up to 32 hours (yes, hours) to execute.
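Rather than trust anyone's word on it, one can measure. A rough sketch (results will depend heavily on compiler flags and on memory bandwidth - the 64-bit array is twice the bytes - so treat the numbers as indicative only):

#include <stdio.h>
#include <time.h>

#define N (1 << 22)                 /* 4M elements */

static int       a32[N];            /* 32-bit integers: 16 MB */
static long long a64[N];            /* 64-bit integers: 32 MB */

int main(void)
{
    long long s = 0;
    clock_t t0, t1, t2;
    int i;

    t0 = clock();
    for (i = 0; i < N; i++) s += a32[i];
    t1 = clock();
    for (i = 0; i < N; i++) s += a64[i];
    t2 = clock();

    printf("32-bit pass: %ld ticks\n", (long)(t1 - t0));
    printf("64-bit pass: %ld ticks\n", (long)(t2 - t1));
    return (int)(s & 1);            /* keep the loops from being optimized away */
}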
so they have no real idea what "efficient" means. It is quite common to find some Object Oriented "expert" (that's someone with a degree and one job on their resume) who can't build a simple C char array. I've seen quite a few who have no clue at all that a C pointer corresponds to an instrumental, functional part of the CPU and is not merely a language syntax construct.
True. It is interesting that C++ programmers think that 'pointers' are wonderful - considering that addresses are the core of FORTRAN code! snip
Actually, I was reading recently in the Art of Unix Programming that unrolling loops used to be a good optimization, but with CPU caches now it's better to keep the loop, since there is a better chance that the entire code in the loop fits in the cache. Yes, I know. In fact FORTRAN compilers in the late sixties/early seventies were already unrolling loops. FORTRAN compilers have always put loop counters into a register and counted backwards.
Anyway, why use Adobe - it takes five times as long to open a document as anything else I've seen, and the reader has no control - half the internet sites don't even let you copy from it. And they keep changing it so that everybody MUST upgrade. And anyway, although it is massive, it is only a cheap version of TeX.
So you young guys, start writing faster/smarter code and forget the fancy, 'full of options', 42000 font code.
I still have an Amiga 3000 kicking around -- 25MHz 68030 and 16M RAM. (For that system 16M is HUGE -- especially back in 1990.) I'm amazed that in 90% of practical situations there is no appreciable difference between using the Amiga and using my 1.8GHz, 1G RAM Linux system (or even "faster" Windows systems at work). The GUI is light and fast -- it even makes icewm feel bloated. The apps are a tiny fraction of the size of similar programs on the contemporary platforms and seem to load instantly. It still works as well as it does because it was designed by a bunch of uber-geeks who had to care that the 68000 address range was only 16M. This sounds familiar.
I think it is a bit of the old K.I.S.S. Regards, Colin
On Friday 15 April 2005 14:43, Colin Carter wrote:
Yes, I have noticed how most modern programs waste time polling the myriad of open windows ....
Urgh - folks, can we stop the urban legends at some point, please?
Everybody who has a minimum clue of how any kind of GUI programming works
should know that those programs spend most of their time waiting on a socket,
waiting for user input - on X11 (no matter what toolkit is being used - KDE,
Gtk, OSF/Motif, Xt, ...) and on Win32. There is no "busy wait" in any such
program I know.
The other myths about object orientation etc. are similar: If you do
non-trivial software, you simply NEED that kind of thing so you can have the
kind of abstraction level you will want to have so the software can be
maintained at all, let alone for extended periods of time.
The times of glorified "hello, world" programs are over.
Most of the finite elements programs people were so fond of in the FORTRAN
times are written by now - there are probably generic programs for that kind
of thing.
Today's software is supposed to do everything, including making coffee,
cleaning your shoes and taking the dog out. You do not want to write that
kind of software with a 70s era approach - monolithic blocks of code like
people used to write with old Pascal, C, or (heaven forbid) FORTRAN.
Sure, you CAN do that. But it hurts - big time.
C's string handling for instance sucks - it is the source of most security
holes that need to be fixed. Buffer overflows happen because C does not have
a concept for variable length strings - it only has character pointers. What
a nightmare.
Been there, done that, hated it from the bottom of my guts.
I have been programming since the mid-80s, and even though I also tend to
bitch about many things, things have improved a lot since then. No more
rebooting because a null pointer in C overwrote your PC's interrupt table at
0000:0000 on MS-DOS. Anybody remember what a PITA that was?
No more dumbass 64k limits because certain ingenious inventors had considered
that much memory "enough for everybody".
No more nightmarish Turbo Pascal {$I myinclude.pas} includes to overcome the lack of a
module or linker concept.
And today, it's no more segfaults because the C string handling is so dumb.
Use modern tools. Use tools like C++ - or, for that matter, C#, or even Java.
Use predefined (meaning: well-tested) classes for common purposes. Do not
repeat everybody's (and their mothers') mistakes by writing your own because
you think you can do a better job at that.
You argue this comes at a price - and the price is performance and system resources. That may be right, but I'd rather sacrifice some MB of RAM than experience random crashes because nobody can debug software of that complexity written with outdated tools any more. My time (and, for that matter, my nerves) are way more precious to me than some MB of RAM saved.
I could keep ranting a lot more like that, but other duties are calling right
now. ;-)
Just my 2 Cents
(well, make that 4 - or 6) ;-)
--
Stefan Hundhammer
On Friday 15 April 2005 10:44 am, Stefan Hundhammer wrote:
C's string handling for instance sucks - it is the source of most security holes that need to be fixed. Buffer overflows happen because C does not have a concept for variable length strings - it only has character pointers. What a nightmare. I disagree with your opinion. A C string is an array of characters. One must remember that C is NOT a high level language like FORTRAN or COBOL, it was designed as an implementation language. C++ implements a variable length string based on the C strings.
Some other languages use a length-prefixed type of string where the first byte of the string contains the length. Languages like BASIC are more flexible because their string handling is performed under the covers.
--
Jerry Feldman
On Saturday 16 April 2005 00:59, Jerry Feldman wrote:
On Friday 15 April 2005 10:44 am, Stefan Hundhammer wrote:
C's string handling for instance sucks - it is the source of most security holes that need to be fixed. Buffer overflows happen because C does not have a concept for variable length strings - it only has character pointers. What a nightmare.
I disagree with your opinion. A C string is an array of characters. One must remember that C is NOT a high level language like FORTRAN or COBOL, it was designed as an implementation language. C++ implements a variable length string based on the C strings.
Some other languages use a length-prefixed type of string where the first byte of the string contains the length. Languages like BASIC are more flexible because their string handling is performed under the covers.
Yes. In particular FORTRAN had a nice standard of first BYTE holding size. Now it "appears" that C++ does not need to have this number because the string ends with a NULL (or strictly correct being a NUL. I hate Bill Gates and mates just changing definitions.) But, it is only hidden from the C++ programmer. Do C++ programmers think that the OS has no idea where the allocated string space ends? The OS is now burdened with keeping track of how much space it has allocated for the string; and it will probably have to abandon that piece of memory and allocate a new chunk of memory and shift the rubbish over when the (blind) C++ programmer inserts too many characters for the allocated space. All must agree that a bad programmer will over-run string space in any language!
-- Jerry Feldman
Boston Linux and Unix user group http://www.blu.org PGP key id:C5061EA9 PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9
Regards, Colin
On Sat, 16 Apr 2005 13:17:22 +1000
Colin Carter
Yes. In particular FORTRAN had a nice standard of first BYTE holding size. Now it "appears" that C++ does not need to have this number because the string ends with a NULL (or strictly correct being a NUL. I hate Bill Gates and mates just changing definitions.) NULL is a macro in C referring to the null pointer. The string termination is simply a null byte, not NULL. Different languages have different standards. The length byte in a FORTRAN string is hidden from the programmer. Languages like BASIC have some very sophisticated string manipulation routines built-in.
But, it is only hidden from the C++ programmer. Do C++ programmers think that the OS has no idea where the allocated string space ends? The OS is now burdened with keeping track of how much space it has allocated for the string; and it will probably have to abandon that piece of memory and allocate a new chunk of memory and shift the rubbish over when the (blind) C++ programmer inserts too many characters for the allocated space.

The OS has nothing to do with this. The C++ string is a class. It uses the C-style string as its basis, so if there is any changing of the size of a C++ string, no buffer overflow will occur if the implementation coded the underlying class correctly. In C, you can implement a similar thing:

typedef struct _String {
    size_t slen;  /* Length of string - eg. strlen */
    size_t alen;  /* Amount allocated for the string */
    char *s;      /* pointer to string */
} STRING;

slen is a bit redundant but used for efficiency. alen is used because we might allocate more than slen. Here is a possible string copy function:

int STRING_Copy(STRING *dstr, const STRING *sstr)
{
    if (dstr->alen < (sstr->slen + 1)) {
        if (dstr->alen > 0)      /* if we have an allocated string, free it */
            free(dstr->s);
        dstr->s = malloc(sstr->alen);
        if (dstr->s == NULL) {   /* allocation failed */
            dstr->alen = 0;
            return -1;           /* Return failure */
        }
        dstr->alen = sstr->alen;
    }
    strcpy(dstr->s, sstr->s);
    return 0;
}

If the programmer always uses the supplied functions, then this method will work fine. Note that I did not code for cases where dstr or sstr are NULL since most implementations don't do this for strcpy.
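A quick usage sketch for the STRING type above (strdup is used only for brevity; it assumes the typedef and STRING_Copy exactly as given):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    STRING dst = { 0, 0, NULL };    /* empty: alen == 0, so nothing is freed */
    STRING src;

    src.s    = strdup("Hello");     /* heap copy, so free() stays legal */
    src.slen = strlen(src.s);       /* 5 */
    src.alen = src.slen + 1;        /* 6, including the terminating NUL */

    if (STRING_Copy(&dst, &src) == 0)
        printf("%s\n", dst.s);      /* prints "Hello" */
    return 0;
}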
All must agree that a bad programmer will over-run string space in any
language! Not entirely true. If the string structure is hidden from the programmer, then the implementation will protect the strings. Of course, a poorly written implementation could cause the same thing.
I recently discovered a serious error in a compiler at the highest
optimization level, where a loop:
for (i = x; i > min && i < max; i += step)
The compiler generated the following:
for (i = x; i < max; i += step)
This worked fine when step was > 0, but failed when step was < 0.
The bug has since been corrected.
--
Jerry Feldman
On Monday 18 April 2005 00:09, Jerry Feldman wrote:
On Sat, 16 Apr 2005 13:17:22 +1000
Colin Carter
wrote: Yes. In particular FORTRAN had a nice standard of first BYTE holding size.
Now it "appears" that C++ does not need to have this number because the string ends with a NULL (or strictly correct being a NUL. I hate Bill Gates and mates just changing definitions.)
NULL is a macro in C referring to the null pointer. The string termination is simply a null byte, not NULL.

A macro you say! This I didn't know. I think of a NUL as being the ASCII character/byte being a full house of zero bits. The same as used at the end of a C string. You know, like ACK, BEL, HT, ETX, et cetera. To make a FORTRAN character string acceptable to a C routine one might use:

StringFred(1:6) = "Hello" // char(0)

where // is the FORTRAN symbol for concatenate. How does the C macro work?

But, it is only hidden from the C++ programmer. Do C++ programmers think that the OS has no idea where the allocated string space ends? The OS is now burdened with keeping track of how much space it has allocated for the string; and it will probably have to abandon that piece of memory and allocate a new chunk of memory and shift the rubbish over when the (blind) C++ programmer inserts too many characters for the allocated space.

Different languages have different standards. The length byte in a FORTRAN string is hidden from the programmer. Languages like BASIC have some very sophisticated string manipulation routines built-in.

Yes.

The OS has nothing to do with this. The C++ string is a class. It uses the C-style string as its basis, so if there is any changing of the size of a C++ string, no buffer overflow will occur if the implementation coded the underlying class correctly. In C, you can implement a similar thing:

typedef struct _String {
    size_t slen;  /* Length of string - eg. strlen */
    size_t alen;  /* Amount allocated for the string */
    char *s;      /* pointer to string */
} STRING;

slen is a bit redundant but used for efficiency. alen is used because we might allocate more than slen. Here is a possible string copy function:

int STRING_Copy(STRING *dstr, const STRING *sstr)
{
    if (dstr->alen < (sstr->slen + 1)) {
        if (dstr->alen > 0)      /* if we have an allocated string, free it */
            free(dstr->s);
        dstr->s = malloc(sstr->alen);
        if (dstr->s == NULL) {   /* allocation failed */
            dstr->alen = 0;
            return -1;           /* Return failure */
        }
        dstr->alen = sstr->alen;
    }
    strcpy(dstr->s, sstr->s);
    return 0;
}

If the programmer always uses the supplied functions, then this method will work fine. Note that I did not code for cases where dstr or sstr are NULL since most implementations don't do this for strcpy.

This is interesting. But it appears to me that the cpu overhead must be reasonably high cf FORTRAN fixed length character strings (which the programmer must manage tightly).
All must agree that a bad programmer will over-run string space in any language!
Not entirely true. If the string structure is hidden from the programmer, then the implementation will protect the strings. Of course, a poorly written implementation could cause the same thing. Agreed! I recently discovered a serious error in a compiler at the highest optimization level, where a loop: for (i = x; i > min && i < max; i += step)
The compiler generated the following: for (i = x; i < max; i += step)
This worked fine when step was > 0, but failed when step was < 0. The bug has since been corrected.
-- Jerry Feldman
Boston Linux and Unix user group http://www.blu.org PGP key id:C5061EA9 PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9 Interesting info, Thanks Jerry. Colin
On Mon, 18 Apr 2005 02:22:26 +1000
Colin Carter
A macro you say! This I didn't know. I think of a NUL as being the ASCII character/byte being a full house of zero bits. The same as used at the end of a C string. You know, like ACK, BEL, HT, ETX, et cetera To make a FORTRAN character string acceptable to a C routine one might use:
StringFred(1:6) = "Hello" // char(0) where // is the FORTRAN symbol for concatenate.
How does the C macro work?

More specifically, a C preprocessor macro. These are defined using the #define command:

#define NULL 0
or
#define NULL ((void *)0)
The C or C++ source is first preprocessed through the C preprocessor before the compiler's lex step. Most modern compilers combine the preprocessor with lex, but it is still a separate command on many Linux and Unix systems. Other languages, like FORTRAN on Linux and Unix, also use the preprocessor.

In C, if you want to define a constant, you generally use a macro. In C++, the const keyword lets you define a true constant.
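For instance (a small sketch, assuming a C89 compiler):

#define BUFSIZE 512            /* textual substitution by the preprocessor;
                                  the compiler never sees the name */
const int BufSize = 512;       /* a typed object the compiler checks */

char buf1[BUFSIZE];            /* fine in C and C++ */
/* char buf2[BufSize];           legal in C++, where a const int is a true
                                 constant; in C89 it is not a constant
                                 expression, so this would not compile */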
--
Jerry Feldman
On Monday 18 April 2005 07:54, Jerry Feldman wrote:
On Mon, 18 Apr 2005 02:22:26 +1000
Colin Carter
wrote: A macro you say! This I didn't know. I think of a NUL as being the ASCII character/byte being a full house of zero bits. The same as used at the end of a C string. You know, like ACK, BEL, HT, ETX, et cetera To make a FORTRAN character string acceptable to a C routine one might use:
StringFred(1:6) = "Hello" // char(0) where // is the FORTRAN symbol for concatenate.
How does the C macro work?
More specifically, a C Preprocessor macro. These are defined using the #define command. #define NULL 0 or #define NULL ((void *)0)
Jerry, this is where I normally get into trouble. Some C functions require an integer and some a pointer, and when examples just supply a NULL I inevitably get an incorrect-type warning. I understand what you said (above), and I know that a full house of zeros can be an integer or a NULL pointer, but how can the C include files define it BOTH ways above, or does the preprocessor just insert a zero? Colin
The C or C++ source is first preprocessed through the C preprocessor before the compiler's lex step. Most modern compilers combine the preprocessor with lex, but it is still a separate command on many Linux and Unix systems. Other languages, like FORTRAN on Linux and Unix, also use the preprocessor. In C, if you want to define a constant, you generally use a macro. In C++, the const keyword lets you define a true constant. -- Jerry Feldman
Boston Linux and Unix user group http://www.blu.org PGP key id:C5061EA9 PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9
On Sunday 17 April 2005 12:22, Colin Carter wrote:
On Monday 18 April 2005 00:09, Jerry Feldman wrote:
On Sat, 16 Apr 2005 13:17:22 +1000
Colin Carter
wrote: Yes. In particular FORTRAN had a nice standard of first BYTE holding size.
Now it "appears" that C++ does not need to have this number because the string ends with a NULL (or strictly correct being a NUL. I hate
Bill Gates and mates just changing definitions.)
NULL is a macro in C referring to the null pointer. The string termination is simply a null byte, not NULL.
A macro you say! This I didn't know. I think of a NUL as being the ASCII character/byte being a full house of zero bits. The same as used at the end of a C string. You know, like ACK, BEL, HT, ETX, et cetera To make a FORTRAN character string acceptable to a C routine one might use: StringFred(1:6) = "Hello" // char(0) where // is the FORTRAN symbol for concatenate.
How does the C macro work?
Your perspective of the C string is by and large correct. In C, 'macro' and 'pre-processor defined value' are interchangeable terms. Whatever is defined as NULL for your compiler ultimately translates to the value zero. NULL is most correctly used to describe a pointer value, but depending on its context in the code, NULL could be used as a 0-value char, long, pointer, etc., though some compilers might cough up warnings about type conversions. Though there is really nothing wrong with it, most C code wouldn't terminate a string by stuffing NULL into a character position. (assuming char a[2] and char *b pointing to something usable...)

a[0] = NULL; or *b = NULL;

Instead you'll usually see:

a[0] = '\0'; or *b = '\0';

[snip]
On Sunday 17 April 2005 12:22, Colin Carter wrote:
On Monday 18 April 2005 00:09, Jerry Feldman wrote:
On Sat, 16 Apr 2005 13:17:22 +1000
Colin Carter
wrote: Yes. In particular FORTRAN had a nice standard of first BYTE holding size.
Now it "appears" that C++ does not need to have this number because the string ends with a NULL (or strictly correct being a NUL. I hate
Bill Gates and mates just changing definitions.)
NULL is a macro in C referring to the null pointer. The string termination is simply a null byte, not NULL.
A macro you say! This I didn't know. I think of a NUL as being the ASCII character/byte being a full house of zero bits. The same as used at the end of a C string. You know, like ACK, BEL, HT, ETX, et cetera To make a FORTRAN character string acceptable to a C routine one might use: StringFred(1:6) = "Hello" // char(0) where // is the FORTRAN symbol for concatenate.
How does the C macro work?
Your perspective of the C string is by and large correct.
In C, 'macro' and 'pre-processor defined value' are interchangeable terms. Whatever is defined as NULL for your compiler ultimately translates to the value zero.
NULL is most correctly used to describe a pointer value, but depending on its context in the code, NULL could be used as a 0-value char, long, pointer, etc., though some compilers might cough up warnings about type conversions.
Hi Synthetic, and thanks. This is what I thought, but because, as you said, some programmers use the symbol NULL as a substitute for an integer zero, I became confused.
On Monday 18 April 2005 12:00, Synthetic Cartoonz wrote:
Though there is really nothing wrong with it, most C code wouldn't terminate a string by stuffing NULL into a character position. (assuming char a[2] and char * b pointing to something usable...)
a[0] = NULL; or *b = NULL;
Instead you'll usually see:
a[0] = '\0'; or *b = '\0';
How about:

char Buffer[] = "Hello Charlie";
Buffer[5] = '\0';

Is this 'stuffing a NUL'? Or would you do a strncpy()?

In FORTRAN we have a nice (called obsolete) facility called EQUIVALENCE used like:

integer (kind=4):: iArray(64)
character(kind=1,len=256):: cArray
equivalence (iArray, cArray)

This same memory can be referenced as a character string or an array of integers.

Maybe someone can answer this: In the old cpu's the copy of an integer array required, in Assembler, the loading of each element into a register followed by the storage of the element into the new array. And the associated maintenance of a counter in another register. Whereas copy of a string involved placing the address of each string into a register, and the number of characters to move, followed by one assembler instruction and the hardware did the rest at the speed of light. Thus, copying the string equivalent was much faster than copying an array of integers. I have done no Assembler since M$ Windows 95. Does anybody know enough about modern cpu's to answer my question?

Regards, Colin
On Monday 18 April 2005 05:48, Colin Carter wrote:
On Monday 18 April 2005 12:00, Synthetic Cartoonz wrote:
To make a FORTRAN character string acceptable to a C routine one might use: StringFred(1:6) = "Hello" // char(0) where // is the FORTRAN symbol for concatenate.
How does the C macro work? [snip] NULL is most correctly used to describe a pointer value, but depending on it's context in the code, NULL could be used as a 0 value char, long,
On Sunday 17 April 2005 12:22, Colin Carter wrote: [snip] pointer, etc, though some compilers might cough up warnings about type conversions.
Hi Synthetic, and thanks, This is what I thought, but because, as you said, some programmers use the symbol NULL as a substitute for an integer zero, I became confused.
Though there is really nothing wrong with it, most C code wouldn't terminate a string by stuffing NULL into a character position. (assuming char a[2] and char * b pointing to something usable...)
a[0] = NULL; or *b = NULL;
Instead you'll usually see:
a[0] = '\0'; or *b = '\0';
How about: char Buffer[] = "Hello Charlie"; Buffer[5] = '\0'; Is this 'stuffing a NUL' ?
Yes. You turned Buffer into the string "Hello".
Or would you do a strncpy() ?
In FORTRAN we have a nice (called obsolete) facility called EQUIVALENCE used like:
integer (kind=4):: iArray(64) character(kind=1,len=256):: cArray equivalence (iArray, cArray)
This same memory can be referenced as a character string or an array of integers.
char Buffer[] = "Hello Charlie"; int * danger = (int *) Buffer; There ya go. Same thing in C. "danger" has to be managed very carefully, or you'll end up dereferencing danger outside of the space of Buffer. IF this must be done it would be safer to define Buffer to a multiple of the sizeof the largest type that you were doing "equivalence". (Which is basically what you do in your FORTRAN equivalance example: integer 4 * 64 == 256 . character 1 * 256 == 256 .) Maybe like.... char Buffer[ sizeof(long) * 5]; long * danger = (long *) Buffer; strcpy(Buffer, "Hey There Charlie"); assumming 4 byte longs then *danger would point to the value incorporating "Hey ". Depending on the cpu word/byte endian organization that value as a long might be 0x48657920 or some other juggling of those four hex pairs. Still, in C or any other language permitting real address pointers this is flirting with doom if you are not careful. However, a long time ago I recall seeing something similar. A C compiler implemented long words as the smallest allocation unit, so the assembly implementation of the standard C libraries could copy, compare, and count multiple bytes at a time resulting in very fast string/memory operations. The string handling code relied on the fact that a flag was set when any byte within the long register was zero, so the code knew when it reached the actual end of the strings. This, of course, is entirely dependent on CPU architecture.
Maybe someone can answer this: In the old cpu's the copy of an integer array required, in Assembler, the loading of each element into a register followed by the storage of the element into the new array. And the associated maintenance of a counter in another register.
Whereas copy of a string involved placing the address of each string into a register, and the number of characters to move, followed by one assembler instruction and the hardware did the rest at the speed of light.
Thus, copying the string equivalent was much faster than copying an array of integers.
This would be CPU architecture dependent. If a CPU does not support this, then the implementation of strcpy and similar functions in the libraries would look just like the long, slow(er) integer copy operation.
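For comparison, the portable fallback is the classic byte-at-a-time loop - a sketch of what a generic implementation might look like, not any particular libc:

char *my_strcpy(char *dst, const char *src)
{
    char *d = dst;

    while ((*d++ = *src++) != '\0')    /* load, test, store - one byte per pass */
        ;
    return dst;
}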
I have done no Assembler since M$ Windows 95. Does anybody know enough about modern cpu's to answer my question?
Thanks Jerry and "Synthetic" for your responses. On Monday 18 April 2005 21:36, Synthetic Cartoonz wrote:
On Monday 18 April 2005 05:48, Colin Carter wrote:
On Monday 18 April 2005 12:00, Synthetic Cartoonz wrote:
On Sunday 17 April 2005 12:22, Colin Carter wrote:
[snip]
NULL is most correctly used to describe a pointer value, but depending on it's context in the code, NULL could be used as a 0 value char, long, pointer, etc, though some compilers might cough up warnings about type conversions.
snip >
How about: char Buffer[] = "Hello Charlie"; Buffer[5] = '\0'; Is this 'stuffing a NUL' ?
Yes. You turned Buffer into the string "Hello".
Or would you do a strncpy() ?
In FORTRAN we have a nice (called obsolete) facility called EQUIVALENCE used like:
snip >
This same memory can be referenced as a character string or an array of integers.
char Buffer[] = "Hello Charlie"; int * danger = (int *) Buffer;
There ya go. Same thing in C. "danger" has to be managed very carefully, or you'll end up dereferencing danger outside of the space of Buffer.
Bingo! Thanks for this - you don't mind if I use it... ;-)
IF this must be done it would be safer to define Buffer to a multiple of the sizeof the largest type that you were doing "equivalence". (Which is basically what you do in your FORTRAN equivalence example: integer 4 * 64 == 256, character 1 * 256 == 256.)
Yes, I definitely do this. Jerry also mentioned the correct use of multiples of "natural" units. I think I will use 64 bit integers (on my AMD64) because I think it ought to be faster than 32 bit, and memory is 'cheaper' than cpu speed, and one pays for memory once, but for cpu time every time you run the code.
char Buffer[ sizeof(long) * 5]; long * danger = (long *) Buffer;
strcpy(Buffer, "Hey There Charlie");
Assuming 4-byte longs, then *danger would hold the value incorporating "Hey ". Depending on the cpu word/byte endian organization that value as a long might be 0x48657920 or some other juggling of those four hex pairs.
And I'm gonna use this info too. I can code blind in FORTRAN, but always a bit nervous in C.
Still, in C or any other language permitting real address pointers this is flirting with doom if you are not careful. But the rewards... However, a long time ago I recall seeing something similar. A C compiler implemented long words as the smallest allocation unit, so the assembly implementation of the standard C libraries could copy, compare, and count multiple bytes at a time resulting in very fast string/memory operations. The string handling code relied on the fact that a flag was set when any byte within the long register was zero, so the code knew when it reached the actual end of the strings. This, of course, is entirely dependent on CPU architecture. I am glad that someone else knows of this. snip
Regards, Colin
On Tuesday 19 April 2005 10:09 am, Colin Carter wrote:
Jerry also mentioned the correct use of multiples of "natural" units. I think I will use 64 bit integers (on my AMD64) because I think it ought to be faster than 32 bit, and memory is 'cheaper' than cpu speed, and one pays for memory once, but for cpu time every time you run the code.

64-bit systems generally use the LP64 model, which means that both pointers and long integers are 64 bits while int remains at 32 bits. Most Unix and Linux systems use this. You will not pay a penalty for using a 32-bit int. C and C++ will automatically align your data for you:

struct {
    char a;   /* the struct itself is aligned on at least a 64-bit boundary */
    long b;   /* aligned on a 64-bit boundary */
};

The above structure is 16 bytes long, with 7 bytes inserted as filler between a and b. (On a 32-bit system, the filler would be 3 bytes.)
-- Jerry Feldman
Boston Linux and Unix user group http://www.blu.org PGP key id:C5061EA9 PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9
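The padding Jerry describes is easy to verify (a quick check; the expected numbers assume LP64):

#include <stdio.h>
#include <stddef.h>

struct s {
    char a;
    long b;
};

int main(void)
{
    /* Expect offsetof == 8 (7 filler bytes after a) and sizeof == 16
       on LP64; 4 and 8 respectively on a typical 32-bit system. */
    printf("offsetof(b) = %lu\n", (unsigned long) offsetof(struct s, b));
    printf("sizeof      = %lu\n", (unsigned long) sizeof(struct s));
    return 0;
}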
On Monday 18 April 2005 5:48 am, Colin Carter wrote:
Thus, copying the string equivalent was much faster than copying an array of integers.
I have done no Assembler since M$ Windows 95. Does anybody know enough about modern cpu's to answer my question?

There are many different CPUs today, but let's look at some of the RISC and 64-bit CPUs. Most of these CPUs are much more efficient when copying data aligned on a natural boundary: for instance, a 32-bit quantity aligns on a 32-bit boundary, a 64-bit quantity on a 64-bit boundary. Additionally, all CPUs use techniques such as pipelining and caching. Nearly all CPUs today do have byte and word (16-bit) instructions, but copying byte by byte is slow:

1. load byte into register
2. test value, branch on 0
3. store byte
4. goto 1
Another technique is speculation. Some operations will be performed even
after the branch is taken, and invalidated later.
One of the ways a string copy is made more efficient is to load a register
full of bytes (32-bit or 64-bit). Well written, highly optimized libraries
will take advantage of the CPU.
I happen to be most familiar with Digital's (now HP's) Alpha chip. The early
Alphas did not have byte and word instructions.
The current most popular chips today are the x86 series. The newer x86-64
chips have 16 64-bit registers, but 32-bit (legacy) code will only use 8.
In addition, there are 6 segment registers (3 for 64-bit code).
In any case, accessing properly aligned data is much faster than accessing
unaligned data.
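The register-full-of-bytes trick mentioned above looks roughly like this for 32-bit words (a sketch only; real library code also handles the unaligned head and tail of the string):

#include <stdint.h>
#include <stddef.h>

/* Nonzero if any byte within the word w is zero - the classic
   subtract-and-mask test. */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* strlen scanning a word at a time; assumes s is 4-byte aligned, so a
   whole-word read never crosses into an unmapped page past the NUL. */
size_t word_strlen(const char *s)
{
    const uint32_t *w = (const uint32_t *) s;
    const char *p;

    while (!has_zero_byte(*w))
        w++;
    p = (const char *) w;              /* locate the exact byte in the word */
    while (*p)
        p++;
    return (size_t) (p - s);
}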
--
Jerry Feldman
Colin Carter
This is what I thought, but because, as you said, some programmers use the symbol NULL as a substitute for an integer zero, I became confused.
That's from the pre-ANSI days, when void and 'void *' didn't exist. The C standard defines NULL to be (void *)0. Philipp
On Wednesday 20 April 2005 09:39, Philipp Thomas wrote:
Colin Carter
[18 Apr 2005 19:48:20 +1000]: This is what I thought, but because, as you said, some programmers use the symbol NULL as a substitute for an integer zero, I became confused.
That's from the pre-ANSI days, when void and 'void *' didn't exist. The C standard defines NULL to be (void *)0.
Philipp
Thanks, I'll keep that definition in mind. Colin
Dear friends, Colin and Jerry! Would you please stop flooding this mailing list with helicopters? It was (not accidentally) named 'suse-programming', after all. Best wishes, AC
On Tuesday 19 April 2005 7:39 pm, Philipp Thomas wrote:
That's from the pre-ANSI days, when void and 'void *' didn't exist. The C standard defines NULL to be (void *)0. Not entirely true: "NULL which expands to an implementation-defined null pointer constant". ISO/IEC 9899:1999
Also, the void keyword did exist, at least back in 1980 when I was working
on porting Xenix to a Raytheon machine. I believe that ANSI '89 defined the
(void *) type as the universal pointer.
The null pointer constant: "An integer constant expression with the value of
0, or such an expression cast to type void *".
In C, both K&R and ANSI, the integer 0 has a special consideration.
--
Jerry Feldman
On Wednesday 20 April 2005 14:16, Jerry Feldman wrote:
On Tuesday 19 April 2005 7:39 pm, Philipp Thomas wrote:
That's from the pre-ANSI days, when void and 'void *' didn't exist. The C standard defines NULL to be (void *)0.
Not entirely true: "NULL which expands to an implementation-defined null pointer constant". ISO/IEC 9899:1999
A rather vague definition. I was recently looking at what to do with NULL in C++. So I looked it up in Stroustrup. In section 5.1.1 "Zero" there is the following paragraph:

In C, it has been popular to define a macro NULL to represent the zero pointer. Because of C++'s tighter type checking, the use of plain 0, rather than any suggested NULL macro, leads to fewer problems. If you feel you must use NULL, use "const int NULL = 0;" The const qualifier prevents accidental redefinition of NULL and ensures that NULL can be used where a constant is required.

I really don't understand what Bjarne means. Particularly the mention of 'tighter type checking' and 'fewer problems' seem odd. Of course macros are evil so that is a good reason not to use NULL. The ISO C++ standard only refers to NULL in the context of the <cxxx> header files.
--
___________________________________
Michael Stevens
Systems Engineering
34128 Kassel, Germany
Navigation Systems, Estimation and Bayesian Filtering
http://bayesclasses.sf.net
___________________________________
On Wednesday 20 April 2005 11:38 am, Michael Stevens wrote:
On Wednesday 20 April 2005 14:16, Jerry Feldman wrote:
On Tuesday 19 April 2005 7:39 pm, Philipp Thomas wrote:
That's from the pre-ANSI days, when void and 'void *' didn't exist. The C standard defines NULL to be (void *)0.
Not entirely true: "NULL which expands to an implementation-defined null pointer constant". ISO/IEC 9899:1999
A rather vague definition. I was recently looking at what to do with NULL in C++. So I looked it up in Stroustrup. In section 5.1.1 "Zero" there is the following paragraph:
In C, it has been popular to define a macro NULL to represent the zero pointer. Because of C++'s tighter type checking, the use of plain 0, rather than any suggested NULL macro, leads to fewer problems. If you feel you must use NULL, use "const int NULL = 0;" The const qualifier prevents accidental redefinition of NULL and ensures that NULL can be used where a constant is required.
I really don't understand what Bjarne means. Particularly the mention of 'tighter type checking' and 'fewer problems' seem odd. Of course macros are evil so that is a good reason not to use NULL. The ISO C++ standard only refers to NULL in the context of the <cxxx> header files.

The C++ language has true constants. First, every C++ function must be fully prototyped, unlike C. NULL is a macro defined in stdio.h. In the C++ context, it might be more proper to use:
const void * NULL = 0;
Then, the C++ NULL constant is a true pointer with a value of 0.
const int NULL = 0 is somewhat problematical in a 64-bit environment, since
pointers are 64-bits. But, since C++ is fully prototyped, it will be
widened appropriately.
--
Jerry Feldman
const void * NULL = 0; Then, the C++ NULL constant is a true pointer with a value of 0. const int NULL = 0 is somewhat problematical in a 64-bit environment, since pointers are 64-bits. But, since C++ is fully prototyped, it will be widened appropriately.
Not on stdarg variable argument lists, if the underlying function expects
a pointer.
Matthias
--
Matthias Hopf
On Thursday 21 April 2005 11:23, Matthias Hopf wrote:
const void * NULL = 0; Then, the C++ NULL constant is a true pointer with a value of 0. const int NULL = 0 is somewhat problematical in a 64-bit environment, since pointers are 64-bits. But, since C++ is fully prototyped, it will be widened appropriately.
Not on stdarg variable argument lists, if the underlying function expects a pointer.
I guess Bjarne recommends the 'const int NULL = 0' definition purely for C backward compatibility. In C++ it would seem that the 'const void * NULL = 0' definition would be a good thing as opposed to the literal '0' which Bjarne is recommending. Michael ___________________________________ Michael Stevens Systems Engineering 34128 Kassel, Germany Navigation Systems, Estimation and Bayesian Filtering http://bayesclasses.sf.net ___________________________________
Michael, On Thursday 21 April 2005 02:48, Michael Stevens wrote:
...
In C++ it would seem that the 'const void * NULL = 0' definition would be a good thing as opposed to the literal '0' which Bjarne is recommending.
I used these routinely in my C++ work:

const void *NIL = 0;
const char NUL = 0;

I would use a naked 0 in source code only where the context was actually integer; where it was pointer or character, I'd use one of these.
Michael
Randall Schulz
On Thursday 21 April 2005 22:31, Randall R Schulz wrote:
Michael,
On Thursday 21 April 2005 02:48, Michael Stevens wrote:
...
In C++ it would seem that the 'const void * NULL = 0' definition would be a good thing as opposed to the literal '0' which Bjarne is recommending.
I used these routinely in my C++ work:
const void *NIL = 0; const char NUL = 0;
I would use a naked 0 in source code only where the context was actually integer; where it was pointer or character, I'd use one of these.
Michael
Randall Schulz
I like this idea - it makes clear what you mean, whereas (to me anyway) there is always confusion with NULL, which is not even NUL. How do you write that a pointer ptr is pointing nowhere? That is, the value of the pointer is zero. I mean so that it is clear that it is not pointing to a zero value. Pardon my ignorance, but my preference is for FORTRAN and we are positively discouraged from using pointers because all of our variables are in fact addresses (pointers) and not 'values' as in C. Regards, Colin
Colin, On Thursday 21 April 2005 06:22, Colin Carter wrote:
On Thursday 21 April 2005 22:31, Randall R Schulz wrote:
Michael,
On Thursday 21 April 2005 02:48, Michael Stevens wrote:
...
In C++ it would seem that the 'const void * NULL = 0' definition would be a good thing as opposed to the literal '0' which Bjarne is recommending.
I used these routinely in my C++ work:
const void *NIL = 0; const char NUL = 0;
I should have written this, for the sake of pedantry:

const void *NIL = (void *) 0;
const char NUL = (char) 0;
...
I like this idea - it makes it clear about what you mean, whereas (to me anyway) there is always confusion with NULL which is not even NUL.
How do you write that a pointer ptr is pointing nowhere?
A pointer of any sort that compares equal to 0, or which was initialized or assigned 0, is guaranteed not to be a valid pointer. The consequences of dereferencing such a pointer are undefined. On some systems it will generate a fault of some sort. On others you silently read or corrupt low memory (virtual or physical, as the case may be).
That is, the value of the pointer is zero. I mean so that it is clear that it is not pointing to a zero value. Pardon my ignorance, but my preference is for FORTRAN and we are positively discouraged from using pointers because all of our variables are in fact addresses (pointers) and not 'values' as in C.
As far as FORTRAN goes, de gustibus...
Regards, Colin
Randall Schulz
On Thursday 21 April 2005 5:23 am, Matthias Hopf wrote:
const void * NULL = 0; Then, the C++ NULL constant is a true pointer with a value of 0. const int NULL = 0 is somewhat problematical in a 64-bit environment, since pointers are 64-bits. But, since C++ is fully prototyped, it will be widened appropriately.
Not on stdarg variable argument lists, if the underlying function expects a pointer.

stdarg variable argument lists are C, not C++, although C++ does allow them because C requires them. But you are correct: in a variable length argument list (eg. prototyped with ...), the implementation is not required to widen an int.
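This is exactly where the int-versus-pointer distinction bites. A hedged example (count_strings is made up, but the pattern is the same as execl's null-terminated argument list):

#include <stdarg.h>
#include <stdio.h>

/* Counts string arguments up to a null pointer terminator. */
static int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *p = first;

    va_start(ap, first);
    while (p != NULL) {
        n++;
        p = va_arg(ap, const char *);
    }
    va_end(ap);
    return n;
}

int main(void)
{
    /* Safe: the terminator is explicitly pointer-sized. */
    printf("%d\n", count_strings("a", "b", (char *) 0));

    /* Risky on LP64: a bare 0 is passed as a 32-bit int, but va_arg
       pulls a full 64-bit pointer off the argument list. */
    /* count_strings("a", "b", 0); */
    return 0;
}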
--
Jerry Feldman
Jerry Feldman wrote:
On Sat, 16 Apr 2005 13:17:22 +1000 Colin Carter
wrote: Yes. In particular FORTRAN had a nice standard of first BYTE holding
size.
Now it "appears" that C++ does not need to have this number because the string ends with a NULL (or strictly correct being a NUL. I hate
Bill
Gates and mates just changing definitions.)
NULL is a macro in C referring to the null pointer. The string termination is simply a null byte, not NULL. Different languages have different standards. The length byte in a FORTRAN string is hidden from the programmer. Languages like BASIC have some very sophisticated string manipulation routines built-in.
But, it is only hidden from the C++ programmer. Do C++ programmers think that the OS has no idea where the allocated string space ends? The OS is now burdened with keeping track of how much space it has allocated for the string; and it will probably have to abandon that
piece of
memory and allocate a new chunk of memory and shift the rubbish over when the (blind) C++ programmer inserts too many characters for the allocated space.
The OS has nothing to do with this. The C++ string is a class. It uses the C-style string as its basis, so if there is any changing of the size of a C++ string, no buffer overflow will occur if the implementation coded the underlying class correctly. In C, you can implement a similar thing:

typedef struct _String {
    size_t slen;  /* Length of string - eg. strlen */
    size_t alen;  /* Amount allocated for the string */
    char *s;      /* pointer to string */
} STRING;

slen is a bit redundant but used for efficiency. alen is used because we might allocate more than slen. Here is a possible string copy function:

int STRING_Copy(STRING *dstr, const STRING *sstr)
{
    if (dstr->alen < (sstr->slen + 1)) {
        if (dstr->alen > 0)      /* if we have an allocated string, free it */
            free(dstr->s);
        dstr->s = malloc(sstr->alen);
        if (dstr->s == NULL) {   /* allocation failed */
            dstr->alen = 0;
            return -1;           /* Return failure */
        }
        dstr->alen = sstr->alen;
    }
    strcpy(dstr->s, sstr->s);
    return 0;
}

If the programmer always uses the supplied functions, then this method will work fine. Note that I did not code for cases where dstr or sstr are NULL since most implementations don't do this for strcpy.
Why not use realloc instead of free & malloc ?
All must agree that a bad programmer will over-run string space in any
language!
Not entirely true. If the string structure is hidden from the programmer, then the implementation will protect the strings. Of course, a poorly written implementation could cause the same thing.
I recently discovered a serious error in a compiler at the highest optimization level, where a loop: for (i = x; i > min && i < max; i += step)
The compiler generated the following: for (i = x; i < max; i += step)
This worked fine when step was > 0, but failed when step was < 0. The bug has since been corrected.
Which compiler (just curious) ? -- William A. Mahaffey III --------------------------------------------------------------------- Remember, ignorance is bliss, but willful ignorance is LIBERALISM !!!!
On Monday 18 April 2005 8:45 am, William A. Mahaffey III wrote:
Jerry Feldman wrote:
int STRING_Copy(STRING *dstr, const STRING *sstr)
{
    if (dstr->alen < (sstr->slen + 1)) {
        if (dstr->alen > 0)      /* if we have an allocated string, free it */
            free(dstr->s);
        dstr->s = malloc(sstr->alen);
        if (dstr->s == NULL) {   /* allocation failed */
            dstr->alen = 0;
            return -1;           /* Return failure */
        }
        dstr->alen = sstr->alen;
    }
    strcpy(dstr->s, sstr->s);
    return 0;
}

If the programmer always uses the supplied functions, then this method will work fine. Note that I did not code for cases where dstr or sstr are NULL since most implementations don't do this for strcpy.
Why not use realloc instead of free & malloc ?

That is because, in this example, there may not have been a pre-allocated string. But I could have used realloc by setting the pointer to NULL first; I was just offering a quick example of how to implement a string in C.

if (dstr->alen == 0) {
    /* if we do not have an allocated string, make the ptr NULL */
    dstr->s = NULL;
}
dstr->s = realloc(dstr->s, sstr->alen);
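Put together, a realloc-based version might look like this (just a sketch along the lines of the original; it assumes alen == 0 means s was never allocated):

int STRING_CopyR(STRING *dstr, const STRING *sstr)
{
    if (dstr->alen < sstr->slen + 1) {
        char *p;

        if (dstr->alen == 0)
            dstr->s = NULL;          /* realloc(NULL, n) behaves like malloc */
        p = realloc(dstr->s, sstr->slen + 1);
        if (p == NULL)
            return -1;               /* old buffer is still intact */
        dstr->s = p;
        dstr->alen = sstr->slen + 1;
    }
    strcpy(dstr->s, sstr->s);
    dstr->slen = sstr->slen;
    return 0;
}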
Which compiler (just curious) ? I'd rather not say, but it is a proprietary compiler and the bug was quickly fixed. -- Jerry Feldman
Boston Linux and Unix user group http://www.blu.org PGP key id:C5061EA9 PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9
Stefan,

On Friday 15 April 2005 07:44, Stefan Hundhammer wrote:
On Friday 15 April 2005 14:43, Colin Carter wrote:
Yes, I have noticed how most modern programs waste time polling the myriad of open windows ....
Urgh - folks, can we stop the urban legends at some point, please?
...
Thank you.

I get really sick of the same old uninformed criticisms of "programs and programmers these days." Clearly the people who spout these critiques don't understand what it's like to develop software today. Between the feature and schedule pressure commonly brought to bear on software developers, it's no surprise software isn't what it should be.

Are modern programs optimum in their use of computational resources? Hell no. Management wouldn't tolerate what it takes in terms of development investment to make it so, and it has nothing to do with O-O languages or lazy programming.

As program and system size and complexity have grown, technologies like Object-Oriented languages are a necessity for managing that complexity. That is the primary reason those techniques were developed, and they are by no means trivial in what they bring to the working programmer and software designer.

Ultimately, all engineering embodies trade-offs (technical, social, organizational, market, etc.), and software engineering is no exception.

Randall Schulz
Bingo!

On Saturday 16 April 2005 12:04, Randall R Schulz wrote:
Stefan,
On Friday 15 April 2005 07:44, Stefan Hundhammer wrote:
On Friday 15 April 2005 14:43, Colin Carter wrote:
Yes, I have noticed how most modern programs waste time polling the myriad of open windows ....
Urgh - folks, can we stop the urban legends at some point, please?
...
Thank you.
I get really sick of the same old uninformed criticisms of "programs and programmers these days." Clearly the people who spout these critiques don't understand what it's like to develop software today. Between the feature and schedule pressure commonly brought to bear on software developers, it's no surprise software isn't what it should be.
Are modern programs optimum in their use of computational resources? Hell no. Management wouldn't tolerate what it takes in terms of development investment to make it so, and it has nothing to do with O-O languages or lazy programming.
As program and system size and complexity have grown, technologies like Object-Oriented languages are a necessity for managing that complexity. That is the primary reason those techniques were developed, and they are by no means trivial in what they bring to the working programmer and software designer.
Ultimately, all engineering embodies trade-offs (technical, social, organizational, market, etc.), and software engineering is no exception.
Randall Schulz
Yes Randall, that is the last nail in the coffin. I have found it gets worse and worse. Makes me wanna run away. Once we went to the corner shop which had the best burgers, but now the best burger joint is out of business because the dumb buyers go for the cheapest burger. Regards, Colin
On Saturday 16 April 2005 00:44, Stefan Hundhammer wrote:
On Friday 15 April 2005 14:43, Colin Carter wrote:
Yes, I have noticed how most modern programs waste time polling the myriad of open windows ....
Urgh - folks, can we stop the urban legends at some point, please?
Everybody who has a minimum clue of how any kind of GUI programming works should know that those programs spend most of their time waiting on a socket, waiting for user input - on X11 (no matter what toolkit is being used - KDE, Gtk, OSF/Motif, Xt, ...) and on Win32. There is no "busy wait" in any such program I know.
Sorry mate - no myth. While working in the UK I became very frustrated waiting for some not-so-large number-cruncher to execute. I noticed that if I closed or covered all the windows I could, then the code ran faster. The young guy (an exceptional young programmer) revisited his code and 'disabled' every button on the screen except 'stop', and the execution time dropped from over 10 minutes to less than one minute. You will also find that if your very simple window covers the desktop the code runs faster. Well, in M$ anyway; I'm a Linux newbie.
The other myths about object orientation etc. are similar: If you do non-trivial software, you simply NEED that kind of thing so you can have the abstraction level you will want if the software is to be maintained at all, let alone for extended periods of time.

Not true: I have seen OO maintenance programmers really stuff up systems because they were unaware of how the 'objects' are relied on by other parts of the code. One such 'stuff up' cost the company months of work.
Most of the finite element programs people were so fond of in the FORTRAN times are written by now - there are probably generic programs for that kind of thing.

I don't think so. My best friend is steering the English uni gurus in this field - they can't get the finite element code accurate enough; he keeps exposing deficiencies in the code. His code runs on high-speed multi-processor machines (he gave up on the Alpha) for one or two days at a time. He is also quite critical of the 'slowness' of the code. No, the finite element code has a way to go yet.
Today's software is supposed to do everything, including making coffee, <snip>

I agree, but why? I would be happier if the code did less, but was more robust and ran faster.
C's string handling, for instance, sucks - it is the source of most security holes that need to be fixed. Buffer overflows happen because C does not have a concept of variable-length strings - it only has character pointers. What a nightmare.

Maybe, but I think there is a far worse problem: C strings do not have well-defined lengths, which means that the O.S. is always messing around allocating memory and leaving unused bits floating.
I have been programming since the mid-80s, and even though I also tend to bitch about many things, things have improved a lot since then. No more rebooting because a null pointer in C overwrote your PC's interrupt table at 0000:0000 on MS-DOS. Anybody remember what a PITA that was?

A beginner! Yes, I agree. But that was a function of small machines having limited memory and OS. The machine I worked on in the seventies, a 60-bit Cyber, was multitasking in a big way (e.g. it handled users from hundreds of kilometres away) and the only thing that crashed it was the cleaning lady unplugging it at 6am to use the power point. (The thing re-started so efficiently that it took the operators a month to work out why things were 'funny'.)
<snip>
And today, it's no more segfaults because the C string handling is so dumb.
I have to disagree with you and agree with Synthetic Cartoonz, who said: "Konqueror segfaults more than anything else I know. Is that written in C?" Except for Kaffeine Media Player, which crashes with a segfault at every 'close'.
Use modern tools. Use tools like C++ - or, for that matter, C#, or even Java. Use predefined (meaning: well-tested) classes for common purposes. Do not repeat everybody's (and their mothers') mistakes by writing your own because you think you can do a better job at that.

Oh no, and oh yes we can!
You argue this comes at a price - and the price is performance and system resources. That may be right, but I would rather sacrifice some MB of RAM than experience random crashes because nobody can debug software of that complexity written with outdated tools any more. My time (and, for that matter, my nerves) are way more precious to me than some MB of RAM saved.

What do you mean, "sacrifice memory"? That is sacrosanct! Any programmer who says "there's plenty of memory" would not get a job with me. A good programmer keeps a tight rein on RAM and cpu time.

I could keep ranting a lot more like that, but other duties are calling right now. ;-)

Yeah, me too.
Just my 2 Cents (well, make that 4 - or 6) ;-)

Mine too ;-)

Stefan Hundhammer
Penguin by conviction. YaST2 Development
Programmed with MS-DOS, SunOS, Solaris, HP-UX, Win32 (just enough to hate it) and X11 / OSF/Motif, Qt since 1984 with about a dozen programming languages

A fair range there. But no really big machines, and no DEC/VAX. About a century ago my mate was working in Scotland and I was in Australia, and we were interacting in real time via "VAX Phone". That is like an up-market version of chat. And we transmitted large chunks of code "at the touch of a button". VAX was so far ahead of its time. It is a pity that the PC money giants bought and killed it.
And I must agree with you about the M$ system. Hence my desire to try to learn X11 and Xt. Now give me a break, don't say I should jump right into Qt - I've got to learn to walk first ;-) I am still trying to clear my head of the M$ stuff.

Cheers, and thanks for the debate, Colin
On Saturday 16 April 2005 04:59, Colin Carter wrote:
On Saturday 16 April 2005 00:44, Stefan Hundhammer wrote:
On Friday 15 April 2005 14:43, Colin Carter wrote:
Yes, I have noticed how most modern programs waste time polling the myriad of open windows ....
Urgh - folks, can we stop the urban legends at some point, please?
Everybody who has a minimum clue of how any kind of GUI programming works should know that those programs spend most of their time waiting on a socket, waiting for user input - on X11 (no matter what toolkit is being used - KDE, Gtk, OSF/Motif, Xt, ...) and on Win32. There is no "busy wait" in any such program I know.
Sorry mate - no myth. While working in the UK I became very frustrated waiting for some not-so-large number-cruncher to execute. I noticed that if I closed or covered all the windows I could, then the code ran faster. The young guy (an exceptional young programmer) revisited his code and 'disabled' every button on the screen except 'stop', and the execution time dropped from over 10 minutes to less than one minute. You will also find that if your very simple window covers the desktop the code runs faster. Well, in M$ anyway; I'm a Linux newbie.
Certainly such accidental coupling of execution to display code is not uncommon. You just need to watch the buttons flicker on the YaST 'System Backup' applet to realise this even happens to SuSE!

Michael
--
___________________________________
Michael Stevens Systems Engineering
34128 Kassel, Germany
Navigation Systems, Estimation and Bayesian Filtering
http://bayesclasses.sf.net
___________________________________
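A minimal sketch of how such coupling typically arises: a compute loop that drains the GUI event queue on every pass. The compute and event helpers here are hypothetical stand-ins, and plain Xlib is used purely for illustration:

#include <X11/Xlib.h>

/* do_compute_step() and handle_event() are hypothetical stand-ins */
extern int  do_compute_step(void);
extern void handle_event(XEvent *ev);

void compute_with_gui(Display *dpy)
{
    XEvent ev;

    while (do_compute_step()) {
        /* Drain pending events each iteration so the GUI stays live.
         * Every visible, enabled widget feeds Expose and motion events
         * into this inner loop, so covering the windows or disabling
         * the buttons genuinely shrinks the work done per iteration. */
        while (XPending(dpy)) {
            XNextEvent(dpy, &ev);
            handle_event(&ev);
        }
    }
}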
On Friday 15 April 2005 2:23 am, Colin Carter wrote:
Programming standards have deteriorated significantly.
My argument: Some years ago academics developed design systems (Jordan etc) to stop young programmers writing spaghetti code (with lots of 'goto' statements). Then Knuth developed PASCAL to force young programmers to write properly structured code (goto statement not included).
Well, nobody could get a job if they hadn't done one of these 'design' courses and didn't know PASCAL.
Turns out that we old guys (especially FORTRAN scientific types) had been writing structured code for years, and we controlled our goto statements.
And 'they' had to add a goto to PASCAL (we grinned) because without it the code can become very inefficient (especially with PUSH/POP overheads).
Now the latest is "Object Oriented" (shouldn't that actually be orientated?) code with lots of over-heads. I know: there's plenty of cpu power and plenty of memory now. So now we are forced to buy hundreds of MB of RAM because so many young programmers say "there's plenty of memory". Nothing runs in 4 MB any more - too much over-head.

As has been previously answered, Wirth developed Pascal, not Knuth, and Yourdon developed structured design. Programming standards (eg. ANSI, ISO, et al.) did not eliminate goto statements. Many programming standards were developed in business so that code written by one programmer could be maintained by others, resulting in a much lower cost of maintenance.
I once worked on a COBOL Personal Trust system for a bank. The code looked
like this:
ALTER R5RETURN TO GO TO S1.
GOTO P1.
S1.
<--- more of the same --->
:
P1.
<--- do some processing --->
GOTO R5RETURN.
:
R5RETURN.
GOTO.
To a COBOL programmer, this code was horrendous. However, I had a background
in assembler, and the code had been ported from IBM assembler to COBOL on a
Burroughs mainframe. What this was in assembler was something like (if I
can remember old 360 mainframe assembler):
BAL r5, s1
Or branch to s1, storing the return address in register 5.
In essence these were a series of subroutine calls. The COBOL ALTER statement
was a self-modifying code statement that inserted the return address into
the GOTO statement in paragraph R5RETURN.
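For comparison, structured COBOL expresses the same call-and-return in a single statement and lets the compiler manage the return point - no ALTER, no return paragraph (assuming P1 is an ordinary paragraph):

PERFORM P1.

That one statement is everything the BAL/ALTER machinery above was doing by hand.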
In today's world, programming standards for languages like COBOL, FORTRAN,
C, C++, PASCAL, JAVA are very important because we write code that must
work on many different platforms and OS's. I want my code to be able to run
effectively on Linux (32 and 64-bit), HP-UX, Solaris, Tru64 Unix and more.
If I follow the language standard and the Unix standards, my code should be
portable.
--
Jerry Feldman
On Friday 15 April 2005 22:05, Jerry Feldman wrote:
On Friday 15 April 2005 2:23 am, Colin Carter wrote:
Programming standards have deteriorated significantly.
My argument: Some years ago academics developed design systems (Jordan etc) to stop young programmers writing spaghetti code (with lots of 'goto' statements). Then Knuth developed PASCAL to force young programmers to write properly structured code (goto statement not included).
Well, nobody could get a job if they hadn't done one of these 'design' courses and didn't know PASCAL.
Turns out that we old guys (especially FORTRAN scientific types) had been writing structured code for years, and we controlled our goto statements.
And 'they' had to add a goto to PASCAL (we grinned) because without it the code can become very inefficient (especially with PUSH/POP overheads).
Now the latest is "Object Oriented" (shouldn't that actually be orientated?) code with lots of over-heads. I know: there's plenty of cpu power and plenty of memory now. So now we are forced to buy hundreds of MB of RAM because so many young programmers say "there's plenty of memory". Nothing runs in 4 MB any more - too much over-head.
As has been previously answered, Wirth developed Pascal, not Knuth, and Yourdon developed structured design.

Oops - yes, you are right about PASCAL, and I never cared for Yourdon, hence the incorrect spelling of the name of a guy who was trying to tell us how to suck eggs.

Programming standards (eg. ANSI, ISO, et al.) did not eliminate goto statements. Many programming standards were developed in business so that code written by one programmer could be maintained by others, resulting in a much lower cost of maintenance.
I once worked on a COBOL Personal Trust system for a bank. The code looked like this:
ALTER R5RETURN TO GO TO S1.
GOTO P1.
S1.
<--- more of the same --->

P1.
<--- do some processing --->
GOTO R5RETURN.

R5RETURN.
GOTO.
To a COBOL programmer, this code was horrendous. However, I had a background in assembler, and the code had been ported from IBM assembler to COBOL on a Burroughs mainframe. What this was in assembler was something like (if I can remember old 360 mainframe assembler):

BAL r5, s1

Or branch to s1, storing the return address in register 5. In essence these were a series of subroutine calls. The COBOL ALTER statement was a self-modifying code statement that inserted the return address into the GOTO statement in paragraph R5RETURN.
In today's world, programming standards for languages like COBOL, FORTRAN, C, C++, PASCAL, JAVA are very important because we write code that must work on many different platforms and OS's. I want my code to be able to run effectively on Linux (32 and 64-bit), HP-UX, Solaris, Tru64 Unix and more. If I follow the language standard and the Unix standards, my code should be portable.

Yes, I agree with having standards. FORTRAN had them a long time ago. It even had the rule that there were no 'reserved' words.
If you're an old COBOL guy you are probably aware that a very large proportion of today's active code is still in COBOL (and FORTRAN).
-- Jerry Feldman
Boston Linux and Unix user group http://www.blu.org PGP key id:C5061EA9 PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9
Colin
On Friday 15 April 2005 8:50 am, Colin Carter wrote:
If you're an old COBOL guy you are probably aware that a very large proportion of today's active code is still in COBOL (and FORTRAN).

Yes. My first language was FORTRAN on an IBM 7044. I learned COBOL and Assembler a bit later.
There are very few new applications written in COBOL, but there are COBOL
compilers available on all systems.
BTW: I am aware of the memory layouts of both the AMD64 (in both 32-bit and
64-bit mode) as well as EM64T (Intel's x86-64) as well as the Itanium.
The advantage of the AMD64 and EM64T over the Itanium today is that existing
32-bit code and 32-bit Operating systems run well. The AMD64 handles NUMA
better than Intel today. Both these chips will serve the desktops and low
end servers well for the next few years, and the Itanium is better for the
high end servers.
Note that Linux has been a 64-bit OS since 1994 when it was ported to the
Alpha by Jim Paradis and Linus. Actually Jim did the first port of 32-bit
Linux, then Linus followed shortly with a full 64-bit port.
But, going back to standards. I was involved in porting some Burroughs COBOL
apps to IBM 370 COBOL. What a mess:
Numbers: both used BCD numbers - in COBOL, for example, picture 9999.99.
In IBM, the above picture would pack to 4 bytes, because IBM always used a
nybble for the sign. Burroughs addressed at the nybble level, and this
picture would fit into 3 bytes because Burroughs did not require a sign
nybble. With the picture s9999.99, on Burroughs this would be 3.5 bytes or 7
nybbles.
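To make the packing concrete, take the value 1234.56 under picture 9999.99 - a worked example of standard IBM packed decimal (the exact Burroughs layout may have differed):

On IBM, six digits plus a sign nybble make 7 nybbles, padded with a leading
zero to 4 bytes:

    01 23 45 6C     (C is the positive-sign nybble)

On Burroughs, with no sign nybble required, the same six digits fit in
exactly 3 bytes:

    12 34 56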
Another Burroughsism:
01 foo.
02 bar1 picture 99999.99.
02 bar2 picture 99999.99.
01 fubar redefines foo picture 99999999999999.
In this case you could add fubar to another similar structure, and as long
as the individual fields did not overflow, you would get the right amount. On IBM
you were guaranteed to crash with a 0C7 (data) exception, since bar1's sign
nybble would land in a digit position of fubar, making it invalid decimal data.
This is one example of why programming standards (as in ISO or ANSI) are
important. Back in those days, interoperability was something that was not
done very often. Today, as I mentioned, when I write some code, it must be
portable to 64-bit, 32-bit, little-endian, big-endian, and various OS's. By
writing to a standard, my code should be reasonably portable.
--
Jerry Feldman
Hi fellows. Sorry I missed the big debate. I was asleep (literally, because I am in Sydney, Australia). I'll continue below...

On Friday 15 April 2005 23:41, Jerry Feldman wrote:
On Friday 15 April 2005 8:50 am, Colin Carter wrote:
If you're an old COBOL guy you are probably aware that a very large proportion of today's active code is still in COBOL (and FORTRAN).
Yes. My first language was FORTRAN on an IBM 7044. I learned COBOL and Assembler a bit later.

My first machine was a smaller version ("desk top"), prior to these big ones; followed by an IBM, which I think was a 7095? (Big boy of the military era.) FORTRAN, Assembler, COBOL = the normal learning curve, but I never worked in COBOL.

There are very few new applications written in COBOL, but there are COBOL compilers available on all systems.

You may be correct about COBOL now, but only a few years ago I was reading that most of the active code (mostly written years ago for the big insurance companies, banks et cetera) was still COBOL. Maybe!
BTW: I am aware of the memory layouts of both the AMD64 (in both 32-bit and 64-bit mode) as well as EM64T (Intel's x86-64) as well as the Itanium.

I'll have to pick your brains on this one if I am to code the AMD64 properly. :-)
<snip>
But, going back to standards. I was involved in porting some Burroughs COBOL apps to IBM 370 COBOL. What a mess: <snip> nybbles.

Yeah man! I don't hear this word much these days. Remember 7-bit ASCII code?
Some people think that BCD numbers were stupid, not realizing that they were important in money handling (preventing the crooked programmer from snipping off the fractional cents/pennies).
Jerry Feldman
Regards, Colin
participants (12)
- ac
- Colin Carter
- Jeffrey L. Taylor
- Jerry Feldman
- Jerry Feldman
- Matthias Hopf
- Michael Stevens
- Philipp Thomas
- Randall R Schulz
- Stefan Hundhammer
- Synthetic Cartoonz
- William A. Mahaffey III