This code is based on a recent patch to glheretic needed to make
heretic work correctly when compiled using gcc-3.3.*:
#include
#include
void strupr_OLD (char *s)
{
while (*s) {
*s++ = toupper(*s);
}
}
void strupr_NEW (char *s)
{
while (*s) {
*s = toupper(*s);
s++;
}
}
int main(void)
{
char test_old[] = "this is using strupr_OLD";
char test_new[] = "this is using strupr_NEW";
strupr_OLD(test_old);
printf("%s\n", test_old);
strupr_NEW(test_new);
printf("%s\n", test_new);
return 0;
}
When compiled WITH optimization:
markgray@soyo:/usr/src/packages/BUILD/bug> gcc -O2 -Wall -W bug.c
markgray@soyo:/usr/src/packages/BUILD/bug> ./a.out
HIS IS USING STRUPR_OLD
THIS IS USING STRUPR_NEW
markgray@soyo:/usr/src/packages/BUILD/bug>
Notice that there is NO warning given and that the old code has an
off-by-one error in it.
When compiled WITHOUT optimization:
markgray@soyo:/usr/src/packages/BUILD/bug> gcc -Wall -W bug.c
bug.c: In function `strupr_OLD':
bug.c:7: warning: operation on `s' may be undefined
markgray@soyo:/usr/src/packages/BUILD/bug> ./a.out
THIS IS USING STRUPR_OLD
THIS IS USING STRUPR_NEW
markgray@soyo:/usr/src/packages/BUILD/bug>
It does give a nice warning message -- but the code it produces works
as the original programmer intended. Why is there no warning given
when optimization is used -- and the warning is needed?
The programmer who posted the patch said that the change is due to an
ISO C change made to allow more aggressive optimization. Does anyone
have a list of what kinds of problematic code one should watch out for
in old C code?
I would argue that the old code has long been considered to be a
perfectly well defined C idiom (unlike the newby mistake i=i++), and
that code very much like it is very common in older C programs --
glheretic, for one, is still very buggy even after this one fix, so I
suspect that there are more "ISO C gotchas" lurking in it.
Looking at the kernel source code:
markgray@soyo:/usr/src/linux> find -name '*.[ch]' -print0|xargs --null \
grep -P '\*\w+\+\+\s*='|wc -l
3137
markgray@soyo:/usr/src/linux>
That's 3137 occurrences of '*symbol++ =' -- and looking at them in
context shows that a lot of them are part of a 'while(*something)'
loop. What exactly makes the old heretic code so wrong in the eyes of
ISO C? A list of example problematic code would be most welcome. (A
proper warning from gcc when using optimization would be nice as
well.)