[opensuse-programming] isprint() and isgraph() are locale sensitive, but ...
Can anyone suggest a reason why the C functions isprint() and isgraph() are both locale sensitive (as they should be), whereas the regex character classes [:print:] and [:graph:] are not (when used in Perl)?

/Per Jessen, Zürich
---------------------------------------------------------------------
To unsubscribe, e-mail: opensuse-programming+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-programming+help@opensuse.org
Per Jessen wrote:
Can anyone suggest a reason why C functions isprint() and isgraph() are both locale sensitive (as they should be), whereas regex character classes [:print:] and [:graph:] are not (when used in perl) ?
same thing for isalnum() and [:alnum:]. Does anyone know where the correct behaviour of 1) the C-function and 2) the regex character class is defined? /Per Jessen, Zürich
On Thu, 11 Sep 2008 08:59:14 +0200, Per Jessen wrote:
Does anyone know where the correct behaviour of 1) the C-function
The ISO C Standard, which has to be bought.
and 2) the regex character class is defined?
I'd guess it's defined in POSIX. Philipp
Philipp Thomas wrote:
On Thu, 11 Sep 2008 08:59:14 +0200, Per Jessen wrote:
Does anyone know where the correct behaviour of 1) the C-function
The ISO C Standard, which has to be bought.
Surely there is somewhere on-line where the behaviour is described as well? I don't know many C-programmers that have a copy of the ISO standard.
and 2) the regex character class is defined?
I'd guess it's defined in POSIX.
Another standard you have to buy, IIRC. Again, surely the character classes (and the locale-dependent variations) are described somewhere on-line? /Per Jessen, Zürich
On Thu, 2008-09-11 at 14:29 +0200, Per Jessen wrote:
Philipp Thomas wrote:
On Thu, 11 Sep 2008 08:59:14 +0200, Per Jessen wrote:
Does anyone know where the correct behaviour of 1) the C-function
I think it is more useful to read up on locales themselves, for example on this page: http://linux.about.com/library/cmd/blcmdl7_locale.htm
At least that tells you which locale-related environment variables affect which of these functions. After that, it depends on how each specific locale defines these things. What does the Thai locale define as a printable character? That is entirely up to the Thai locale itself. isprint and friends are merely the messengers, so to speak.
-- Roger Oberholtzer OPQ Systems / Ramböll RST Ramböll Sverige AB Kapellgränd 7 P.O. Box 4205 SE-102 65 Stockholm, Sweden Office: Int +46 8-615 60 20 Mobile: Int +46 70-815 1696 And remember: It is RSofT and there is always something under construction. It is like talking about large city with all constructions finished. Not impossible, but very unlikely.
On 09/09/2008 01:28 PM, Per Jessen wrote:
Can anyone suggest a reason why C functions isprint() and isgraph() are both locale sensitive (as they should be), whereas regex character classes [:print:] and [:graph:] are not (when used in perl) ?
Having implemented isprint() et al. (the ctype functions) for a C compiler, I can say definitively that the C standard requires them, along with many other C library functions, to be locale sensitive. Perl is governed by different standards, and regex itself is defined by POSIX.
-- Jerry Feldman <gaf@blu.org> Boston Linux and Unix user group http://www.blu.org PGP key id: 537C5846 PGP Key fingerprint: 3D1B 8377 A3C0 A5F2 ECBB CA3B 4607 4319 537C 5846
participants (4): Jerry Feldman, Per Jessen, Philipp Thomas, Roger Oberholtzer