[Bug 248859] New: Wrong encodings in gcal messages
https://bugzilla.novell.com/show_bug.cgi?id=248859

Summary: Wrong encodings in gcal messages
Product: openSUSE 10.2
Version: Final
Platform: Other
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Translations
AssignedTo: kssingvo@novell.com
ReportedBy: uli@novell.com
QAContact: ke@novell.com

uli@daubechies:~> LANG=de_DE.UTF-8 gcal -q MC -n --today

Ewige Feiertagsliste: Das Jahr 2007 ist KEIN Schaltjahr

Neujahr (MC) + Mo, 1 Jan 2007 = -56 Tage
Fasching/Fastnacht (MC) * Di, 20 Feb 2007 = -6 Tage
St D�ote Day (MC) + Mo, 26 Feb 2007
St D�ote Day (MC) + Di, 27 Feb 2007 = +1 Tag
Mi-Car�e Day (MC) * Di, 20 Mä 2007 = +22 Tage
Gründonnerstag (MC) * Do, 5 Apr 2007 = +38 Tage
[...]

"St Dévote Day" is encoded in iso-8859-1, whereas "Gründonnerstag" is UTF-8 as expected. Using fr_FR.UTF-8, only "Mi-Carême Day" and "St Dévote Day" are in iso-8859-1, all other days are encoded correctly.

I would assume this is an encoding error in the message files.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
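The symptom in the report can be reproduced without gcal. The following Python sketch is not part of the original report; it only assumes that one holiday name reaches the terminal as ISO-8859-1 bytes and the other as UTF-8 bytes, and that the terminal substitutes U+FFFD for bytes it cannot decode. The helper name `shown_on_utf8_terminal` is made up for this illustration.

```python
# Illustration only: mimic a UTF-8 terminal receiving mixed encodings.
latin1_bytes = "St Dévote Day".encode("iso-8859-1")  # é is the single byte 0xE9
utf8_bytes = "Gründonnerstag".encode("utf-8")        # ü is the two bytes 0xC3 0xBC


def shown_on_utf8_terminal(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for bytes that are not valid UTF-8."""
    return raw.decode("utf-8", errors="replace")


print(shown_on_utf8_terminal(latin1_bytes))  # the é is replaced by U+FFFD
print(shown_on_utf8_terminal(utf8_bytes))    # displayed correctly
```

How many characters a real terminal or mail client swallows after the invalid byte can differ from what Python does here.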
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #1 from mfabian@novell.com 2007-02-26 06:40 MST -------
gcal contains non-ASCII msgids.
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #2 from mfabian@novell.com 2007-02-26 07:16 MST -------
“Gründonnerstag” has an ASCII msgid:

#: src/hd-data.c:1584
msgid "Maundy Thursday"
msgstr "Gründonnerstag"

“St Dévote Day” has a non-ASCII msgid:

#: src/hd-data.c:1912
msgid "St Dévote Day"
msgstr "Sainte-Dévote"

None of the non-ASCII msgids work currently; their translations are never used.

In French the translation of “St Dévote Day” is fuzzy and in German it is translated. In both cases the msgid is used and not the translation. And the msgid is used “as is”, i.e. no encoding conversion is done.
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #3 from mfabian@novell.com 2007-02-26 07:18 MST -------
A test with

gettext gcal "St Dévote Day"

shows that the German translation “Sainte-Dévote” is never found, no matter whether the above command is run in de_DE or de_DE.UTF-8 locale, i.e. no matter what input encoding for the message to translate is used.
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #4 from kssingvo@novell.com 2007-02-26 07:24 MST -------
Thanks Mike. But I already investigated it, and noticed that the string encoding in Monaco's holidays is done as properly as in France's holidays. I think we only need to fix it there.
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #5 from mfabian@novell.com 2007-02-26 08:15 MST -------
It is an interesting question whether gettext can cope with non-ASCII msgids at all. I know that it is better to avoid non-ASCII msgids, but I didn't know that they don't work at all in the current implementation of gettext.

For example

LC_ALL=en_US.UTF-8 msgunfmt /usr/share/locale/de/LC_MESSAGES/gcal.mo > /dev/null

shows *warnings* like this:

read-mo.c:236: warning: The following msgid contains non-ASCII characters.
This will cause problems to translators who use a character encoding different from yours. Consider using a pure ASCII msgid instead.
Switzerland/Z�rich

As this is a *warning* and not an *error*, it seems to suggest that doing this can work if you know what you are doing.

Of course it may cause problems to translators using different encodings for their .po files. For example, choosing ISO-8859-1 encoding for msgids would make translations into languages which cannot be encoded in ISO-8859-1 impossible (i.e. it would make translation into Czech, Japanese, ... impossible).

But if one used UTF-8 for the msgids, the above problem vanishes and translation into all languages would still be possible *if* all translators used UTF-8 for their .po files and *if* gettext could handle the non-ASCII msgids.

But apparently gettext cannot do this at all.

If this doesn't work at all with gettext, gettext should print an error and not a warning. But I think with UTF-8 this could work just fine, and therefore it would probably be best to fix gettext to handle non-ASCII msgids correctly.
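The point about ISO-8859-1 ruling out some target languages can be checked with a short Python sketch. It is an illustration only; the Czech and Japanese sample strings are chosen for this example and are not taken from gcal's catalogs, and `can_encode` is a made-up helper name.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample strings are assumptions made for this illustration.
samples = {
    "German": "Gründonnerstag",
    "Czech": "Zelený čtvrtek",
    "Japanese": "聖木曜日",
}

for language, text in samples.items():
    print(language,
          "iso-8859-1:", can_encode(text, "iso-8859-1"),
          "utf-8:", can_encode(text, "utf-8"))
```

Only the German sample fits into ISO-8859-1, while all three fit into UTF-8, which is why a catalog tied to ISO-8859-1 could not carry the other two translations.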
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #6 from mfabian@novell.com 2007-02-27 07:49 MST -------
Created an attachment (id=121310)
 --> (https://bugzilla.novell.com/attachment.cgi?id=121310&action=view)
bug-248859-iso-8859-1.po

I tested with the attached small sample .po file and it worked just fine.
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #7 from mfabian@novell.com 2007-02-27 08:16 MST -------
I did the following to test it:

mfabian@magellan:/tmp/bug-248859$ msgfmt bug-248859-iso-8859-1.po -o bug-248859-iso-8859-1.mo
mfabian@magellan:/tmp/bug-248859$

mfabian@magellan:/usr/share/locale/de/LC_MESSAGES$ ll bug-248859-iso-8859-1.mo
lrwxrwxrwx 1 root root 40 27. Feb 15:35 bug-248859-iso-8859-1.mo -> /tmp/bug-248859/bug-248859-iso-8859-1.mo
mfabian@magellan:/usr/share/locale/de/LC_MESSAGES$ gettext bug-248859-iso-8859-1 "St Dévote Day"
Sainte-Dévotemfabian@magellan:/usr/share/locale/de/LC_MESSAGES$

This looks OK.
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #8 from mfabian@novell.com 2007-02-27 09:22 MST -------
It's strange that the simple test, which works fine with my test .po file, fails with the .po file from gcal:

mfabian@magellan:/usr/share/locale/de/LC_MESSAGES$ LANG=de_DE gettext bug-248859-iso-8859-1 "St Dévote Day"
Sainte-Dévotemfabian@magellan:/usr/share/locale/de/LC_MESSAGES$ LANG=de_DE gettext gcal "St Dévote Day"
St Dévote Daymfabian@magellan:/usr/share/locale/de/LC_MESSAGES$
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #9 from mfabian@novell.com 2007-02-27 10:10 MST -------
Currently, gcal doesn't regenerate the .mo files during the build; it just uses the .gmo files found in the source tarball.

Regenerating the .mo files makes it work for non-ASCII msgids.

But it works perfectly only if the msgid has a translation. If there is no translation, the msgid will be printed “as is”, without any encoding conversion. As all the msgids in gcal are currently Latin1, this means that it will still be wrong in UTF-8 locales for all msgids which have no translations or are marked as fuzzy.

Therefore, regenerating the .mo files will fix the problem perfectly for the German translations because these are 100% complete.

But the problem Uli reported for the French translations will remain because the strings Uli mentioned are msgids where the translation is marked as fuzzy, therefore the msgid is printed “as is”.
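The difference between translated and untranslated msgids described above can be modelled in a few lines of Python. This is a toy model written for this thread, not GNU gettext's code: the `catalog` dictionary, the `lookup` function and the assumption that the catalog is ISO-8859-1 are all hypothetical stand-ins.

```python
# Toy model of the described behaviour; NOT GNU gettext's implementation.
PO_CHARSET = "iso-8859-1"  # assumption: msgids and msgstrs are stored as Latin-1

# Hypothetical catalog holding only the one translated entry.
catalog = {
    "St Dévote Day".encode(PO_CHARSET): "Sainte-Dévote".encode(PO_CHARSET),
}


def lookup(msgid: bytes, locale_charset: str) -> bytes:
    msgstr = catalog.get(msgid)
    if msgstr is None:
        # Untranslated or fuzzy entry: the msgid comes back untouched.
        return msgid
    # Translated entry: the msgstr is converted to the locale's encoding.
    return msgstr.decode(PO_CHARSET).encode(locale_charset)


translated = lookup("St Dévote Day".encode(PO_CHARSET), "utf-8")
untranslated = lookup("Mi-Carême Day".encode(PO_CHARSET), "utf-8")

print(translated.decode("utf-8"))                      # valid UTF-8
print(untranslated.decode("utf-8", errors="replace"))  # still Latin-1 bytes
```

In this model the translated string arrives as valid UTF-8, while the untranslated msgid keeps its Latin-1 bytes and shows a replacement character on a UTF-8 terminal.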
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #10 from mfabian@novell.com 2007-02-27 10:23 MST -------
I submitted this change to STABLE:

-------------------------------------------------------------------
Tue Feb 27 18:14:31 CET 2007 - mfabian@suse.de

- Bugzilla #248859: fix part of the problem (for all translated messages) by regenerating the .mo files. The encoding problem still exists for all msgids which are non-ASCII *and* which are untranslated or fuzzy.
-------------------------------------------------------------------
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #11 from mfabian@novell.com 2007-02-27 10:23 MST -------
mfabian@magellan:/var/tmp/abuild/x86_64$ LANG=de_DE.UTF-8 gcal -q MC -n --today

Ewige Feiertagsliste: Das Jahr 2007 ist KEIN Schaltjahr

Neujahr (MC) + Mo, 1 Jan 2007 = -57 Tage
Fasching/Fastnacht (MC) * Di, 20 Feb 2007 = -7 Tage
Sainte-Dévote (MC) + Mo, 26 Feb 2007 = -1 Tag
Sainte-Dévote (MC) + Di, 27 Feb 2007
Mi-Carême (MC) * Di, 20 Mä 2007 = +21 Tage
Gründonnerstag (MC) * Do, 5 Apr 2007 = +37 Tage
Karfreitag (MC) * Fr, 6 Apr 2007 = +38 Tage
Ostersonntag (MC) + So, 8 Apr 2007 = +40 Tage
Ostermontag (MC) + Mo, 9 Apr 2007 = +41 Tage
Tag der Arbeit (MC) + Di, 1 Mai 2007 = +63 Tage
Christi Himmelfahrt (MC) * Do, 17 Mai 2007 = +79 Tage
Pfingstsonntag (MC) + So, 27 Mai 2007 = +89 Tage
Pfingstmontag (MC) + Mo, 28 Mai 2007 = +90 Tage
Fronleichnam (MC) + Do, 7 Jun 2007 = +100 Tage
Mariä Himmelfahrt (MC) * Mi, 15 Aug 2007 = +169 Tage
Allerheiligen (MC) + Do, 1 Nov 2007 = +247 Tage
Nationalfeiertag (MC) + So, 18 Nov 2007 = +264 Tage
Nationalfeiertag (MC) + Mo, 19 Nov 2007 = +265 Tage
Mariä Empfängnis (MC) + Sa, 8 Dez 2007 = +284 Tage
Heiligabend (MC) * Mo, 24 Dez 2007 = +300 Tage
1'ter Weihnachtstag (MC) + Di, 25 Dez 2007 = +301 Tage
Silvester/Neujahrsvorabend (MC) * Mo, 31 Dez 2007 = +307 Tage

mfabian@magellan:/var/tmp/abuild/x86_64$ LANG=fr_FR.UTF-8 gcal -q MC -n --today

Liste permanente des jours de fête: L'année 2007 N'EST PAS une année bissextile

Jour de l'An (MC) + Lu, 1 Jan 2007 = -57 jours
Mardi Gras (MC) * Ma, 20 Fé 2007 = -7 jours
St D�vote Day (MC) + Lu, 26 Fé 2007 = -1 jour
St D�vote Day (MC) + Ma, 27 Fé 2007
Mi-Car�me Day (MC) * Ma, 20 Mar 2007 = +21 jours
Jeudi Saint (MC) * Je, 5 Avr 2007 = +37 jours
Vendredi Saint (MC) * Ve, 6 Avr 2007 = +38 jours
Dimanche de Pâques (MC) + Di, 8 Avr 2007 = +40 jours
Lundi de Pâques (MC) + Lu, 9 Avr 2007 = +41 jours
Fête du Travail (MC) + Ma, 1 May 2007 = +63 jours
Ascension du Christ (MC) * Je, 17 May 2007 = +79 jours
Dimanche de la Pentecôte (MC) + Di, 27 May 2007 = +89 jours
Lundi de la Pentecôte (MC) + Lu, 28 May 2007 = +90 jours
Fête de Corpus Christi (MC) + Je, 7 Jui 2007 = +100 jours
Ascension de la Vierge (MC) * Me, 15 Ao� 2007 = +169 jours
Toussaint (MC) + Je, 1 Nov 2007 = +247 jours
Fête Nationale (MC) + Di, 18 Nov 2007 = +264 jours
Fête Nationale (MC) + Lu, 19 Nov 2007 = +265 jours
Immaculée Conception (MC) + Sa, 8 Dé 2007 = +284 jours
Veille de Noël (MC) * Lu, 24 Dé 2007 = +300 jours
Fête de Noël (MC) + Ma, 25 Dé 2007 = +301 jours
Sylvester/New Year's Eve (MC) * Lu, 31 Dé 2007 = +307 jours

mfabian@magellan:/var/tmp/abuild/x86_64$
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #12 from mfabian@novell.com 2007-02-27 10:35 MST -------
To fix the remaining problem for French and the other languages with untranslated and fuzzy messages, there are 2 solutions:

1) translate all languages 100%

2) make gettext convert the encoding of untranslated msgids as well.

Currently gettext does not do this because it assumes that the encoding of the msgid is not known. The msgid is merged into the .po file from the program source code and does not have to have the same encoding as the .po file.

But using such mixed encodings is stupid anyway. If one really wants to use non-ASCII msgids, one should make sure that the program source code and *all* .po files stick to the same encoding, preferably UTF-8. Any other encoding will work as well, as long as all .po files and the program source code use the same encoding. But of course encodings other than UTF-8 will make it impossible to add translations for languages which cannot be written using that encoding.

Therefore, I think it is reasonable for gettext to assume that the encoding of the msgid is the same as the encoding of the .po file and to try to convert from that encoding to the locale encoding when the msgid is printed (just as it is already done for the msgstr).

I cannot see any disadvantages in doing this:

If the msgid was in the same encoding as the .po file, everything will work well. And this is the most likely and useful case (everything UTF-8).

If the msgid was in a different encoding after all, the conversion might fail and one might need to fall back to printing the msgid “as is”. Or the conversion might succeed but the result might be garbage nevertheless. That is pretty much the situation we have now, therefore I think we would only gain from such an improvement in gettext.
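The proposal in solution 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a patch for gettext: the function name `convert_msgid` and its parameters are invented here, and the real change would have to be made in gettext's C code.

```python
def convert_msgid(msgid: bytes, po_charset: str, locale_charset: str) -> bytes:
    """Treat an untranslated msgid as being in the .po file's encoding.

    Hypothetical helper: converts to the locale encoding and falls back to
    returning the msgid "as is" when the conversion fails.
    """
    try:
        return msgid.decode(po_charset).encode(locale_charset)
    except (UnicodeDecodeError, UnicodeEncodeError):
        return msgid


latin1_msgid = "Mi-Carême Day".encode("iso-8859-1")
utf8_msgid = "Mi-Carême Day".encode("utf-8")

# Consistent encodings: the msgid is converted correctly.
print(convert_msgid(latin1_msgid, "iso-8859-1", "utf-8").decode("utf-8"))

# Latin-1 msgid in a catalog declared as UTF-8: conversion fails, "as is".
print(convert_msgid(latin1_msgid, "utf-8", "utf-8"))

# UTF-8 msgid in a catalog declared as Latin-1: succeeds, but is garbage.
print(convert_msgid(utf8_msgid, "iso-8859-1", "utf-8").decode("utf-8"))
```

The three calls correspond to the three cases discussed in the comment: matching encodings work, a failed conversion leaves the msgid unchanged, and a mismatched but decodable msgid yields garbage, which is no worse than the current behaviour.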
https://bugzilla.novell.com/show_bug.cgi?id=248859

------- Comment #13 from mfabian@novell.com 2007-02-27 10:53 MST -------
I created a new bug for the gettext issue, see bug #249431.
https://bugzilla.novell.com/show_bug.cgi?id=248859

mfabian@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #14 from mfabian@novell.com 2007-02-27 10:54 MST -------
Closing this bug as fixed.