On Fri, 5 Jul 2013 14:56:40 +0200 Frank Sundermeyer wrote:

Hi,

> The fact that the data is inconsistent also makes the msggrep output
> useless for your purpose.

Karl rightly pointed out that this is not true ;-), because there
are several syntax alternatives that would make my egrep fail.
So I had a second look at msggrep.
msggrep will not only normalize the data so it can reliably be parsed,
it can also be called with --no-wrap, which will put the whole msgstr
into a single line. That in turn will make sure my script works
even if the original data is spread over several lines.
The attached, modified script makes use of that, which will hopefully
make it a reliable solution for extracting and parsing the
needed data.
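
For anyone who wants to try the msggrep call on its own, here is a minimal Python sketch. It is not the attached script. The function names and the file name "de.po" are made up for illustration, and it assumes msggrep from GNU gettext is on the PATH. -K and -e restrict the match to the msgid, and --no-wrap is the option mentioned above.

```python
import subprocess


def msggrep_command(po_path, msgid_pattern="translator-credits"):
    """Build the msggrep argument list (hypothetical helper, illustration only)."""
    # -K: apply the following pattern to the msgid
    # -e: the pattern itself
    # --no-wrap: do not break long message lines in the output
    return ["msggrep", "--no-wrap", "-K", "-e", msgid_pattern, po_path]


def extract_credits_entry(po_path):
    """Run msggrep and return its normalized output as text."""
    result = subprocess.run(
        msggrep_command(po_path),
        capture_output=True,
        text=True,
        check=True,  # raise if msggrep exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Only print the command here; running it needs gettext and a real PO file.
    print(" ".join(msggrep_command("de.po")))
```

Calling extract_credits_entry("de.po") would then return the matching entry in normalized form, ready for line-based parsing.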
BUT (yes, of course there is a downside ;-)) it will definitely require
a separator at the end of each translator record - otherwise it will not
be possible to determine where a record begins or ends.
Another possibility would be to require each record to begin with a
4-digit year entry, but IMHO a separator would be more flexible.
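
To illustrate why the separator makes the records unambiguous, here is a minimal Python sketch of the splitting step. It is not the attached script. It assumes the msgstr has already been put on one line, that ";" is the delimiter, and that a literal \n inside the string can be treated as whitespace. The function name and the sample names are hypothetical.

```python
import re


def split_translator_records(msgstr_line, delimiter=";"):
    """Split a single-line msgstr entry into translator records."""
    match = re.match(r'^msgstr\s+"(.*)"\s*$', msgstr_line)
    if match is None:
        raise ValueError("not a single-line msgstr entry")
    # Treat the two-character escape sequence \n as plain whitespace.
    body = match.group(1).replace("\\n", " ")
    # Drop empty pieces, e.g. the one after the final delimiter.
    return [record.strip() for record in body.split(delimiter) if record.strip()]


if __name__ == "__main__":
    line = 'msgstr "Jane Doe, 2012 - 2013;\\nJohn Roe, 2011;"'
    for record in split_translator_records(line):
        print(record)
```

Without the trailing delimiter, the only hint for a record boundary would be the content itself (for example a leading year), which is the less flexible alternative mentioned above.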
The following data is correctly parsed by the attached script
(delimiter is ";"):
msgid "translator-credits"
msgstr "玛丽苏 ,2012 - 2013;"
"Sign Guo Yunhe "
"