I've tried to create an ICU4C file from a gettext .po file with a sed
script like this:
/^#/ d # delete comments
:a;/"$/{N;s/"\n"//;ba} # merge quoted lines in loop
/^msgid /s/msgid (.*)/\1/ # convert msgids
s/msgstr "(.*)"/\{ "\1" }/ # convert msgstrs
and it already works pretty well (ignoring plural forms), but for some reason it doesn't convert the last msgid/msgstr pair, unless I don't merge the quotes twice. But then the syntax for the other stuff becomes wrong. Any ideas? It doesn't have to use sed.
Those ICU files are the only ones accepted by genrb, and I'd like to use the ResourceBundle in PHP.
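A likely explanation for the unconverted last pair, assuming GNU sed and a .po file that ends right after the last msgstr: when N runs on the last input line and there is no next line to append, GNU sed prints the pattern space and exits without running the remaining commands, so the two s commands never see the final entry. (With POSIXLY_CORRECT set, GNU sed discards the pattern space instead.) Guarding the command as $!N avoids this. A minimal demonstration, with made-up function names:

```shell
# Plain N: on the last line GNU sed prints the pattern space and quits,
# so the s command after N is skipped for that line.
demo_plain_n() {
  printf '1\n2\n3\n' | sed 'N;s/^/> /'
}

# Guarded $!N: N is skipped on the last line, so s still runs there.
demo_guarded_n() {
  printf '1\n2\n3\n' | sed '$!N;s/^/> /'
}
```

With GNU sed, the first function should leave the final line without the "> " prefix, while the second one prefixes it.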
I've accomplished my goal through a shell script. Here's the rough idea:
#!/usr/bin/env bash
# remove comments
sed -r -e '/^#/ d' < de.po >de.icu.txt
# merge strings
sed -i de.icu.txt -r -e ':L;/"$/{N;s/"\n"//;b L}'
# delete gettext header
sed -i -e '1,2 d' de.icu.txt
# convert into ICU format
sed -i de.icu.txt -r -e '
# delete untranslated
/msgid ".+"/{
N
/msgstr ""/{
N;s/msgid ".+"\nmsgstr ""\n//
}
}
# generate ICU txt
/msgid /s/msgid (.*)/\1/
s/msgstr "(.*)"/\{ "\1" }/'
sed -i -e '1i de {' -e '$ a\\n}' de.icu.txt
There's probably a nicer way, but it does the job.
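For what it's worth, the same result can probably be had in fewer passes. Here's a sketch, assuming GNU sed (for -E and for \n in patterns) and a .po file without plural forms or msgctxt. The name po_to_icu is made up for illustration. The main change to the merge loop is the $! guard around its body, so that N is never run on the last line and the conversions still apply to the last entry. A bare $!N inside the loop would not be enough, since the loop would then spin forever on a final line ending in a quote.

```shell
# po_to_icu LOCALE FILE: hypothetical helper, prints ICU text to stdout.
po_to_icu() {
  printf '%s {\n' "$1"
  # First pass drops comments; second pass merges and converts each entry.
  sed '/^#/d' "$2" | sed -E '
    # merge continuation strings; the $! guard keeps N off the last line
    :a
    /"$/{
      $!{
        N
        s/"\n"//
        ba
      }
    }
    # drop the gettext header (empty msgid)
    /^msgid ""\n/d
    # drop untranslated entries (empty msgstr)
    /\nmsgstr ""\n*$/d
    # msgid becomes the key, msgstr the value
    s/^msgid //
    s/msgstr "(.*)"/{ "\1" }/
  '
  printf '}\n'
}
```

Usage would be something like: po_to_icu de de.po > de.icu.txt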