Tags: php, regex, csv, po

Export a .po file to .csv


I am looking for a simple way to create an Excel or CSV file from a .po localization file.

I couldn't find one via Google, so I'm thinking of writing it myself in PHP. The PO file has the following structure:

msgid "Titre"
msgstr "Titre"

So I guess my PHP script needs to parse the .po file, looking for "the first bit of text between double quotes after each occurrence of the keyword msgstr".

I assume that's a job for a regex, so I tried the following, but it does not return anything:

$po_file = '/path/to/messages.po';

if(!is_file($po_file)){
    die("you got the filepath wrong dude.");
}

$str = file_get_contents($po_file);
// find all occurrences of msgstr "SOMETHING"
preg_match('@^msgstr "([^/]+)"@i', $str, $matches);
$msgstr = $matches[1];

var_dump($msgstr);

Solution

  • There is a nice PEAR library for this: File_Gettext.

    If you look at its source, File/Gettext/PO.php, you will find the regex pattern you need:

    $matched = preg_match_all('/msgid\s+((?:".*(?<!\\\\)"\s*)+)\s+' .
                              'msgstr\s+((?:".*(?<!\\\\)"\s*)+)/',
                              $contents, $matches);
    
    for ($i = 0; $i < $matched; $i++) {
        $msgid = substr(rtrim($matches[1][$i]), 1, -1);
        $msgstr = substr(rtrim($matches[2][$i]), 1, -1);
    
        $this->strings[parent::prepare($msgid)] = parent::prepare($msgstr);
    }
    

    Or just use the PEAR library directly:

    include 'File/Gettext/PO.php';
    
    $po = new File_Gettext_PO();
    $po->load($po_file); // path to the .po file, as defined in the question
    $poArray = $po->toArray();
    
    foreach ($poArray['strings'] as $msgid => $msgstr) {
        // write your csv as you like...
    }
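If you would rather not install PEAR, here is a minimal standalone sketch of the same idea that goes all the way to CSV. It is not the File_Gettext code: the helper names (po_unquote, po_to_pairs, pairs_to_csv) are made up for illustration, the pattern is a simplified variant of the one above, and it assumes plain msgid/msgstr entries only (plural forms and msgctxt are skipped). It also hints at why the attempt in the question probably returns nothing: without the m modifier, ^ only matches at the very start of the file, and preg_match stops after the first match anyway.

```php
<?php
// Sketch only: helper names are hypothetical, not part of any library.

// Join the quoted chunks of one PO string ("a" "b" ...) and unescape them.
function po_unquote(string $raw): string
{
    preg_match_all('/"((?:[^"\\\\]|\\\\.)*)"/', $raw, $m);
    return stripcslashes(implode('', $m[1]));
}

// Extract [msgid, msgstr] pairs from the text of a .po file.
function po_to_pairs(string $contents): array
{
    // One or more double-quoted chunks, each allowing backslash escapes.
    $string  = '((?:"(?:[^"\\\\]|\\\\.)*"\s*)+)';
    // The m modifier makes ^ match at the start of every line.
    $pattern = '/^msgid\s+' . $string . '^msgstr\s+' . $string . '/m';
    preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER);

    $pairs = [];
    foreach ($matches as $match) {
        $msgid = po_unquote($match[1]);
        if ($msgid === '') {
            continue; // the entry with an empty msgid is the PO header
        }
        $pairs[] = [$msgid, po_unquote($match[2])];
    }
    return $pairs;
}

// Render the pairs as CSV text, with a header row.
function pairs_to_csv(array $pairs): string
{
    $fh = fopen('php://memory', 'w+');
    // Explicit empty escape character (PHP 7.4+) gives plain RFC 4180 quoting.
    fputcsv($fh, ['msgid', 'msgstr'], ',', '"', '');
    foreach ($pairs as $pair) {
        fputcsv($fh, $pair, ',', '"', '');
    }
    rewind($fh);
    $csv = stream_get_contents($fh);
    fclose($fh);
    return $csv;
}

// Made-up sample input, standing in for file_get_contents($po_file).
$sample = <<<'PO'
msgid ""
msgstr "Content-Type: text/plain; charset=UTF-8\n"

msgid "Title"
msgstr "Titre"

msgid "Hello, world"
msgstr ""
"Bonjour, "
"le monde"
PO;

echo pairs_to_csv(po_to_pairs($sample));
```

To write a real file, replace the sample with file_get_contents($po_file) and pass the result to file_put_contents() with whatever output path you like. Letting fputcsv do the quoting means commas, quotes and newlines inside translations do not break the columns.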