
How to extract text between predefined strings in PHP


We're changing our translation system from GetText to a MySQL database. I want to put all the translation strings and translation IDs from the original ".po" file into the database.

For this I need to read the file and loop through each line, which is easy. The difficult part is that when I see "msgid" or "msgstr", I need to extract the data and insert it into the database.

The original file looks like this:

msgid "inactive_ad_detail_text"
msgstr "This ad doesn't exists"
msgid "breadcrumb_search"
msgstr "Search the site"
(... etc etc ...)

How can I extract the name of the ID (msgid) and the text (msgstr) between the quotation marks?

Also, I have some escaped text and some text that spans two lines, like:

msgid "question_fill_form"
msgstr ""
"Please fill the form"
"<br>All fields are mandatory"

or

msgid "offer_contact_error"
msgstr ""
"Error detected "
"please click \"<em>restart</em>\" on the right side."

I think I need to detect [msgid "] and then the last ["] quotation mark before the end of the line, but I really have no clue how to achieve this in PHP.

Thanks for your help, Lio


Solution

  • There is a library for this: PHP-po-parser.

    // Parse a po file
    $fileHandler = new Sepia\FileHandler('es.po');
    
    $poParser = new Sepia\PoParser($fileHandler);
    $entries  = $poParser->parse();
    // $entries contains every entry in es.po file.
    
    // Update entries
    $msgid = 'Press this button to save';
    $entries[$msgid]['msgstr'] = 'Pulsa este botón para guardar';
    $poParser->setEntry($msgid, $entries[$msgid]);
    // You can also change translator comments, code comments, flags...
    

    If you don't use Composer, you can include the files in the order shown below, or use an autoloader to load them.

    require_once('Sepia/InterfaceHandler.php');
    require_once('Sepia/StringHandler.php');
    require_once('Sepia/FileHandler.php');
    require_once('Sepia/PoParser.php');
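
If pulling in a library is not an option, the approach described in the question (match the keyword, then take everything up to the last quotation mark on the line) can be sketched in plain PHP. This is a minimal sketch, not a full PO parser: the function name `parsePoString` is made up for illustration, and it assumes the file only uses simple `msgid`/`msgstr` pairs, so comments, `msgctxt`, and plural forms are ignored.

```php
<?php
/**
 * Hypothetical helper: parse the text of a .po file into [msgid => msgstr].
 * Handles continuation lines ("..." on their own line) and backslash escapes.
 * Comments, msgctxt, and plural forms are skipped, not parsed.
 */
function parsePoString(string $po): array
{
    $entries = [];
    $current = ['msgid' => null, 'msgstr' => null];
    $target  = null; // 'msgid' or 'msgstr': where continuation lines are appended

    foreach (preg_split('/\R/', $po) as $line) {
        $line = trim($line);

        // Greedy (.*) followed by "$ captures up to the LAST quote on the line,
        // so escaped quotes inside the text stay part of the capture.
        if (preg_match('/^(msgid|msgstr)\s+"(.*)"$/', $line, $m)) {
            if ($m[1] === 'msgid') {
                // A new entry starts: store the previous one (skip the empty header id).
                if ($current['msgid'] !== null && $current['msgid'] !== '') {
                    $entries[$current['msgid']] = (string) $current['msgstr'];
                }
                $current = ['msgid' => '', 'msgstr' => ''];
            }
            $target = $m[1];
            $current[$target] = stripcslashes($m[2]); // turns \" into " and \n into a newline
        } elseif ($target !== null && preg_match('/^"(.*)"$/', $line, $m)) {
            // Continuation line: append to whichever keyword came last.
            $current[$target] .= stripcslashes($m[1]);
        } else {
            // Blank lines, comments, and unsupported keywords end a continuation.
            $target = null;
        }
    }

    // Store the final entry of the file.
    if ($current['msgid'] !== null && $current['msgid'] !== '') {
        $entries[$current['msgid']] = (string) $current['msgstr'];
    }

    return $entries;
}

// Usage with two of the entries from the question.
$po = <<<'PO'
msgid "breadcrumb_search"
msgstr "Search the site"
msgid "offer_contact_error"
msgstr ""
"Error detected "
"please click \"<em>restart</em>\" on the right side."
PO;

$entries = parsePoString($po);
```

Each key/value pair in `$entries` can then be written to the database, preferably through a PDO prepared statement rather than by concatenating the strings into SQL, since the translations contain quotes and HTML.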