php, csv, preg-match

PHP - converting a badly formatted txt file to CSV


I have a badly formatted text file that I would like to convert to CSV.

Here's an example:

100910 NA/1-2013-99636 VIA DEI PESCATORI 2/A LODI APR 8 2013 4:24PM DANNEGGIATO -10% 200 2700 0 0 NO
148013 NA/1-2014-146194 CAVALLOTTI SNC LODI GEN 3 2014 3:37PM DANNEGGIATO -10% 0 0 2 0 NO
160032 NA/1-2014-158129 PAOLO GORINI SNC LODI MAG 6 2014 11:51AM DANNEGGIATO -10% 2 0 2 0 NO
54900 NA/1-2014-158070 STRADA VECCHIA CREMONESE SNC LODI MAG 6 2014 9:53AM DANNEGGIATO +10% 10 0 10 0 NO
100910 NA/1-2013-99636 VIA DEI PESCATORI 2/A LODI APR 8 2013 4:24PM DANNEGGIATO -10% 200 2700 0 0 NO
147959 NA/1-2014-146140 DOSSENA SNC LODI GEN 3 2014 10:45AM DANNEGGIATO -10% 200 0 200 0 NO

Each line is roughly in this form:

[number] [id] [awfully formatted street] ['LODI'] [timestamp] [damaged or not] [percentage] [square meters] [square meters] [square meters] [square meters] [asbestos crumbled or not]

My problem is how to extract the third part, [awfully formatted street]. It is the string between [id] and ['LODI'], where the ['LODI'] that counts is the one immediately before [timestamp].

Should I explode() each line on spaces, traverse the array backwards past [timestamp] and ['LODI'], and then join the values that remain after [id], i.e. after array[1]? Or is there a smarter, more elegant way to do this, perhaps with preg_match()?

Thanks for any hint!


Solution

    <?php
        // Read the file line by line; a single sample line is used here
        $line = '148013 NA/1-2014-146194 CAVALLOTTI SNC LODI GEN 3 2014 3:37PM DANNEGGIATO -10% 0 0 2 0 NO';
    
        // Start by separating the string on LODI
        // (this assumes the street itself never contains "LODI")
        $lodi_split = explode('LODI', $line);
    
        // Now split the first part into an array on spaces;
        // trim() avoids an empty last element from the trailing space
        $bits = explode(' ', trim($lodi_split[0]));
    
        $address = '';
        // Start reading from element 2 to lose the first 2 fields
        for ($i = 2; $i < count($bits); $i++) {
            $address .= $bits[$i] . ' ';
        }
        // trim() drops the space appended after the last word
        echo trim($address) . PHP_EOL;
    

    Result is

    CAVALLOTTI SNC
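
    As for the preg_match() idea from the question, here is a minimal sketch of how it might look. The function name parse_line() is hypothetical, and the pattern is only inferred from the six sample lines: it assumes a three-letter uppercase month, a 12-hour time ending in AM or PM, a signed percentage, and single-word first, second and last fields. Because the street group is greedy and 'LODI' has to be followed by something that looks like a timestamp, a street that itself contains "LODI" should still come out whole. The captured fields are then written with fputcsv(), which takes care of quoting.

```php
<?php
// Hypothetical helper: split one report line into its 12 fields.
// Returns null when the line does not fit the assumed layout.
function parse_line(string $line): ?array
{
    $pattern = '/^(\d+)\s+(\S+)\s+(.+)\s+(LODI)\s+'
             . '([A-Z]{3}\s+\d{1,2}\s+\d{4}\s+\d{1,2}:\d{2}[AP]M)\s+'
             . '(.+?)\s+([+-]?\d+%)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)$/';

    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }

    // $m[0] is the whole match; the captured fields start at index 1
    return array_slice($m, 1);
}

$sample = '148013 NA/1-2014-146194 CAVALLOTTI SNC LODI GEN 3 2014 3:37PM DANNEGGIATO -10% 0 0 2 0 NO';

$fields = parse_line($sample);
if ($fields !== null) {
    // Write the fields as one CSV row to standard output
    $out = fopen('php://output', 'w');
    fputcsv($out, $fields, ',', '"', '\\');
    fclose($out);
}
```

    For a whole file, the same function could be called on each line returned by fgets(), with the lines that come back as null logged for manual checking rather than silently dropped.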