Search code examples
phplaravelvalidation

Import validated tab delimited file to MySQL with Laravel


Im trying to upload a tab delimited text file. The issue could be that the columns is not always in the same order. Once uploaded id like to do the following steps:

  1. Validate column names. If validation passes -> 2
  2. Read line by line in the file (preferably skip column names) and validate some of the values. If a line passes validation -> 3.
  3. Create a model object and store it in an array for later bulk insertion.
  4. Repeat for all lines in the file.
  5. When all lines are done and everything is validated, bulk insert all the objects.

So far i have done the upload part and tried out some different solutions. But I'm pretty much stuck right now. I'll paste my code from the controller (keep in mind that there is 40 columns in the file i have just written some of them):

public function store()
{

    $file = Input::file('file');

    $rules = array(
        'file' => 'required|mimes:txt'
    );

    $validator = Validator::make(array('file'=> $file), $rules);

    if($validator->passes()){

        foreach(file($file) as $row) {

            $row = explode("\t", $row);

            $validator = Validator::make(array(
                'X'=> $row[0],
                'Y'=> $row[1],
                'year'=> $row[2],
                'provnr'=> $row[3],
                'id'=> $row[4]

            ), Sample::$insertRules);

            if($validator->passes()){

                $sample = New Sample;
                $sample->X                  = $row[0];
                $sample->Y                  = $row[1];
                $sample->year               = $row[2];
                $sample->provnr             = $row[3];
                $sample->costumer_id        = $row[4];

                $sample->save();

            } else {

                Session::flash('notice', 'Something is wrong!');
                Session::flash('sample', $row[3]);

                return Redirect::to('import');
            }

        }

        exit;

        Session::flash('success', 'Upload successfully');
        return Redirect::to('import');
    }
    else {
        // redirect back with errors.
        return Redirect::to('import')->withInput()->withErrors($validator);
    }

}

This works fine (no bulk insertion yet, ill tend to that later), but i'm not really fond of it, i think theres a better solution. Don't mind the validation, it's not really finished yet.

What i end up with, if i print_r all the rows you can find below this text. Right now its kind of hard since the column names is the first array. Somewhere along the line id like to remove it. Since it won't pass the other rows validation.

Array ( [0] => x [1] => y [2] => År [3] => Provnr ) 
Array ( [0] => 1315903.24 [1] => 6213877.72 [2] => 2014 [3] => 223 )
Array ( [0] => 1315819.62 [1] => 6213937.42 [2] => 2014 [3] => 224 )

So, i'm interested in you guys thoughts about this. Have you got any ideas?

I think it would be better to in some way convert the array, to objects. So i get something like a database result. I just don't know how. I would like to be able to write $row['X'] in my foreach loop. I think that would be much better. Is there some kind of way to make that possible? The column names would be each objects identification.

Im really thankful for you're help!

Edit:

So now the issue lies in the special characters of my file. I have made some edits according to Bogdans comments. I have included the full $columnMap The code now looks like this:

    $file = Input::file('file');


    $rules = array(
        'file' => 'required|mimes:txt'
    );

    $validator = Validator::make(array('file'=> $file), $rules);

    if($validator->passes()){

        $columns = [];

        $columnMap = [
            'x'                         =>      'X',
            'y'                         =>      'Y',
            'Ar'                        =>      'Year',
            'Provnr'                    =>      'Provnr',
            'Markning'                  =>      'costumer_id',
            'pH'                        =>      'pH',
            'P_AL'                      =>      'P_AL',
            'P_HCl'                     =>      'P_HCl',
            'K_AL'                      =>      'K_AL',
            'K_HCl'                     =>      'K_HCl',
            'Mg_AL'                     =>      'Mg_AL',
            'Cu_HCl'                    =>      'Cu_HCl',
            'K_Mg_kvot'                 =>      'K_Mg_kvot',
            'Bor'                       =>      'Bor',
            'Ca_AL'                     =>      'Ca_AL',
            'fe'                        =>      'fe',
            'al'                        =>      'al',
            'Mullhalt'                  =>      'Mullhalt',
            'Total_lerhalt'             =>      'Total_lerhalt',
            'Sand_grovmo'               =>      'Sand_grovmo',
            'Volymvikt'                 =>      'Volymvikt',
            'T_varde'                   =>      'T_värde',
            'S_varde'                   =>      'S_värde',
            'Basmattnadsgrad'           =>      'Basmättnadsgrad',
            'Cd_HNO3'                   =>      'Cd_HNO3',
            'Kalkbehov'                 =>      'Kalkbehov',
            'Jordart'                   =>      'Jordart',
            'Fin_lerhalt'               =>      'Fin_lerhalt',
            'Zn'                        =>      'Zn',
            'Cu'                        =>      'Cu',
            'Cr'                        =>      'Cr',
            'Ni'                        =>      'Ni',
            'Pb'                        =>      'Pb',
            'Hg'                        =>      'Hg',
            'Mineralkvave_Kg_N_ha'      =>      'Mineralkväve_Kg_N_ha',
            'Mineralkvave_NO3_N'        =>      'Mineralkväve_NO3_N',
            'Mineralkvave_NH4_N'        =>      'Mineralkväve_NH4_N',
            'Mineralkvave_djup'         =>      'Mineralkväve_djup',
            'Cystnematoder_potatis'     =>      'Cystnematoder_potatis',
            'Cystnematoder_betor'       =>      'Cystnematoder_betor',
            'Cystnematoder_spannmal'    =>      'Cystnematoder_spannmål',
            'Ovrigt'                    =>      'Övrigt'
        ];

        foreach(file($file) as $i => $row)
        {
            $row = explode("\t", $row);

            if($i == 0)
            {
                $columns = $row;

                array_walk($columns, function (&$item)
                {
                    $item = str_replace(
                        ['ä', 'å', 'ö', 'Ä', 'Å', 'Ö'],
                        ['a', 'a', 'o', 'A', 'A', 'O'],
                        utf8_encode($item)
                    );
                });

                continue;
            }

            $_row = array();
            array_walk($row, function ($value, $index) use (&$_row, $columns, $columnMap)
            {
                $_row[$columnMap[$columns[$index]]] = $value; //*The issue is here.*
            });
            $row = $_row;

            $validator = Validator::make($row, Sample::$insertRules);

            if($validator->passes()){

                $sample = New Sample;

                foreach ($row as $property => $value)
                    $sample->{$property} = $value;

                $sample->save();
            } else
            {
                Session::flash('notice', 'Something is wrong!');

                return Redirect::to('import');
            }
        }

I had to utf8_encode the header of the file, otherwise it wouldn't work. Could this be some issue with the text file? It seems to be working until it is time to do array_walk on the row with values. I then get this error: Undefined index: Ovrigt

If I var_dump $columns it looks like this:

array(42) { [0]=> string(1) "x" [1]=> string(1) "y" [2]=> string(2) "Ar" [3]=> string(6) "Provnr" [4]=> string(8) "Markning" [5]=> string(2) "pH" [6]=> string(4) "P_AL" [7]=> string(5) "P_HCl" [8]=> string(4) "K_AL" [9]=> string(5) "K_HCl" [10]=> string(5) "Mg_AL" [11]=> string(6) "Cu_HCl" [12]=> string(9) "K_Mg_kvot" [13]=> string(3) "Bor" [14]=> string(5) "Ca_AL" [15]=> string(2) "fe" [16]=> string(2) "al" [17]=> string(8) "Mullhalt" [18]=> string(13) "Total_lerhalt" [19]=> string(11) "Sand_grovmo" [20]=> string(9) "Volymvikt" [21]=> string(7) "T_varde" [22]=> string(7) "S_varde" [23]=> string(15) "Basmattnadsgrad" [24]=> string(7) "Cd_HNO3" [25]=> string(9) "Kalkbehov" [26]=> string(7) "Jordart" [27]=> string(11) "Fin_lerhalt" [28]=> string(2) "Zn" [29]=> string(2) "Cu" [30]=> string(2) "Cr" [31]=> string(2) "Ni" [32]=> string(2) "Pb" [33]=> string(2) "Hg" [34]=> string(20) "Mineralkvave_Kg_N_ha" [35]=> string(18) "Mineralkvave_NO3_N" [36]=> string(18) "Mineralkvave_NH4_N" [37]=> string(17) "Mineralkvave_djup" [38]=> string(21) "Cystnematoder_potatis" [39]=> string(19) "Cystnematoder_betor" [40]=> string(22) "Cystnematoder_spannmal" [41]=> string(8) "Ovrigt " }

If var_dump $columnMap it looks like this:

array(42) { ["x"]=> string(1) "X" ["y"]=> string(1) "Y" ["Ar"]=> string(4) "Year" ["Provnr"]=> string(6) "Provnr" ["Markning"]=> string(11) "costumer_id" ["pH"]=> string(2) "pH" ["P_AL"]=> string(4) "P_AL" ["P_HCl"]=> string(5) "P_HCl" ["K_AL"]=> string(4) "K_AL" ["K_HCl"]=> string(5) "K_HCl" ["Mg_AL"]=> string(5) "Mg_AL" ["Cu_HCl"]=> string(6) "Cu_HCl" ["K_Mg_kvot"]=> string(9) "K_Mg_kvot" ["Bor"]=> string(3) "Bor" ["Ca_AL"]=> string(5) "Ca_AL" ["fe"]=> string(2) "fe" ["al"]=> string(2) "al" ["Mullhalt"]=> string(8) "Mullhalt" ["Total_lerhalt"]=> string(13) "Total_lerhalt" ["Sand_grovmo"]=> string(11) "Sand_grovmo" ["Volymvikt"]=> string(9) "Volymvikt" ["T_varde"]=> string(8) "T_värde" ["S_varde"]=> string(8) "S_värde" ["Basmattnadsgrad"]=> string(16) "Basmättnadsgrad" ["Cd_HNO3"]=> string(7) "Cd_HNO3" ["Kalkbehov"]=> string(9) "Kalkbehov" ["Jordart"]=> string(7) "Jordart" ["Fin_lerhalt"]=> string(11) "Fin_lerhalt" ["Zn"]=> string(2) "Zn" ["Cu"]=> string(2) "Cu" ["Cr"]=> string(2) "Cr" ["Ni"]=> string(2) "Ni" ["Pb"]=> string(2) "Pb" ["Hg"]=> string(2) "Hg" ["Mineralkvave_Kg_N_ha"]=> string(21) "Mineralkväve_Kg_N_ha" ["Mineralkvave_NO3_N"]=> string(19) "Mineralkväve_NO3_N" ["Mineralkvave_NH4_N"]=> string(19) "Mineralkväve_NH4_N" ["Mineralkvave_djup"]=> string(18) "Mineralkväve_djup" ["Cystnematoder_potatis"]=> string(21) "Cystnematoder_potatis" ["Cystnematoder_betor"]=> string(19) "Cystnematoder_betor" ["Cystnematoder_spannmal"]=> string(23) "Cystnematoder_spannmål" ["Ovrigt"]=> string(7) "Övrigt" }

If i var_dump the first $row with values it looks like this:

array(42) { [0]=> string(10) "1315903.24" [1]=> string(10) "6213877.72" [2]=> string(4) "2014" [3]=> string(3) "223" [4]=> string(4) "6510" [5]=> string(3) "6.8" [6]=> string(4) "10.0" [7]=> string(0) "" [8]=> string(3) "9.5" [9]=> string(0) "" [10]=> string(4) "12.0" [11]=> string(0) "" [12]=> string(3) "0.8" [13]=> string(0) "" [14]=> string(5) "150.0" [15]=> string(0) "" [16]=> string(0) "" [17]=> string(0) "" [18]=> string(0) "" [19]=> string(0) "" [20]=> string(0) "" [21]=> string(0) "" [22]=> string(0) "" [23]=> string(0) "" [24]=> string(0) "" [25]=> string(0) "" [26]=> string(0) "" [27]=> string(0) "" [28]=> string(0) "" [29]=> string(0) "" [30]=> string(0) "" [31]=> string(0) "" [32]=> string(0) "" [33]=> string(0) "" [34]=> string(0) "" [35]=> string(0) "" [36]=> string(0) "" [37]=> string(0) "" [38]=> string(0) "" [39]=> string(0) "" [40]=> string(0) "" [41]=> string(12) "J038790-14 " }

Right now i have no idea what could be the issue. Is there something else i can post so you guys get a better understanding? Since "Ovrigt" is last, it seems to work fine up until that one. A wierd thing is that there seems to be a blank space in "Ovrigt " when i var_dump $columns.. Could that be it?


Solution

  • Assuming the first line of the file always contains the columns names, here's a quick approach to accomplish that:

    // Define the array that will contain the columns names
    $columns = [];
    
    foreach($rows as $i => $row)
    {
        $row = explode("\t", $row);
    
        // For the first row get the get the column names
        if ($i == 0)
        {
            $columns = $row;
            continue; // Skip this iteration
        }
    
        // Map each column name to every item in the row
        $_row = array();
        array_walk($row, function ($value, $index) use (&$_row, $columns)
        {
            $_row[$columns[$index]] = $value;
        });
        $row = $_row;
    
        // You can now access row items with
        // $row['X'], $row['Y'], etc.
        ...
    }
    

    If you know exactly that all columns will be present in the file (regardless of order), you can go a step further and map the column name from the file to the coresponding model property name. This will make it even more seamless to validate and populate each model:

    // Define the array that will contain the columns names
    $columns = [];
    
    // Define the column mapping
    // 'column name' => 'model property name'
    $columnMap = [
        'X'      => 'X',
        'Y'      => 'Y',
        'Ar'     => 'year',
        'Provnr' => 'provnr',
        'id'     => 'customer_id'
    ];
    
    foreach($rows as $i => $row)
    {
        // Assuming the column names are always on the first row
        if ($i == 0)
        {
            $columns = $row;
    
            // Replace Swedish special chars with simple Latin chars
            array_walk($columns, function (&$item)
            {
                $item = str_replace(
                    ['ä', 'å', 'ö', 'Ä', 'Å', 'Ö'],
                    ['a', 'a', 'o', 'A', 'A', 'O'],
                    trim($item)
                );
            });
            continue; // Skip this iteration
        }
    
        // Map each column name to every item in the row
        $_row = array();
        array_walk($row, function ($value, $index) use (&$_row, $columns, $columnMap)
        {
            $_row[$columnMap[$columns[$index]]] = $value;
        });
        $row = $_row;
    
        // Pass the $row directly to the validator because the mapped keys will work
        $validator = Validator::make($row, Sample::$insertRules);
    
        if($validator->passes()){
    
            $sample = New Sample;
    
            // You can iterate over the row and assign each value automatically
            // because each item key is mapped to a property name
            foreach ($row as $property => $value)
                $sample->{$property} = $value;
    
            $sample->save();
        }
        else
        {
            Session::flash('notice', 'Something is wrong!');
            Session::flash('sample', $row[3]);
    
            return Redirect::to('import');
        }
    }