Search code examples
yii2yii2-basic-appphpoffice

Yii2 update data by excel file long load time


I'm using Yii2 to create a tool to manage work load of my team. So everyday i need to import large amount of data (larger than 5k) to db using Excel and the loadtime usually take about 20-30 mins. Is there any way to improve the load time?

Please help me with this.

Here's the code I used:

public function actionImportExcel()
{
    $inputFile = 'uploads/importexcel/import.csv';
    try{
        $inputFileType = \PHPExcel_IOFactory::identify($inputFile);
        $objReader = \PHPExcel_IOFactory::createReader($inputFileType);
        $objPHPExcel = $objReader->load($inputFile);
    }catch(Exception $e)
    {
            die('Error');
    }

    $sheet = $objPHPExcel->getSheet(0);
    $highestRow = $sheet->getHighestRow();
    $highestColumn = $sheet->getHighestColumn();

    for( $row = 1; $row <= $highestRow; $row++)
    {
        $rowData = $sheet->rangeToArray('A'.$row.':'.$highestColumn.$row,NULL,TRUE,FALSE);

        if($row == 1)
        {
            continue;
        }
    $test = $rowData[0][0];

    $ext = Sku3d::find()->where(['sku' => $test])->exists();



    if($ext){
        $one = Sku3d::find()->where(['sku' => $test])->one();
        $one->status = $rowData[0][14];
        $one->round = $rowData[0][19];
        $one->source = $rowData[0][29];
        $one->modeler = $rowData[0][30];
        if($one->datesubmit == NULL || $one->datesubmit == ""){
        $one->save();   
        }else{
        $day = DateTime::createFromFormat('Y-m-d', $one->datesubmit);
        $one->monthsubmit=date("Y-m-t", strtotime($one->datesubmit));
        $one->save();
        }
        if($rowData[0][14] == "Approved"){
            $one->approvedate = $rowData[0][16];
            if($one->approvedate == NULL || $one->approvedate == ""){
            $one->save();   
            }else{
            $one->approvemonth=date("Y-m-t", strtotime($one->approvedate));
            $one->save();
            }
        }else{
            $one->approvedate = Null;
            $one->approvemonth = Null;
        }

        $one->save();

        // print_r($one->getErrors());

        // die;

    }
    else{


        }

    }       

}

Thank you!


Solution

  • You're executing very similar queries when you check if the row already exist and when you are loading it from DB.

    If you are expecting more new records then existing one, you can load all skus to the array before the cycle, then check if the sku is among existing.

    Before the for cycle:

    $existingSkus = Sku3d::find()
        ->select(['sku'])
        ->indexBy('sku')
        ->column();
    

    Then inside your for cycle:

    if (array_key_exists($test, $existingsSkus)) {
        $one = Sku3d::find()->where(['sku' => $test])->one();
        // ...
    }
    

    If you are expecting that most of rows in import already exist in DB and you are going to update them, then you can skip the exists() query and load the data directly.

    $one = Sku3d::find()->where(['sku' => $test])->one();
    if(!empty($one)) {
        // ... update existing row loaded in $one
    } else {
        // ... create new row
    }
    

    Another problem in your code is that you call the save multiple times for each updated row.

    if($one->datesubmit == NULL || $one->datesubmit == ""){
        $one->save(); //first save
    } else {
        // ...
        $one->save(); //first save - else branch
    }
    // ...
    if ($rowData[0][14] == "Approved"){
        $one->approvedate = $rowData[0][16];
        if ($one->approvedate == NULL || $one->approvedate == ""){
            $one->save(); // second save
        } else {
            $one->approvemonth=date("Y-m-t", strtotime($one->approvedate));
            $one->save(); //second save - else branch
        }
    }else{
        $one->approvemonth = Null;
    }
    $one->save(); //third save when the previous condition is true, second save otherwise
    

    Do you really need to call the save before you are done with all changes? Doing one save at the end will be faster then doing 2 or 3 saves for each row.

    Also if there are many new rows in each import, you might want to use batch insert instead of creating and saving new model for each row.