
How can I process all the files in a directory at one time?


I have a directory with hundreds of xlsx files. What I want to do is to convert all of these files to PDF at one time, or at least several at a time. The conversion itself is working fine at the moment with foreach and cron, but it can only convert files one at a time, which increases the waiting time for the user who is expecting the PDF files.

I am thinking about parallel processing at this time but don't know how to implement this.

Here is my current code:

// Requires the ConvertAPI PHP client (composer require convertapi/convertapi-php)
require_once __DIR__ . '/vendor/autoload.php';

use ConvertApi\ConvertApi;

ConvertApi::setApiSecret('your-api-secret');

// Base path; the original snippet uses $common_path without defining it,
// so it is assumed here to point at the conversions directory.
$common_path = '/var/www/html/conversions/';

$files = glob($common_path . 'xlxs_files/*');

if (!empty($files)) {
    $i = 1;
    foreach ($files as $file) {
        // Convert at most 8 files per run
        if (is_file($file) && $i <= 8) {
            echo $i . '-----' . basename($file) . '----' . date('m/d/Y H:i:s', @filemtime($file));
            echo '<br>';

            $path_parts     = pathinfo(basename($file));
            $xlsx_file_name = basename($file);
            $pdf_file_name  = $path_parts['filename'] . '.pdf';

            echo '<br>';

            try {
                echo $log = 'conversion started for ' . basename($file) . ' on ' . date('d-M-Y h:i:s');
                echo '<br>';

                $result = ConvertApi::convert('pdf', ['File' => $common_path . 'xlxs_files/' . $xlsx_file_name], 'xlsx');
                $result->getFile()->save($common_path . 'pdf_files/' . $pdf_file_name);

                echo $log = 'conversion finished for ' . basename($file) . ' on ' . date('d-M-Y h:i:s');
                echo '<br>';

                mail('[email protected]', 'test', 'test');
                unlink($common_path . 'xlxs_files/' . $xlsx_file_name);
            } catch (Exception $e) {
                $log_file_data = createAlogFile(); // user-defined helper returning a log file path
                $log = 'There is an error with your file ' . $xlsx_file_name . ' -- ' . $e->getMessage();
                file_put_contents($log_file_data, $log . "\n", FILE_APPEND);
                continue;
            }
            $i++;
        }
    }
} else {
    echo 'nothing to process';
}

Any help would be highly appreciated. Thanks


Solution

  • Q : I am thinking about parallel processing at this time but don't know how to implement this.

    Fact #1:
    strictly speaking, this is not a true-[PARALLEL] orchestration of the processing flow; the conversions are independent of one another, so what you actually need is to run several of these independent jobs concurrently.

    Fact #2:
    the standard GNU parallel tool (for all details, kindly read man parallel) will help you maximise the performance of your processing pipeline: give it the list of all files to convert, and tweak the other parameters, such as the number of CPU cores used and the RAM resources you may reserve/allocate, to perform this batch conversion as fast as possible.

    ls _files_to_convert.mask_ | parallel --jobs _nCores_  \
                                          --load 99%        \
                                          --block _RAMblock_ \
                                          ...                 \
                                          --dry-run            \
                                          _converting_process_
    

    might serve as an immediate appetiser for what GNU parallel is capable of (the --dry-run flag only prints the commands that would run, without executing them); a concrete worker sketch follows below.
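
    One way to act on this recipe is to wrap a single conversion in a small standalone PHP worker and let GNU parallel fan it out over the files. The sketch below is only an illustration under assumptions carried over from the question: the worker name convert_one.php, the API-secret placeholder, and the directory paths are all hypothetical.

    <?php
    // convert_one.php -- hypothetical per-file worker for GNU parallel.
    // Usage sketch (4 concurrent jobs):
    //   ls /var/www/html/conversions/xlxs_files/*.xlsx | parallel --jobs 4 php convert_one.php {}
    require_once __DIR__ . '/vendor/autoload.php';

    use ConvertApi\ConvertApi;

    ConvertApi::setApiSecret('your-api-secret');

    $source = $argv[1] ?? null;
    if ($source === null || !is_file($source)) {
        fwrite(STDERR, "usage: php convert_one.php <xlsx-file>\n");
        exit(1);
    }

    // Write the PDF into the same target directory the question uses
    $target = '/var/www/html/conversions/pdf_files/'
            . pathinfo($source, PATHINFO_FILENAME) . '.pdf';

    try {
        $result = ConvertApi::convert('pdf', ['File' => $source], 'xlsx');
        $result->getFile()->save($target);
        unlink($source); // remove the source only once the PDF is safely saved
    } catch (Exception $e) {
        fwrite(STDERR, 'error converting ' . basename($source) . ': ' . $e->getMessage() . "\n");
        exit(1);
    }

    Each worker handles exactly one file, so parallel's --jobs flag directly controls how many conversions run at once.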

    All credit and thanks go to Ole Tange, the author of GNU parallel.