Search code examples
phpexcelphpexcelcodeplexphpexcelreader

Reading empty cells as data cells


I am reading an Excel file where sometimes PHPExcel codeplex considers an empty cell as a value cell. For example:

If my Excel has 128 Columns, the maximum is DV sometime it will vary depend upon Excel and users requirement. Maximum is 126 columns. Minimum is 8 columns. And also maximum row is 250.

----------------------
A | B | ..... |DU |DV|
----------------------

I used following code to load Excel:

   set_include_path(get_include_path() . PATH_SEPARATOR . 'Classes/');
   include 'PHPExcel/IOFactory.php';    

   $inputFileName = $target_file; 
    try {
            $inputFileType = PHPExcel_IOFactory::identify($inputFileName);
            $objReader = PHPExcel_IOFactory::createReader($inputFileType);
            $objPHPExcel = $objReader->load($inputFileName);
    } catch(Exception $e) {
        die('Error loading file "'.pathinfo($inputFileName,PATHINFO_BASENAME).'": '.$e->getMessage());
    }

Now the problem is PHPExcel is showing the wrong maximum column and rows, maximum column is 126 (DV) and maximum row is 245.

But when I used this following code am getting wrong values:

 $lastColumn =  $objPHPExcel->setActiveSheetIndex(0)->getHighestColumn();
 $lastRow    =  $objPHPExcel->setActiveSheetIndex(0)->getHighestRow();

instead of 126 I am getting 1800+ (AMK) as maximum column and maximum row it showing as 65536.

How to fix this issue?

Even clearing and creating a new Excel this sometimes happens.

How to set maximum column and rows before reading (loading) Excel or how to avoid empty cells?


Solution

  • Rows and columns are counted by getHighestColumn() and getHighestRow() if they contain anything (including style or print information) rather than simply having content. These values are also calculated when the spreadsheet is loaded, so may not be accurate if you subsequently add new rows or columns to the worksheet.

    Instead, use the getHighestDataColumn() and getHighestDataRow() methods to return the highest row and column that actually contain data values in cells. While less efficient, they actually calculate the highest cell reference when called, which is slower, but always accurate.

    Note also that a cell which may appear empty to you may still contain a null or an empty string. This is still a data value.