Search code examples
node.jscsvperlsave-asexceljs

How to Flatten / Recompile Excel Spreadsheet Using sheetjs or exceljs on Write


We use excel as a configuration file for clients. However, our processes only run on linux servers. We need to take a master file, update all the client workbooks with the new information, and commit to GitLab. The users then check it out, add their own changes, commit back to GitLab and a process promotes the workbook to Server A.

This process works great using nodeJS (exceljs)

Another process on a different server is using perl to pick up the workbook and then saves each sheet as a csv file.

The problem is, what gets written out is the data from the ORIGINAL worksheet and not the updated changes. This is true of both perl and nodejs. Code for perl and nodejs xlsx to csv is at the end of the post.

Modules Tried: perl : Spreadsheet::ParseExcel; Spreadsheet::XLSX; nodejs: node-xlsx, exceljs

I assume it has to do with Microsoft using XML inside the excel wrapper, it keeps the old version as history and since it was the original sheet name, it gets pulled instead of the updated latest version.

When I manually open in Excel, everything is correct with the new info as expected.

When I use "Save as..." instead of "Save" then the perl process is able to correctly write out the updated worksheet as csv. So our workaround is having the users always "Save as.." before committing their extra changes to GitLab. We'd like to rely on training, but the sheer number of users and clients makes trusting that the user will "Save AS..." is not practical.

Is there a way to replicate a "Save As..." during my promotion to Server A or at least be able to tell if the file had been saved correctly? I'd like to stick with excelJS, but I'll use whatever is necessary to replicate the "Save as..." which seems to recompile the workbook.

In addition to nodejs, I can use perl, python, ruby - whatever it takes - to make sure the csv creation process picks up the new changes.

Thanks for your time and help.

#!/usr/bin/env perl 
    
     use strict; 
     use warnings; 
     use Carp; 
     use Getopt::Long; 
     use Pod::Usage; 
     use File::Basename qw/fileparse/; 
     use File::Spec; 
     use Spreadsheet::ParseExcel; 
     use Spreadsheet::XLSX; 
     use Getopt::Std;
       
     my %args = (); 
     my $help = undef; 
     GetOptions( 
     \%args, 
     'excel=s', 
     'sheet=s', 
     'man|help'=>\$help, 
     ) or die pod2usage(1); 
      
     pod2usage(1) if $help; 
     pod2usage(-verbose=>2, exitstatus=>0, output=>\*STDOUT) unless $args{excel} || $args{sheet}; 
     
     pod2usage(3) if $help; 
     pod2usage(-verbose=>2, exitstatus=>3, output=>\*STDOUT) unless $args{excel}; 
     
    
     if (_getSuffix($args{excel}) eq ".xls") { 
     my $file = File::Spec->rel2abs($args{excel}); 
      
     if (-e $file) { 
     print _XLS(file=>$file, sheet=>$args{sheet}); 
     } else { 
     exit 1;
     die "Error: Can not find excel file. Please check for exact excel file name and location. \nError: This Program is CASE SENSITIVE. \n"; 
     } 
     } 
     elsif (_getSuffix($args{excel}) eq ".xlsx") { 
     my $file = File::Spec->rel2abs($args{excel}); 
     
     
     if (-e $file) { 
      print _XLSX(file=>$file, sheet=>$args{sheet});
     } 
     else { 
     exit 1;
     die "\nError: Can not find excel file. Please check for exact excel file name and location. \nError: This Program is CASE SENSITIVE.\n"; 
     }
     } 
     else { 
     exit 5; 
     }
      
      
     sub _XLS { 
     my %opts = ( 
     file => undef, 
     sheet => undef, 
     @_, 
     ); 
      
     my $aggregated = (); 
     my $parser = Spreadsheet::ParseExcel->new(); 
     my $workbook = $parser->parse($opts{file}); 
      
     if (!defined $workbook) { 
     exit 3;
     croak "Error: Workbook not found"; 
     } 
      
     foreach my $worksheet ($workbook->worksheet($opts{sheet})) { 
     
     if (!defined $worksheet) { 
       exit 2;
       croak "\nError: Worksheet name doesn't exist in the Excel File. Please check the WorkSheet Name. \nError: This program is CASE SENSITIVE.\n\n";
     }
     
     my ($row_min, $row_max) = $worksheet->row_range(); 
     my ($col_min, $col_max) = $worksheet->col_range(); 
    
     foreach my $row ($row_min .. $row_max){ 
     foreach my $col ($col_min .. $col_max){ 
     my $cell = $worksheet->get_cell($row, $col); 
     if ($cell) {
         $aggregated .= $cell->value().','; 
     }
     else {
         $aggregated .= ',';
     }
     }
     $aggregated .= "\n"; 
     } 
     } 
     return $aggregated; 
     } 
      
      
      
     sub _XLSX { 
     eval {
     my %opts = ( 
     file => undef, 
     sheet => undef, 
     @_,
     ); 
    
      
     my $aggregated_x = (); 
     my $excel = Spreadsheet::XLSX->new($opts{file}); 
      
    foreach my $sheet ($excel->worksheet($opts{sheet})) {
    
        if (!defined $sheet) { 
          exit 2;
          croak "Error: WorkSheet not found"; 
        } 
    
     if ( $sheet->{Name} eq $opts{sheet}) { 
            $sheet->{MaxRow} ||= $sheet->{MinRow}; 
            foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) { 
                    $sheet->{MaxCol} ||= $sheet->{MinCol}; 
                            foreach my $col ($sheet->{MinCol} .. $sheet->{MaxCol}) { 
                            my $cell = $sheet->{Cells}->[$row]->[$col]; 
                            if ($cell) {
                            $aggregated_x .= $cell->{Val}.','; 
                            }
                            else {
                                    $aggregated_x .= ',';
                            } 
                    }
             $aggregated_x .=  "\n"; 
            } 
     }
     } 
     return $aggregated_x; 
     } 
     
     };
     if ($@) {
     exit 3;
     }
     
     
     sub _getSuffix { 
     my $f = shift; 
     my ($basename, $dirname, $ext) = fileparse($f, qr/\.[^\.]*$/); 
     return $ext; 
     }
     
     
     
      sub _convertlwr{ 
      my $f = shift; 
      my ($basename, $dirname, $ext) = fileparse($f, qr/\.[^\.]*$/); 
      return $ext; 
      }
var xlsx = require('node-xlsx')
var fs = require('fs')
var obj = xlsx.parse(__dirname + '/test2.xlsx') // parses a file
var rows = []
var writeStr = ""

//looping through all sheets
for(var i = 0; i < obj.length; i++)
{
    var sheet = obj[i]
    //loop through all rows in the sheet
    for(var j = 0; j < sheet['data'].length; j++)
    {
            //add the row to the rows array
            rows.push(sheet['data'][j])
    }
}

//creates the csv string to write it to a file
for(var i = 0; i < rows.length; i++)
{
    writeStr += rows[i].join(",") + "\n"
}

//writes to a file, but you will presumably send the csv as a      
//response instead
fs.writeFile(__dirname + "/test2.csv", writeStr, function(err) {
    if(err) {
        return console.log(err)
    }
    console.log("test.csv was saved in the current directory!")

Solution

  • The answer is its impossible. In order to update data inside a workbook that has excel functions, you must open it in Excel for the formulas to trigger. It's that simple.

    You could pull the workbook apart, create your own javascript functions, run the data through it and then write it out, but there are so many possible issues that it is not recommended.

    Perhaps one day Microsoft will release a linux Excel engine API for linux. But its still unlikely that such a thing would work via command line without invoking the GUI.