Search code examples
excelcoldfusioncoldfusion-10cfspreadsheet

ColdFusion CFSpreadsheet reads empty cells


I give my client a template that they are supposed to populate and then they upload the spreadsheet and I read the file with cfspreadsheet in order to copy the data into a database table.

Pretty easy. The template has only one column in it. The client can not upload a sheet with more than one column in it. This used to work.

So the one column header is ING_CAS but when I read the file in with cfspreadsheet I get COL_2, COL_3, ING_CAS. So not only are the blank cells getting read they are also being given default names because of this attribute headerrow="1".

I'm at a loss here. I keep downloading the template and selecting the extraneous blank rows and columns and deleting them but I have no control over the file once the client gets it.

Is there some strange setting I am missing that will make cfspreadsheet ignore blank cells?

<cfspreadsheet action="read" src="#theFile#" query="SpreadSheetData" headerrow="1">
<cfdump var="#SpreadSheetData#" />

I ended up writing a helper function that stripped out COL_(n) columns.

<cffunction name="CleanExcelQuery" access="public" returntype="query" output="false" hint="Strips out blank column headers picked up on read.">
    <cfargument name="SpreadSheetQuery" type="query" required="true" />

    <cfset var theColumnHeaders = SpreadSheetQuery.columnList>
    <cfset var theNewColumnHeaders = "">

    <cfloop list="#theColumnHeaders#" index="h">
        <cfif uCase(left(h,  4)) IS NOT "COL_">
            <cfset theNewColumnHeaders = ListAppend( theNewColumnHeaders,  h )>
        </cfif>
    </cfloop>

    <cfquery name="newSpreadSheetQuery" dbtype="query">
        Select #theNewColumnHeaders#
        From SpreadSheetQuery
    </cfquery>

    <cfreturn newSpreadSheetQuery />
</cffunction>

Solution

  • cfspreadsheet only omits cells that are completely blank: no value or format (such as when you select a cell and use "clear all"). If it is picking up "extra" columns, it is because one or more of the cells in that column have a value or a custom cell format. Meaning they are not really "blank".

    If you know the column position, you can use the columns attribute to only read only the values in that column. For example, to read column C:

       <cfspreadsheet action="read"
                src="c:/path/to/file.xls"
                columns="3"
                headerrow="1"
                query="qResult" />
    

    But I am not sure I understand why this is an issue. If you only need one column, simply ignore the other columns in your code. Can you elaborate on why this is causing an issue?