c excel-2007

Read from an Excel file in C


I want to read from an Excel file in C. The Excel 2007 file contains about 6000 rows and 2 columns. I want to store the contents in a 2-D array in C. If there is a C library or any other method for this, please let me know.


Solution

  • Excel 2007 stores the data in a bunch of files, most of them in XML, all crammed together into a zip file. If you want to look at the contents, you can rename your .xlsx to whatever.zip and then open it and look at the files inside.

    Assuming your Excel file just contains raw data, and all you care about is reading it (i.e., you do not need to update its contents and have Excel open it again), reading the data is actually pretty easy. Inside the zip file, you're looking for the directory xl/worksheets/, which contains one .xml file per worksheet (e.g., a default Excel 2007 workbook has three worksheets, stored as sheet1.xml, sheet2.xml and sheet3.xml).

    Inside each of those files, you're looking for the <sheetData> tag. Inside of that, you'll have <row> tags (one for each row of data), and inside of them <c> tags with an attribute r="RC", where RC is the usual column/row cell reference (e.g., "A1"). Each non-empty <c> tag has a nested <v> tag where you'll find the value for that cell. For a numeric cell that is the number itself; for a text cell (marked t="s") it is an index into xl/sharedStrings.xml rather than the text.

    I do feel obliged to add a warning though: while reading really simple data can indeed be just this easy, things get a lot more complex in a hurry if you decide to do much more than read simple rows/columns of numbers.
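
The approach described above can be sketched in C roughly as follows. This is a minimal illustration, not a full solution: it assumes the worksheet XML (e.g., sheet1.xml) has already been extracted from the zip and loaded into a NUL-terminated string, and that the cells you care about are plain numbers. The names read_sheet, parse_cell_ref, MAX_ROWS and MAX_COLS are made up for this example. It scans the text with strstr rather than using a real XML parser, so it skips anything with a non-numeric t attribute (shared strings, inline strings, booleans, errors).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ROWS 6000   /* hypothetical limits matching the question */
#define MAX_COLS 2

/* Convert a cell reference such as "B12" into zero-based row/column indices.
   Returns 0 on success, -1 if the reference is malformed. */
int parse_cell_ref(const char *ref, int *row, int *col)
{
    int c = 0, r = 0, letters = 0, digits = 0;

    while (*ref >= 'A' && *ref <= 'Z' && letters < 3) {
        c = c * 26 + (*ref - 'A' + 1);   /* A=1 ... Z=26, AA=27 ... */
        ref++;
        letters++;
    }
    while (*ref >= '0' && *ref <= '9' && digits < 7) {
        r = r * 10 + (*ref - '0');
        ref++;
        digits++;
    }
    if (c == 0 || r == 0)
        return -1;
    *row = r - 1;
    *col = c - 1;
    return 0;
}

/* Scan the worksheet XML for <c r="..."> cells inside <sheetData> and copy
   each numeric <v> value into grid[row][col]. Cells outside the grid and
   non-numeric cells are ignored. Returns the number of cells stored. */
int read_sheet(const char *xml, double grid[][MAX_COLS], int max_rows)
{
    int stored = 0;
    const char *p, *end;

    assert(xml != NULL);
    p = strstr(xml, "<sheetData");
    if (p == NULL)
        return 0;
    end = strstr(p, "</sheetData>");
    if (end == NULL)
        end = p + strlen(p);

    while ((p = strstr(p, "<c ")) != NULL && p < end) {
        const char *tag_end = strchr(p, '>');
        if (tag_end == NULL)
            break;
        const char *ref   = strstr(p, " r=\"");
        const char *type  = strstr(p, " t=\"");
        const char *val   = strstr(tag_end, "<v>");
        const char *close = strstr(tag_end, "</c>");
        int row, col;
        /* No t attribute on this tag (or t="n") means a numeric cell. */
        int numeric = (type == NULL || type > tag_end || type[4] == 'n');
        /* A self-closing <c .../> has no value; otherwise <v> must come
           before this cell's closing tag. */
        int has_value = (tag_end[-1] != '/' && val != NULL &&
                         close != NULL && val < close);

        if (ref != NULL && ref < tag_end && numeric && has_value &&
            parse_cell_ref(ref + 4, &row, &col) == 0 &&
            row < max_rows && col < MAX_COLS) {
            grid[row][col] = strtod(val + 3, NULL);
            stored++;
        }
        p = tag_end;
    }
    return stored;
}
```

To use it, you would declare something like double grid[MAX_ROWS][MAX_COLS], read the extracted sheet XML into a buffer, and call read_sheet(buffer, grid, MAX_ROWS). For anything beyond simple numeric data, a proper zip library and XML parser would likely be a safer choice than string scanning.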