apache-poi, sax

How to identify date cells when using Apache POI's HSSFListener


I have millions of records to process, so I have to use SAX-style event parsing with the technique described here: http://poi.apache.org/spreadsheet/how-to.html#event_api. However, Date cells come through as Numbers. How can I distinguish between a Date and a Number when using HSSFListener?


Solution

  • The Excel file formats store all dates as numbers with special formatting rules. HSSF isn't converting anything; you're getting exactly what Excel stores in the file! A date cell is simply a numeric cell whose style carries a date format, so to tell the two apart you have to look at the format applied to the cell, not at the value itself.

    If you want good examples of HSSF event processing for .xls files, I would suggest you take a look at XLS2CSVmra from Apache POI and ExcelExtractor from Apache Tika.

    Both show a number of common things you'll want to do, including detecting Date cells and formatting them appropriately for display.
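To make the "dates are just numbers plus a format" point concrete, here is a minimal, POI-free sketch in plain Java. It is not how POI implements it: the class `ExcelSerialDates` and both methods are hypothetical names made up for illustration, the conversion assumes the 1900 date system and serials from 1900-03-01 onwards, and `looksLikeDateFormat` is a deliberately simplified heuristic. In a real listener you would, as far as I know, get the cell's format from POI's `FormatTrackingHSSFListener` and test it with `DateUtil.isADateFormat` rather than rolling your own check.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

public class ExcelSerialDates {

    // Day 0 of the 1900 date system, shifted to absorb Excel's phantom
    // 1900-02-29. Assumption: only valid for serials >= 61 (1900-03-01 on).
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    // Whole part of the serial = days since EPOCH, fraction = time of day.
    static LocalDateTime toDateTime(double serial) {
        long days = (long) Math.floor(serial);
        long seconds = Math.round((serial - days) * 86_400);
        return EPOCH.plusDays(days).atStartOfDay().plusSeconds(seconds);
    }

    // Simplified heuristic (hypothetical, NOT POI's logic): after removing
    // quoted literals, bracketed sections such as [Red] and escaped
    // characters, a date/time format still contains d, m, y, h or s.
    static boolean looksLikeDateFormat(String format) {
        if (format == null || format.equalsIgnoreCase("General")) {
            return false;
        }
        String stripped = format
                .replaceAll("\"[^\"]*\"", "")     // "literal text"
                .replaceAll("\\[[^\\]]*\\]", "")  // [Red], [$-409], [h]
                .replaceAll("\\\\.", "");         // \x escaped characters
        return stripped.matches(".*[dDmMyYhHsS].*");
    }

    public static void main(String[] args) {
        double value = 45000.5;  // what a numeric cell record would hand you
        String format = "m/d/yy h:mm";

        if (looksLikeDateFormat(format)) {
            System.out.println("date:   " + toDateTime(value));
        } else {
            System.out.println("number: " + value);
        }
    }
}
```

The same stored value, 45000.5, is a plain number under a format like `0.00` and a date-time under `m/d/yy h:mm`, which is why the listener has to keep track of format records as well as cell values.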