java csv file-io java-6

Read a multi-gigabyte .csv file using Java


I have the following two requirements:

  1. Read a CSV file and insert its rows, line by line, into the database (RDBMS) without any data manipulation.
  2. Read a CSV file and insert its data into the database (RDBMS). In this case, row Z might depend on row B, so I need a staging DB (either in-memory or another staging RDBMS).

I am analyzing multiple ways to accomplish this:

  • Using core Java and reading the file in a producer-consumer fashion.
  • Using Apache Camel and BeanIO to read the CSV file.
  • Using SQL to read the file.

Is there an established, industry-preferred way to handle this kind of task?

I found a few links on Stack Overflow, but I am looking for more options.

I am using Java 6 for the implementation.


Solution

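  • Use the NIO package (java.nio) for files of this size. Read the file in chunks through a channel and a fixed-size buffer so that memory use stays bounded; loading a multi-gigabyte file all at once is what usually leads to OutOfMemoryError. Note that on Java 6 file channel reads are still blocking (asynchronous file channels only arrived with Java 7), so the gain comes from chunked, buffered reads rather than from asynchrony. Then insert into the database with bulk (batch) commands rather than one row at a time, because single-row inserts pay the statement and round-trip overhead for every row.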
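
The chunked read described above could be sketched roughly as follows. This is only an illustration under several assumptions, not a definitive implementation: the class name `ChunkedCsvReader`, the `BatchSink` callback, and the `readInBatches` method are hypothetical names introduced here; the file is assumed to be UTF-8; and lines are split on plain newlines, so quoted CSV fields containing line breaks are not handled. The code uses only Java 6 syntax and APIs.

```java
import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.util.ArrayList;
import java.util.List;

public class ChunkedCsvReader {

    /** Hypothetical callback that receives one batch of raw CSV lines. */
    public interface BatchSink {
        void accept(List<String> lines) throws Exception;
    }

    /**
     * Reads the file through a FileChannel in chunks of bufferBytes and hands
     * the lines to the sink in groups of at most batchSize.
     * Returns the total number of lines read.
     */
    public static long readInBatches(String path, int bufferBytes, int batchSize,
                                     BatchSink sink) throws Exception {
        // A UTF-8 character needs up to 4 bytes, so the buffer must hold at least that.
        if (bufferBytes < 4 || batchSize < 1) {
            throw new IllegalArgumentException("bufferBytes must be >= 4 and batchSize >= 1");
        }
        FileInputStream in = new FileInputStream(path);
        try {
            FileChannel channel = in.getChannel();
            CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder(); // assumed encoding
            ByteBuffer bytes = ByteBuffer.allocate(bufferBytes);
            CharBuffer chars = CharBuffer.allocate(bufferBytes);
            StringBuilder line = new StringBuilder();
            List<String> batch = new ArrayList<String>(batchSize);
            long total = 0;
            boolean eof = false;

            while (!eof) {
                eof = channel.read(bytes) == -1;
                bytes.flip();
                CoderResult result = decoder.decode(bytes, chars, eof);
                if (result.isError()) {
                    result.throwException(); // malformed or unmappable input
                }
                if (eof) {
                    decoder.flush(chars);
                }
                bytes.compact(); // keep the bytes of a character split across two chunks
                chars.flip();
                while (chars.hasRemaining()) {
                    char c = chars.get();
                    if (c != '\n') {
                        line.append(c);
                        continue;
                    }
                    batch.add(stripCr(line));
                    line.setLength(0);
                    total++;
                    if (batch.size() == batchSize) {
                        sink.accept(batch);
                        batch = new ArrayList<String>(batchSize); // fresh list per batch
                    }
                }
                chars.clear();
            }
            // The last line may not end with a newline.
            if (line.length() > 0) {
                batch.add(stripCr(line));
                total++;
            }
            if (!batch.isEmpty()) {
                sink.accept(batch);
            }
            return total;
        } finally {
            in.close();
        }
    }

    /** Drops a trailing carriage return so Windows line endings also work. */
    private static String stripCr(StringBuilder sb) {
        int n = sb.length();
        return (n > 0 && sb.charAt(n - 1) == '\r') ? sb.substring(0, n - 1) : sb.toString();
    }
}
```

For the database side, a `BatchSink` implementation would typically call `PreparedStatement.addBatch()` once per line and `PreparedStatement.executeBatch()` once per batch, committing after each batch. Only one buffer and one batch of lines are on the heap at any time, so memory use does not grow with the file size.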