Search code examples
javacsvmergemultiple-columnssupercsv

Merging multiple csv files in java


I have 3 csv files. Each file contains 10 columns but only 1 of them has different values.

Dutch Dutch

French French

English English

As shown in the pictures, only the destination name is different per language.

What I want to become is csv file looking like this: Merged csv's

How do I put the columns between the existing columns and how do I change their headers as wel (DEST_NAME_DISPLAY --> DEST_NAME_EN)?

So far I've little experience with csv manipulation besides reading a csv file with Super csv

Example of what i know so far

public List<City> readWithCsvMapReader(String file, String source) throws IOException, AuthenticationException, RepositoryException {
    ICsvMapReader mapReader = null;
    FileInputStream stream = new FileInputStream(file);
    InputStreamReader reader = new InputStreamReader(stream, ENCODING);
    try {
        mapReader = new CsvMapReader(reader, PREFERENCE);
        final String[] header = mapReader.getHeader(true);
        final CellProcessor[] processors = getProcessors();
        Map<String, Object> locationMap;
        while ((locationMap = mapReader.read(header, processors)) != null) {
            /*do some logic*/
        }
    } finally {
        if (mapReader != null) {
            mapReader.close();
        }
    }
}

public CellProcessor[] getProcessors() {
    final CellProcessor[] processors = new CellProcessor[]{
            new NotNull(), // COU_ID
            new NotNull(), // COU_NAME
            new NotNull(), // DEST_ID
            new NotNull(), // DEST_NAME_DISPLAY
            new Optional(), //DEST_PARENT_ID
            new NotNull(), // DEST_STATION_COUNT
            new NotNull(), // DEST_CHILD_DEST_COUNT
            new NotNull(), // DEST_FLAG_APT
            new NotNull(), // DEST_FLAG_WIN
            new NotNull(), // DEST_FLAG_DEL
    };
    return processors;
}

Solution

  • Open all three CSV files.

    In each iteration, read one line from each file, get all cell values from one and the different cell's values from the other two. Merge into a new row containing all needed data, write this row to output CSV.

    Close all files. Treat for exceptional states (different number of lines in each file, different ID on current row for each file, ...).