java csv jackson fasterxml

How to rename Columns via Lambda function - fasterXML


I'm using the FasterXML library to parse my CSV file. The CSV file has the column names in its first line. Unfortunately, I need the columns to be renamed. I have a lambda function for this: I pass in the value read from the CSV file and get the new value back.

My code looks like this, but it does not work.

    CsvSchema csvSchema = CsvSchema.emptySchema().withHeader();
    ArrayList<HashMap<String, String>> result = new ArrayList<HashMap<String, String>>();
    MappingIterator<HashMap<String, String>> it = new CsvMapper().reader(HashMap.class)
            .with(csvSchema)
            .readValues(new File(fileName));

    while (it.hasNext())
        result.add(it.next());

    System.out.println("changing the schema columns.");

    for (int i = 0; i < csvSchema.size(); i++) {
        String name = csvSchema.column(i).getName();
        String newName = getNewName(name);
        csvSchema.builder().renameColumn(i, newName);
    }
    csvSchema.rebuild();

When I try to print out the columns later, they are still the same as in the top line of my CSV file.

Additionally, I noticed that csvSchema.size() equals 0. Why?


Solution

  • You could use uniVocity-parsers for this instead. The following solution streams the input rows to the output, so you don't need to load everything into memory before writing your data back with new headers. That should make it noticeably faster on large files:

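    import java.io.Reader;
    import java.io.StringReader;
    import java.io.StringWriter;
    import java.io.Writer;

    import com.univocity.parsers.common.ParsingContext;
    import com.univocity.parsers.common.processor.AbstractRowProcessor;
    import com.univocity.parsers.csv.CsvParser;
    import com.univocity.parsers.csv.CsvParserSettings;
    import com.univocity.parsers.csv.CsvWriter;
    import com.univocity.parsers.csv.CsvWriterSettings;

    public static void main(String ... args) throws Exception{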
    
        Writer output = new StringWriter(); // use a FileWriter for your case
    
        CsvWriterSettings writerSettings = new CsvWriterSettings(); //many options here - check the documentation
        final CsvWriter writer = new CsvWriter(output, writerSettings);
    
        CsvParserSettings parserSettings = new CsvParserSettings();  //many options here as well
        parserSettings.setHeaderExtractionEnabled(true); // indicates the first row of the input are headers
    
        parserSettings.setRowProcessor(new AbstractRowProcessor(){
    
            public void processStarted(ParsingContext context) {
                writer.writeHeaders("Column A", "Column B", "... etc");
            }
    
            public void rowProcessed(String[] row, ParsingContext context) {
                writer.writeRow(row);
            }
    
            public void processEnded(ParsingContext context) {
                writer.close();
            }
        });
    
        CsvParser parser = new CsvParser(parserSettings);
        Reader reader = new StringReader("A,B,C\n1,2,3\n4,5,6"); // use a FileReader for your case
        parser.parse(reader); // all rows are parsed and submitted to the RowProcessor implementation of the parserSettings.
    
        System.out.println(output.toString());
        //nothing else to do. All resources are closed automatically in case of errors.
    }
    

    If you want to reorder or drop columns, you can select them with parserSettings.selectFields("B", "A").

    Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
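
The example above writes fixed header names. If the new names should come from the asker's renaming lambda instead, the same streaming idea can be sketched without any CSV library. This is only a minimal illustration, not the uniVocity or Jackson API: the class name HeaderRenamer and the method renameHeaders are hypothetical, and the naive split on commas assumes the file has no quoted fields, embedded commas, or embedded line breaks. For real-world CSV, a proper parser should do the splitting.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library mentioned in this thread.
public class HeaderRenamer {

    /**
     * Copies CSV text from input to output one line at a time.
     * Each name in the first line is passed through the rename function;
     * all following lines are copied unchanged.
     * Assumption: plain CSV without quoted fields or embedded commas.
     */
    public static void renameHeaders(Reader input, Writer output,
                                     UnaryOperator<String> rename) throws IOException {
        BufferedReader in = new BufferedReader(input);

        String header = in.readLine();
        if (header == null) {
            return; // empty input, nothing to write
        }

        // limit -1 keeps trailing empty column names instead of dropping them
        String renamed = Arrays.stream(header.split(",", -1))
                .map(rename)
                .collect(Collectors.joining(","));
        output.write(renamed);
        output.write('\n');

        // Stream the data rows through without holding them in memory
        String line;
        while ((line = in.readLine()) != null) {
            output.write(line);
            output.write('\n');
        }
        output.flush();
    }

    public static void main(String[] args) throws IOException {
        Writer output = new StringWriter(); // use a FileWriter for a real file
        Reader input = new StringReader("A,B,C\n1,2,3\n4,5,6"); // use a FileReader for a real file

        // The lambda stands in for the asker's getNewName function
        renameHeaders(input, output, name -> "Column " + name);

        System.out.print(output);
    }
}
```

Because only the header line is transformed and every data row is written as soon as it is read, memory use stays constant regardless of file size.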