
Univocity - parse each TSV file row into a different type of class object


I have a TSV file with a fixed set of rows, but each row is mapped to a different Java class.

For example:

recordType  recordValue1
recordType  recordValue1 recordValue2

For the first row I have the following class:

public class FirstRow implements ItsvRecord {

    @Parsed(index = 0)
    private String recordType;

    @Parsed(index = 1)
    private String recordValue1;

    public FirstRow() {
    }
}

and for the second row I have:

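public class SecondRow implements ItsvRecord {

    @Parsed(index = 0)
    private String recordType;

    @Parsed(index = 1)
    private String recordValue1;

    // the second row type carries a third column (see the example rows above)
    @Parsed(index = 2)
    private String recordValue2;

    public SecondRow() {
    }
}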

I want to parse the TSV file directly into the respective objects, but I am running short of ideas.


Solution

  • Use an InputValueSwitch. It matches the value in a given column of each row to determine which RowProcessor to use. Example:

    Create one processor for each type of record you need to process (two in this case):

    final BeanListProcessor<FirstRow> firstProcessor = new BeanListProcessor<FirstRow>(FirstRow.class);
    final BeanListProcessor<SecondRow> secondProcessor = new BeanListProcessor<SecondRow>(SecondRow.class);
    

    Create an InputValueSwitch:

    // 0 means that the first column of each row holds the value
    // that identifies the type of record you are dealing with
    InputValueSwitch valueSwitch = new InputValueSwitch(0);
    
    // assigns the first processor to rows whose first column contains the 'firstRowType' value
    valueSwitch.addSwitchForValue("firstRowType", firstProcessor);
    
    // assigns the second processor to rows whose first column contains the 'secondRowType' value
    valueSwitch.addSwitchForValue("secondRowType", secondProcessor);
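
To make the dispatching idea concrete, here is a minimal, library-free sketch of routing rows by the value in one column. It is only an illustration and not how univocity implements InputValueSwitch; the class ColumnSwitchSketch and its methods addHandler and parse are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration only: none of these names belong to univocity.
class ColumnSwitchSketch {
    private final int column;
    private final Map<String, Consumer<String[]>> handlers = new HashMap<>();

    // 'column' is the index of the column whose value selects the handler
    ColumnSwitchSketch(int column) {
        this.column = column;
    }

    // Registers the handler that receives rows whose switch column equals 'value'
    void addHandler(String value, Consumer<String[]> handler) {
        handlers.put(value, handler);
    }

    // Splits the input into lines and tab-separated columns, then routes each row
    void parse(String tsv) {
        for (String line : tsv.split("\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] row = line.split("\t", -1);
            Consumer<String[]> handler =
                    column < row.length ? handlers.get(row[column]) : null;
            if (handler == null) {
                throw new IllegalArgumentException("No handler for row: " + line);
            }
            handler.accept(row);
        }
    }

    public static void main(String[] args) {
        List<String[]> firstRows = new ArrayList<>();
        List<String[]> secondRows = new ArrayList<>();

        ColumnSwitchSketch sketch = new ColumnSwitchSketch(0);
        sketch.addHandler("firstRowType", firstRows::add);
        sketch.addHandler("secondRowType", secondRows::add);

        sketch.parse("firstRowType\trecordValue1\n"
                + "secondRowType\trecordValue1\trecordValue2");

        System.out.println(firstRows.size() + " first-type row(s), "
                + secondRows.size() + " second-type row(s)");
    }
}
```

The real InputValueSwitch does the same kind of routing, but hands each row to a RowProcessor (here a BeanListProcessor) that also maps the columns onto the annotated fields.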
    

    Parse as usual:

    TsvParserSettings settings = new TsvParserSettings(); //configure...
    // your row processor is the switch
    settings.setProcessor(valueSwitch);
    
    TsvParser parser = new TsvParser(settings);
    
    Reader input = new StringReader(""+
            "firstRowType\trecordValue1\n" +
            "secondRowType\trecordValue1\trecordValue2");
    
    parser.parse(input);
    

    Get the parsed objects from your processors:

    List<FirstRow> firstTypeObjects = firstProcessor.getBeans();
    List<SecondRow> secondTypeObjects = secondProcessor.getBeans();
    

    The output will be*:

    [FirstRow{recordType='firstRowType', recordValue1='recordValue1'}]
    
    [SecondRow{recordType='secondRowType', recordValue1='recordValue1', recordValue2='recordValue2'}]
    
    * Assuming you have a sane toString() implemented in your classes.
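
For reference, a toString() that would produce the format shown above might look like the sketch below. The two-argument constructor is a hypothetical addition for demonstration only, and the @Parsed annotations and the ItsvRecord interface from the question are left out (as comments) so the snippet compiles without the library.

```java
// Sketch of FirstRow with a toString() in the format shown above.
class FirstRow {
    // @Parsed(index = 0) in the real class
    private String recordType;

    // @Parsed(index = 1) in the real class
    private String recordValue1;

    FirstRow() {
    }

    // Hypothetical convenience constructor, used here only to build a sample object
    FirstRow(String recordType, String recordValue1) {
        this.recordType = recordType;
        this.recordValue1 = recordValue1;
    }

    @Override
    public String toString() {
        return "FirstRow{recordType='" + recordType
                + "', recordValue1='" + recordValue1 + "'}";
    }
}
```

SecondRow would follow the same pattern with its additional recordValue2 field.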

    If you want to manage associations among the objects that are parsed:

    If your FirstRow should contain the elements parsed for records of type SecondRow, simply override the rowProcessorSwitched method:

    InputValueSwitch valueSwitch = new InputValueSwitch(0) {
        @Override
        public void rowProcessorSwitched(RowProcessor from, RowProcessor to) {
            if (from == secondProcessor) {
                List<FirstRow> firstRows = firstProcessor.getBeans();
                FirstRow mostRecentRow = firstRows.get(firstRows.size() - 1);

                mostRecentRow.addRowsOfOtherType(secondProcessor.getBeans());
                secondProcessor.getBeans().clear();
            }
        }
    };
    
    • The above assumes your FirstRow class has an addRowsOfOtherType method that takes a list of SecondRow as a parameter.
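
One possible shape for that method is sketched below. Because the switch code clears the list right after passing it in, this sketch copies the elements instead of keeping a reference to the caller's list. The field name rowsOfOtherType and the getter are assumptions made for illustration, and the parsed fields and annotations are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Placeholder for the second record type; its parsed fields are omitted here
class SecondRow {
}

// Sketch of the association-holding part of FirstRow only
class FirstRow {
    // Hypothetical field holding the SecondRow records that follow this row
    private final List<SecondRow> rowsOfOtherType = new ArrayList<>();

    // Copies the elements, since the caller clears its own list afterwards
    void addRowsOfOtherType(List<SecondRow> rows) {
        rowsOfOtherType.addAll(rows);
    }

    // Read-only view of the associated rows
    List<SecondRow> getRowsOfOtherType() {
        return Collections.unmodifiableList(rowsOfOtherType);
    }
}
```

Storing the passed-in list directly would presumably leave FirstRow with an empty list once clear() is called on it, which is why the copy matters here.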

    And that's it!

    You can even mix and match other types of RowProcessor. There's another example here that demonstrates this.

    Hope this helps.