
univocity - How to parse string from selected char


I have the following .csv file:

tt0102057, 6, 2010-06-19, Hook

tt0102059, 7 ,2013-06-23, Hot Shots!

tt0102070, 5, 2010-02-10, Hudson Hawk

I need to parse the value in the first column as an int, not a String. So I need to skip the first two characters and read the rest as an integer.

How can I do this with the univocity parser during parsing itself?

The following code works and saves the data to beans:

    BeanListProcessor<univMovie> rowProcessor = new BeanListProcessor<univMovie>(univMovie.class);
    CsvParserSettings settings = new CsvParserSettings();
    settings.getFormat().setLineSeparator("\n");
    settings.setProcessor(rowProcessor);
    settings.setHeaderExtractionEnabled(true);

    CsvParser parser = new CsvParser(settings);
    parser.parse(new FileReader("src/main/resources/movie.csv"));
    List<univMovie> beans = rowProcessor.getBeans();

Solution

  • You have many options:

    The easiest is to add a @Replace annotation, in your univMovie class, above the field that will receive that data:

    @Parsed
    @Replace(expression = "tt", replacement = "")
    int yourField;
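
To make the effect of that replacement concrete, here is a plain-Java sketch of what it amounts to, written without univocity: the expression is treated as a regular expression, its matches are replaced, and the remainder is converted to an int. The class and method names are hypothetical, and this only illustrates the transformation; it is not univocity's internal code.

```java
class ReplaceSketch {
    // Hypothetical helper: replace every regex match, then parse what is left.
    static int replaceAndParse(String raw, String regex, String replacement) {
        String cleaned = raw.trim().replaceAll(regex, replacement);
        return Integer.parseInt(cleaned);
    }

    public static void main(String[] args) {
        // "tt0102057" -> "0102057" -> 102057 (leading zeros are lost in an int)
        System.out.println(replaceAndParse("tt0102057", "tt", ""));
        // Anchoring the expression limits the match to the start of the value.
        System.out.println(replaceAndParse("tt0102059", "^tt", ""));
    }
}
```

Note that the leading zero disappears once the value is an int, so if the original id has to be rebuilt later it would need to be zero-padded again.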
    

    If your values are trickier and a regular expression would not be an easy or clear solution, you can instead put the @Parsed annotation on a method that sets the field for you:

    @Parsed
    void setYourField(String value){
        String cleanValue = someMethodToCleanYourValue(value);
        yourField = Integer.parseInt(cleanValue);
    }
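
The someMethodToCleanYourValue call above is a placeholder. As one possible sketch, assuming every id is a two-character prefix followed only by digits (as in the sample rows), it could skip the first two characters and validate the rest. The class name MovieIdCleaner is hypothetical.

```java
class MovieIdCleaner {
    // Hypothetical stand-in for someMethodToCleanYourValue:
    // drops the two-character prefix and checks that only digits remain.
    static String someMethodToCleanYourValue(String value) {
        if (value == null) {
            throw new IllegalArgumentException("id is missing");
        }
        String trimmed = value.trim();
        if (trimmed.length() <= 2) {
            throw new IllegalArgumentException("id too short: " + value);
        }
        String digits = trimmed.substring(2); // skip the first two characters
        if (!digits.chars().allMatch(Character::isDigit)) {
            throw new IllegalArgumentException("unexpected id format: " + value);
        }
        return digits;
    }

    public static void main(String[] args) {
        String clean = someMethodToCleanYourValue("tt0102070");
        System.out.println(Integer.parseInt(clean));
    }
}
```

Failing loudly on an unexpected format is a design choice here; returning a default value instead may suit files with known bad rows.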
    

    You can also tell the processor to apply the conversion to one or more fields, selected by index:

    rowProcessor.convertIndexes(Conversions.replace("tt", ""))
        .set(0); //one or more indexes
    

    Hope this helps