I want to read a CSV file with BeanIO, keeping only the lines that start with "CA" and skipping the rest. I need the values "0" "1" "2" and "3" "4" "5" from the "CA" lines:
AA123
BA456
CA789
CA012
CA345
DA678
EA901
BeanIO has an XML mapper. This is my mapping so far:
<stream name="InfoCSV" format="csv">
  <record name="info" class="com.example.Info" minOccurs="0" maxOccurs="unbounded">
    <field name="digit1" />
    <field name="digit2" />
    <field name="digit3" />
  </record>
</stream>
How do I filter the lines? I don't know how to do this in the XML mapping.
First, from the data you have shown, you must use a fixedlength format parser and not a csv one:
<stream name="InfoCSV" format="fixedlength" />
Appendix A, par. 7: streams have a configuration setting called ignoreUnidentifiedRecords, which you need in order to ignore the records/lines that don't start with "CA".
You also need to tell the parser how to identify the records/lines you are interested in. Section 4.2.1 explains how record identification works with rid="true" and the literal attribute. If we assume that the first 2 characters identify the record/line you are interested in, we have:
<field name="id" position="0" length="2" rid="true" literal="CA" />
Putting it all together:
<stream name="InfoCSV" format="fixedlength" ignoreUnidentifiedRecords="true">
  <record name="info" class="com.example.Info" minOccurs="0" maxOccurs="unbounded">
    <field name="id" position="0" length="2" rid="true" literal="CA"/>
    <field name="digit1" position="2" length="1" />
    <field name="digit2" position="3" length="1" />
    <field name="digit3" position="4" length="1" />
  </record>
</stream>
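To make the positions and the record identifier concrete, here is a rough plain-Java sketch of what the mapping above selects. It does not use BeanIO at all; the class name CaLineFilter and the nested Info class are hypothetical stand-ins (your real bean is com.example.Info), and skipping too-short lines is a simplification of mine, not something BeanIO is claimed to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; this is NOT the BeanIO API.
class CaLineFilter {

    // Hypothetical stand-in for com.example.Info.
    static final class Info {
        final String digit1;
        final String digit2;
        final String digit3;

        Info(String digit1, String digit2, String digit3) {
            this.digit1 = digit1;
            this.digit2 = digit2;
            this.digit3 = digit3;
        }
    }

    // Keeps only lines whose first 2 characters are "CA" and slices
    // the three 1-character fields at positions 2, 3 and 4.
    static List<Info> extract(List<String> lines) {
        List<Info> result = new ArrayList<>();
        for (String line : lines) {
            // Mirrors: position="0" length="2" rid="true" literal="CA".
            // Unmatched lines are skipped, which is roughly the effect of
            // ignoreUnidentifiedRecords="true". Lines shorter than 5
            // characters are also skipped here (a simplification).
            if (line.length() < 5 || !line.startsWith("CA")) {
                continue;
            }
            result.add(new Info(
                    line.substring(2, 3),   // digit1: position 2, length 1
                    line.substring(3, 4),   // digit2: position 3, length 1
                    line.substring(4, 5))); // digit3: position 4, length 1
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "AA123", "BA456", "CA789", "CA012", "CA345", "DA678", "EA901");
        for (Info info : extract(lines)) {
            System.out.println(info.digit1 + " " + info.digit2 + " " + info.digit3);
        }
    }
}
```

With the sample data from the question this prints 7 8 9, 0 1 2 and 3 4 5, one line each, so note that the "CA789" line is picked up as well.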