
beanIO: identify different records with literal


SITUATION:

I use BeanIO 2.1.0 to read a CSV file into different kinds of objects.

This is my CSV file: a list of animals (color, type, number of legs). The list also contains animals without a type (last row).

brown;cat;4
white;dog;4
brown;dog;4
black;;8

I want to read the csv-file into different animal-objects. If the type is 'cat' it should be a cat-object. The same with dog. If the type isn't cat or dog, e.g. empty or an unknown animal-type, then it should be an animal-object.

Here is the corresponding BeanIO mapping:

<beanio xmlns="http://www.beanio.org/2012/03" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <stream name="animalFile" format="csv" >
    <parser>
      <property name="delimiter" value=";"/>
    </parser>
    <record name="animal" class="zoo.Cat">
      <field name="color" />
      <field name="type" rid="true" literal="cat"/>
      <field name="legs"/>
    </record>
    <record name="animal" class="zoo.Dog">
      <field name="color" />
      <field name="type"  rid="true" literal="dog"/>
      <field name="legs"/>
    </record>
    <record name="animal" class="zoo.Animal" >
      <field name="color" />
      <field name="type"/>
      <field name="legs"/>
    </record>
  </stream>
</beanio>

My program reads the CSV file, parses it with BeanIO and calls the toString method of each parsed object.

This is the output. It looks fine:

CAT: brown;cat;4
DOG: white;dog;4
DOG: brown;dog;4
ANIMAL: black;;8

PROBLEM:

Now I only change the order of the animals in the CSV file, so that the animal with the unknown type is in the second row:

brown;cat;4
black;;8
white;dog;4
brown;dog;4

This is the new output! Once the first unknown animal is found, all the following rows are also read as unknown animals.

CAT: brown;cat;4
ANIMAL: black;;8
ANIMAL: white;dog;4
ANIMAL: brown;dog;4

QUESTION:

Is this a bug in BeanIO, or can I fix it in the BeanIO mapping?


Solution

  • EDIT: Updated answer after comments from OP.

    This is not a bug in BeanIO. You have two ways to identify a record. First, there is the literal attribute, which you have used so far. Second, you can use a regular expression (regex) on a record identifying field.

    You want to match an Animal object when the type field is neither cat nor dog, including, as you stated, when it is empty.

    For the Animal record, your type field definition could be one of the following two.

    <field name="type" rid="true" regex="\s*" />
    

    This matches whenever the type field is empty or contains only whitespace, as defined by Java regular expressions.

    OR

    <field name="type" rid="true" regex="^(?:(?!\b(cat|dog)\b).)*$" />
    

    This will match any record where the type field doesn't contain the words cat or dog.

    Try it with this Animal record:

    <record name="animal" class="zoo.Animal" >
      <field name="color" />
      <field name="type" rid="true" regex="^(?:(?!\b(cat|dog)\b).)*$" />
      <field name="legs"/>
    </record>
    

    Off-topic: strictly speaking, you are not reading a CSV file, because then the delimiter would have to be a comma. What you have is a delimited format that uses a semicolon (;) as the delimiter.

    I would also suggest that you make the names of your record definitions unique in your XML mapping file. The record name is used in error messages to report the location of a problem. If all records share the same name, you will not know where to look for the problem.
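
To see how the two regex candidates for the type field differ before putting one into the mapping, they can be tried in plain Java. This is only a sketch: the class and method names (RidRegexCheck, matchesBlankOnly, matchesNotCatOrDog) are made up for illustration, and it assumes BeanIO tests the whole field text against the pattern, which is what Matcher.matches() does here.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of BeanIO: checks the two regex
// candidates from the answer against sample values of the type field.
public class RidRegexCheck {

    // Option 1: empty or whitespace-only type.
    static final Pattern BLANK_ONLY = Pattern.compile("\\s*");

    // Option 2: any type that does not contain the word cat or dog.
    static final Pattern NOT_CAT_OR_DOG =
            Pattern.compile("^(?:(?!\\b(cat|dog)\\b).)*$");

    public static boolean matchesBlankOnly(String type) {
        // matches() requires the whole input to match (assumed BeanIO behaviour)
        return BLANK_ONLY.matcher(type).matches();
    }

    public static boolean matchesNotCatOrDog(String type) {
        return NOT_CAT_OR_DOG.matcher(type).matches();
    }

    public static void main(String[] args) {
        // "bird" stands in for an unknown animal type
        String[] types = {"cat", "dog", "", "bird"};
        for (String type : types) {
            System.out.printf("type='%s' blankOnly=%b notCatOrDog=%b%n",
                    type, matchesBlankOnly(type), matchesNotCatOrDog(type));
        }
    }
}
```

Under that assumption, the first pattern only covers rows like black;;8, while the second also covers an unknown, non-empty type. Since you want unknown animal types to become Animal objects as well, the second pattern looks like the closer fit.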