apache-kafka voltdb

VoltDB - Kafka Importer - Change the delimiter used for import


I have a Kafka topic whose messages are delimited by ";" and I want to import it into a VoltDB table.

I did not find anything in the official documentation about changing the delimiter. The Kafka importer (https://docs.voltdb.com/UsingVoltDB/exportimportkafka.php) gives me only two format options: CSV and TSV.

Is there any advanced configuration that allows me to only change the delimiter?

My deployment.xml:

<import>
    <configuration type="kafka" enabled="true" format="csv">
        <property name="topics">br-com-topic-ws</property>
        <property name="procedure">AUT.insert</property>
        <property name="brokers">liXXXX:9092</property>
    </configuration>
</import>

Example of my Kafka Topic:

000000ADS;20160202;20050202235900;18.99;99
000000JAM;20160202;20150201235900;18.05;20


Solution

  • The Kafka importer uses a CSV/TSV import formatter by default, which has a few options but does not let you configure the delimiter.

    You can implement a custom formatter to decode other formats. We have some test code that includes an example custom formatter on GitHub here. It comes with a run.sh that includes a function, jars, which builds an OSGi bundle containing the custom formatter code. The build uses Ant, driven by the build.xml file.
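
To give a rough idea of what such a custom formatter would have to do for the sample messages in the question, here is a minimal sketch of just the splitting step. It is plain Java and deliberately not wired into VoltDB: the class name SemicolonRecordParser and its parse method are hypothetical, and the real formatter interface (and how the bundle is registered in deployment.xml) should be taken from the example code and the documentation for your VoltDB version. It also assumes the fields never contain a quoted or escaped ";".

```java
/**
 * Hypothetical helper, not part of VoltDB: splits one semicolon-delimited
 * Kafka message into its fields. In a real custom formatter this logic
 * would sit inside the method that turns a message into procedure parameters.
 */
final class SemicolonRecordParser {

    private static final String DELIMITER = ";";

    private SemicolonRecordParser() {
        // static helper, no instances
    }

    /**
     * Splits the line on ';'. The -1 limit keeps empty fields, including
     * trailing ones, so the number of fields stays stable per record.
     * Assumption: no quoting or escaping of the delimiter inside fields.
     */
    static String[] parse(String line) {
        if (line == null) {
            throw new IllegalArgumentException("line must not be null");
        }
        return line.split(DELIMITER, -1);
    }

    public static void main(String[] args) {
        // One of the sample records from the question.
        String[] fields = parse("000000ADS;20160202;20050202235900;18.99;99");
        for (String field : fields) {
            System.out.println(field);
        }
    }
}
```

The fields are returned as strings on purpose; converting them to the column types expected by the target procedure (AUT.insert in the question) is left to the surrounding code.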