I am creating a Kafka sink connector for a very old database, the schema for which I cannot alter.
Unfortunately, this database has a few columns whose names contain special characters, which do not work with Avro out of the box. Otherwise, the schema would look something like this:
{
  "type": "record",
  "name": "db2_topic",
  "fields": [
    {"name": "PERSON", "type": "string"},
    {"name": "DATE", "type": "long"},
    {"name": "PRICE", "type": "string"},
    {"name": "PRICE$", "type": "string"}
  ]
}
According to the Avro specification, a "name" must match the regex [A-Za-z_][A-Za-z0-9_]*. The build fails because the name "PRICE$" contains a character ($) outside that set.
Does anyone have experience working around this with Avro, or is there another compatible serializer that could be used instead?
Data isn't immediately serialized to Avro (plus, Avro is not the only format that Kafka Connect can use). It first passes into a Kafka Connect Struct object, which can be (minimally) transformed with a Single Message Transform such as ReplaceField$Value to rename the offending field; I'd recommend something like PRICE_USD, perhaps.
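As a minimal sketch, the relevant part of the connector configuration could look like the following; the transform alias "rename" and the replacement name PRICE_USD are just my suggestions, not anything mandated by Connect:

# Rename the field in the Struct before the converter serializes it to Avro
transforms=rename
transforms.rename.type=org.apache.kafka.connect.transforms.ReplaceField$Value
# rename pairs take the form old_name:new_name
transforms.rename.renames=PRICE$:PRICE_USD

Note that in a sink connector the transform runs after the converter has already deserialized the record, so on that side you'd reverse the pair (PRICE_USD:PRICE$) if the original database column name needs to be restored before the write.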