I'm converting a pipe-delimited text file to Parquet. To do so, I'm using the ConvertRecord processor.
Reader: CSVReader
Writer: ParquetRecordSetWriter
Schema strategy: Infer Schema in the reader and Inherit Record Schema in the writer.
But I'm getting an error in the ConvertRecord processor. It says that the index for header "ColumnName" is 24 but the record has only 24 values.
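Based on the input provided, it looks like at least one line has only 24 values while the header defines 25 or more columns, most likely because of an improperly delimited set of values. "On index 24, but there are only 24 values" means the reader is looking for position 25, counting from an offset of 0, and that position does not exist on the offending line.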
To debug this, if it's a really big CSV file, you can chain together SplitRecord and ValidateRecord to try to catch the line that has this problem.
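If it's easier to check the file outside NiFi first, something like the following could narrow it down. This is only a rough sketch in Python, not anything NiFi provides: the function name `find_bad_rows` and the sample data are made up for illustration, and it assumes the first line is the header and the delimiter is `|`. Python's csv module may treat quotes differently from the CSVReader configuration, so treat the result as a hint rather than proof.

```python
import csv
import io


def find_bad_rows(lines, delimiter="|"):
    """Return (line_number, field_count) for each row whose field count
    differs from the header's field count.

    `lines` is any iterable of text lines, e.g. an open file object.
    Assumes the first row is the header (an assumption, adjust if not).
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return []  # empty input, nothing to check
    expected = len(header)
    bad = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the 1-based count of physical lines read so far
            bad.append((reader.line_num, len(row)))
    return bad


# Tiny made-up sample: header has 3 columns, line 3 is short, line 4 is long.
sample = io.StringIO("a|b|c\n1|2|3\n4|5\n6|7|8|9\n")
print(find_bad_rows(sample))
```

For a real file you would pass an open file handle instead of the sample, for example `open(path, newline="")` with your own path. Blank lines are reported too, since they parse as rows with zero fields.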