I am currently using this Miller command to convert a CSV file into a JSON array file:
mlr --icsv --ojson --jlistwrap cat sample.csv > sample.json
It works fine, but the JSON array is too large.
Can Miller split the output into many smaller JSON files of X
rows each?
For example, if the original CSV has 100 rows, can I modify the command to output 10 JSON array files, with each JSON array holding 10 converted CSV rows?
Bonus points if each JSON array can also be wrapped like this:
{
"instances":
//JSON ARRAY GOES HERE
}
You could run this:
mlr --c2j --jlistwrap put -q '
  begin {
    # number of records per output file; adjust as needed
    @batch_size = 1000;
  }
  # zero-based batch number for the current record
  index = int(floor((NR-1) / @batch_size));
  # zero-padded label: 0000, 0001, ...
  label = fmtnum(index, "%04d");
  filename = "part-" . label . ".json";
  # write the current record to its batch file
  tee > filename, $*
' ./input.csv
You will get files named part-0000.json, part-0001.json, and so on, one per 1000 records. For the 100-row example in the question, set @batch_size = 10 to get ten files of 10 rows each.
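
As for the bonus wrapping, I'm not aware of a Miller option that adds a custom envelope object around the output array, so one option is to post-process each part file. Here is a minimal sketch, assuming jq is installed and the part-*.json files from the command above are in the current directory (the wrapped-* output names are just an example):

# Wrap each part file's JSON array in an {"instances": ...} object
for f in part-*.json; do
  jq '{instances: .}' "$f" > "wrapped-$f"
done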