I have a script that converts an Excel file to CSV using unoconv. I noticed that some records in the CSV are split across two lines because of how certain cells are formatted in Excel. Is there any way to handle this in Unix?
Sample problematic data:
col1, col2, col3
jim,"washington dc
",123
The correct data should be:
col1, col2, col3
jim,"washington dc",123
You may use GNU sed for this. Sample input:
cat file
col1, col2, col3
jim,"washington dc
","12
3"
foo, bar, baz
123, abc, xyz
And the sed command:
sed -E ':a;N;s/(,"[^"]*)\n/\1/;$!ba' file
col1, col2, col3
jim,"washington dc","123"
foo, bar, baz
123, abc, xyz
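For readers who want to see what the one-liner is doing, here is the same script spread over several lines with comments. This is only a sketch: the temporary file and the variable names `sample` and `fixed` are made up for illustration, and the input is a shortened copy of the sample above.

```shell
# Hypothetical sample file with line breaks inside quoted fields.
sample=$(mktemp)
printf '%s\n' \
  'col1, col2, col3' \
  'jim,"washington dc' \
  '","12' \
  '3"' \
  'foo, bar, baz' > "$sample"

# Same logic as the one-liner, one sed command per line.
fixed=$(sed -E '
  # Label marking the start of the loop.
  :a
  # Append the next input line to the pattern space, separated by a newline.
  N
  # If a field opened by ," reaches a newline before its closing quote,
  # delete that newline so the field is joined back together.
  s/(,"[^"]*)\n/\1/
  # Unless this is the last line, jump back to the label and keep reading.
  $!ba
' "$sample")

printf '%s\n' "$fixed"
rm -f "$sample"
```

A few things to keep in mind: the loop accumulates the whole file in the pattern space, so it is best suited to files of moderate size. The pattern only recognises quoted fields that follow a comma, so a broken quoted field in the first column would not be joined, and fields containing escaped quotes (`""`) may not be handled as expected.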