regex shell unix unoconv

Removing new line from CSV file


I have a script that converts an Excel file to CSV using unoconv. I noticed that some records in the CSV are split across lines because of how certain cells are formatted in Excel. Is there any way to handle this in Unix?

Sample problematic data:

col1, col2, col3
jim,"washington dc
",123

The correct data should be:

col1, col2, col3
jim,"washington dc",123

Solution

  • You can use GNU sed for this. Given the following input:

    cat file
    

    col1, col2, col3
    jim,"washington dc
    ","12
    3"
    foo, bar, baz
    123, abc, xyz
    

    The sed command and its output:

    sed -E ':a;N;s/(,"[^"]*)\n/\1/;$!ba' file
    

    col1, col2, col3
    jim,"washington dc","123"
    foo, bar, baz
    123, abc, xyz
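
To see what each part of the one-liner does, here is the same script spread over several lines with comments. This is only a sketch and not part of the original answer: the function name `join_quoted_newlines` is made up for illustration, and it assumes GNU sed. It is a heuristic rather than a real CSV parser, so fields that contain escaped quotes (`""`) may not be joined, and the whole file is accumulated in sed's pattern space, which may matter for very large inputs.

```shell
# join_quoted_newlines: hypothetical helper name, assumes GNU sed.
# Reads the named files (or stdin) and removes newlines that fall
# inside a field opened with ," and not yet closed by another quote.
join_quoted_newlines() {
  sed -E '
    # Label "a" marks the top of the loop.
    :a
    # Append the next input line to the pattern space, separated by \n.
    N
    # If an opening ," is followed only by non-quote characters up to
    # a newline, the quoted field is still open: drop that newline.
    s/(,"[^"]*)\n/\1/
    # Unless this is the last input line, jump back to label "a".
    $!ba
  ' "$@"
}
```

With the sample input above saved as `file`, it could be used as `join_quoted_newlines file > fixed.csv`, which should give the same result as the one-liner.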