Generating csv from text file in Linux command line with sed, awk or other


I have a file with thousands of lines that I would like to convert to CSV for later processing.

The original file looks like this:

cc_1527 (ILDO_I173_net9 VSSA) capacitor_mis c=9.60713e-16
cc_1526 (VDD_MAIN Istartupcomp_I115_G7) capacitor_mis \
    c=4.18106e-16
cc_1525 (VDD_MAIN Istartupcomp_I7_net025) capacitor_mis \
    c=9.71462e-16
cc_1524 (VDD_MAIN Istartupcomp_I7_ST_net14) \
    capacitor_mis c=4.6011e-17
cc_1523 (VDD_MAIN Istartupcomp_I7_ST_net15) \
    capacitor_mis c=1.06215e-15
cc_1522 (VDD_MAIN ILDO_LDO_core_Istartupcomp_I7_ST_net16) \
    capacitor_mis c=1.37289e-15
cc_1521 (VDD_MAIN ILDO_LDO_core_Istartupcomp_I7_I176_G4) capacitor_mis \
    c=6.81758e-16

The problem is that some records continue onto the next line, which is indicated by a trailing "\".

The final csv format for the first 5 lines of the original text should be:

cc_1527,(ILDO_I173_net9 VSSA),capacitor_mis,c=9.60713e-16
cc_1526,(VDD_MAIN Istartupcomp_I115_G7),capacitor_mis,c=4.18106e-16
cc_1525,(VDD_MAIN Istartupcomp_I7_net025),capacitor_mis,c=9.71462e-16

So each record is now on a single line and the "\" characters have been removed.

Please note that there may be spaces at the beginning of each line, so these should be trimmed before anything else is done.

Any ideas on how to accomplish this?

Thanks in advance.

Best regards, Pedro


Solution

  • Using some of the more obscure features of sed (it can do more than s///):

    $ sed -E ':line /\\$/ {s/\\$//; N; b line}; s/[[:space:]]+/,/g' demo.txt
    cc_1527,(ILDO_I173_net9,VSSA),capacitor_mis,c=9.60713e-16
    cc_1526,(VDD_MAIN,Istartupcomp_I115_G7),capacitor_mis,c=4.18106e-16
    cc_1525,(VDD_MAIN,Istartupcomp_I7_net025),capacitor_mis,c=9.71462e-16
    cc_1524,(VDD_MAIN,Istartupcomp_I7_ST_net14),capacitor_mis,c=4.6011e-17
    cc_1523,(VDD_MAIN,Istartupcomp_I7_ST_net15),capacitor_mis,c=1.06215e-15
    cc_1522,(VDD_MAIN,ILDO_LDO_core_Istartupcomp_I7_ST_net16),capacitor_mis,c=1.37289e-15
    cc_1521,(VDD_MAIN,ILDO_LDO_core_Istartupcomp_I7_I176_G4),capacitor_mis,c=6.81758e-16
    

    Basically:

    • Read a line into the pattern space.

    • :line /\\$/ {s/\\$//; N; b line}: If the pattern space ends in a \, remove that backslash, read the next line and append it to the pattern space, and repeat this step.

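    • s/[[:space:]]+/,/g: Convert every run of 1 or more whitespace characters (including the newline left by N) to a single comma. Note that this also replaces the space inside the parentheses, unlike the output requested in the question.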

    • Print the result, and go back to the beginning with a new line.
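
The command above also converts the space inside the parentheses and would leave a leading comma on any line that starts with whitespace. As a rough sketch of one way to get closer to the output requested in the question, the script below additionally trims leading and trailing whitespace, tolerates whitespace after the "\", and restores the spaces inside the parentheses. It assumes the data contains no commas and no nested parentheses; the function name to_csv is only an illustrative wrapper, not part of the original answer.

```shell
# Hypothetical helper: join continuation lines and emit comma-separated fields,
# keeping the text inside (...) as one field.
# Assumes the input has no commas and no nested parentheses.
to_csv() {
    sed -E '
        # While the pattern space ends in a backslash (optionally followed
        # by whitespace), drop it and append the next input line.
        :line
        /\\[[:space:]]*$/ {
            s/\\[[:space:]]*$//
            N
            b line
        }
        # Trim leading and trailing whitespace of the joined record.
        s/^[[:space:]]+//
        s/[[:space:]]+$//
        # Turn every run of whitespace (including embedded newlines) into a comma.
        s/[[:space:]]+/,/g
        # Inside each parenthesised group, turn the commas back into spaces,
        # one comma per iteration, until none are left.
        :paren
        s/(\([^(),]*),([^()]*\))/\1 \2/
        t paren
    ' "$@"
}
```

Calling to_csv demo.txt (or piping text into to_csv) should then print records such as cc_1526,(VDD_MAIN Istartupcomp_I115_G7),capacitor_mis,c=4.18106e-16. Writing each sed command on its own line also avoids relying on how a particular sed parses labels followed by ";" or "}".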