How can I take the text matched by a regular expression at the beginning of one line and copy it to the beginning of the following lines?


I'm writing a script that converts text from PDF documents and formats it as CSV for later use. I've run into an issue where I need to add missing information to certain lines to complete the data, and I don't know how to achieve it with sed. The document looks like this:

# "date","description","cost","total"
"31 01 19","Purchase from SHOP","1.23","1.23"
"Direct debit to COMPANY","2.34","3.57"
"Purchase from SHOP","3.45","7.02"
"01 02 19","Received from PERSON","1.23","5.79"
"Purchase to SHOP","4.56","10.35"

When it should look like this:

# "date","description","cost","total"
"31 01 19","Purchase from SHOP","1.23","1.23"
"31 01 19","Direct debit to COMPANY","2.34","3.57"
"31 01 19","Purchase from SHOP","3.45","7.02"
"01 02 19","Received from PERSON","1.23","5.79"
"01 02 19","Purchase to SHOP","4.56","10.35"

How could I achieve this with sed?

I have tried:

/^(\"[[:digit:]]{2} [[:digit:]]{2} [[:digit:]]{2}\",)/{
    h
    N
    /^(\"[^\"]*\",\"(0|[1-9][[:digit:]]{,2}(,[[:digit:]]{1,3})*)\.[[:digit:]]{2})\",?{2})/{
        G
        s/((.*))\n((.*))/\2,\1/
    }
}

But that does not seem to do anything, even with the regular expressions tested to ensure they match what I'm after. Am I doing something wrong here or is there a better way to do this?


Solution

  • This might work for you (GNU sed):

    sed -E 'N;/\n".. .. .."/!s/^([^,]+,).*\n/&\1/;P;D' file
    

    Append the following line and, if it does not start with a date, insert the previous line's date at its beginning. Then print and delete the previous line and repeat.
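
    For reference, here is the same one-liner spread over several lines with comments, as a rough sketch rather than a definitive version. The function name `fill_dates` is made up for illustration. The date test is tightened from `".. .. .."` to digits only, which is my own variation and not part of the answer above. It still assumes GNU sed, which prints the final line when `N` finds no more input.

```shell
#!/bin/sh
# fill_dates: hypothetical wrapper name, not from the original answer.
# Assumes GNU sed (-E, and N printing the pattern space at end of input).
fill_dates() {
    sed -E '
        # Append the next input line, giving "current\nnext".
        N
        # If the next line does not start with a quoted date, copy the
        # first field of the current line (with its comma) in front of it.
        /\n"[0-9]{2} [0-9]{2} [0-9]{2}"/!s/^([^,]+,).*\n/&\1/
        # Print the current line only (up to the first newline).
        P
        # Delete the current line and restart with the next line
        # already in the pattern space, without reading new input.
        D
    ' "$@"
}
```

    It can be called as `fill_dates file`, or with no argument to read standard input. Because each filled-in line becomes the "current" line on the next pass, the date carries forward across any number of consecutive lines that lack one.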