Tags: shell, awk, sed

sed append a line to itself or awk $0 equivalent?


I have this text file containing a list like

name
surname
office
address
home_phone
office_phone
my_other_value

and I want to transform that to

'name':'Name';
'surname':'Surname';
'office':'Office';
'address':'Address';
'home_phone':'Home_Phone';
'office_phone':'Office_Phone';
'my_other_value':'My_Other_Value';

I tried to do that with a sed one-liner. I can change the case without a problem with

sed -r 's/(\b|_)([a-z])/\1\U\2/g' /tmp/list

but how do you print the unmodified line before the modified one, as you would with "print $0" in Awk?

I ended up doing it like this, but it's a bit dirty:

while read line; do echo -ne "'$line':"; echo "'$line';" | sed -r 's/(\b|_)([a-z])/\1\U\2/g'; done < <(cat /tmp/list)

I tried some sed print commands, but without success.


Solution

  • Requiring the solution to be a oneliner is disingenuous. If it takes multiple lines, it's probably better for legibility to keep it that way.

    Having said that,

    sed -r -e "s/.*/'&'/" -e 'h;s/(\b|_)([a-z])/\1\U\2/g;x;G;s/\n/:/;s/$/;/' /tmp/list
    

    seems to do what you are asking.

    Breaking it up into smaller parts,

    • sed -r selects extended regular expression syntax. This isn't portable, but since you used it in your attempt, I kept it. (Try -E if you don't have -r; -E was only added to POSIX in the 2024 edition, so older systems may lack it too.)
    • -e "s/.*/'&'/" performs a substitution to wrap the contents of the current input line inside literal single quotes, using double quotes around the script fragment to protect it from the shell.
    • -e '...' holds the rest of the script. I mashed the commands together to make a one-liner, but it would arguably be more readable broken up: equivalently, you could replace each semicolon with ' -e ' and split the script over multiple lines. This fragment uses the stronger single quotes to avoid having to backslash-escape backslashes, dollar signs, etc., which are special inside double quotes but are just themselves inside single quotes.

    Let's break here to clarify a couple of concepts. sed commands operate on something called the pattern space, which usually contains the current input line; but as we will see below, you can change that. There is a second memory location called the hold space where you can store information which you need to refer back to later, as we soon will.
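
As a minimal sketch of the two spaces in isolation: h saves a copy of the current line in the hold space, and G appends that copy back after a newline, so every input line comes out twice. The function name dup_lines is made up for this illustration, and the input is sample data rather than the asker's file.

```shell
# Hypothetical helper: print every input line twice using the hold space.
dup_lines() {
    # h: copy the pattern space into the hold space
    # G: append a newline plus the hold space to the pattern space
    sed 'h;G' "$@"
}

printf '%s\n' name surname | dup_lines
```

This already prints the unmodified line next to itself; the full script only adds a modification between the copy and the append.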

    • h copies the pattern space to the hold space. At this point, they both contain the input line with single quotes added around it.
    • s/(\b|_)([a-z])/\1\U\2/g is your original code to convert each word in the pattern space to proper case. This too relies on nonstandard features: \b and \U are GNU sed extensions, so it works on Linux / GNU sed but is not generally portable to other platforms.
    • x swaps the hold space (the original) with the pattern space (the proper-cased version). This is so that the append we do next appends the modified version after the original, rather than vice versa.
    • G appends the hold space to the end of the pattern space, separated by a newline.
    • s/\n/:/ replaces that newline with a colon.
    • s/$/;/ adds a semicolon at the end.
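
The steps above can be written with one command per -e, which is the same script as the one-liner, only easier to read. This is a sketch that assumes GNU sed (for -r, \b and \U). The wrapper name pair_lines is invented for the example, and the sample input stands in for /tmp/list.

```shell
# Hypothetical wrapper around the same sed script, one command per -e.
# Order of commands: quote the line, save it (h), proper-case the copy in
# the pattern space, swap original and modified (x), append the modified
# version (G), join the two with a colon, and end with a semicolon.
pair_lines() {
    sed -r \
        -e "s/.*/'&'/" \
        -e 'h' \
        -e 's/(\b|_)([a-z])/\1\U\2/g' \
        -e 'x' \
        -e 'G' \
        -e 's/\n/:/' \
        -e 's/$/;/' \
        "$@"
}

printf '%s\n' name home_phone my_other_value | pair_lines
```

Called as pair_lines /tmp/list, it reads the file instead of standard input.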

    Demo: https://ideone.com/wNjvxA

    If you need a properly portable solution, I would go with Awk. Somewhat ironically, a Perl or Python implementation might be quite portable in practice, although of course POSIX does not say anything about either of those (or Ruby, Haskell, Rust, Go, etc).
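
A possible Awk version, as a sketch rather than a drop-in replacement: it assumes the words in each entry are separated only by underscores, as in the sample list, whereas the sed script also capitalizes after any other word boundary. It uses only split, substr and toupper, which POSIX Awk provides. The function name titlecase_pairs is made up for the example.

```shell
# Hypothetical helper: emit 'original':'Proper_Cased'; for each input line.
# Assumes words are separated by underscores only.
titlecase_pairs() {
    awk -v q="'" '{
        n = split($0, part, "_")          # break the line into words
        out = ""
        for (i = 1; i <= n; i++) {
            # upper-case the first letter of each word, keep the rest as is
            out = out (i > 1 ? "_" : "") toupper(substr(part[i], 1, 1)) substr(part[i], 2)
        }
        # $0 is still the unmodified line, so print it before the new version
        printf "%s%s%s:%s%s%s;\n", q, $0, q, q, out, q
    }' "$@"
}

printf '%s\n' name office_phone my_other_value | titlecase_pairs
```

Passing the quote character in with -v q="'" avoids having to escape a single quote inside the single-quoted Awk program. Run it on the real file with titlecase_pairs /tmp/list.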