Tags: regex, unix, sed, ksh, aix

Using regex backreferences in sed


I would like to replace each run of multiple spaces in a file with a single character. Example:

cat      kill    rat
dog      kill    cat

I used the following regex, which seemed to match at http://www.regexpal.com/ but did not work in sed.

([^ ])*([ ])*

I used the sed command like so:

sed s/\(\[\^\ \]\)*\(\[\ \]\)*/\$1\|/g < inputfile

I expect,

cat|kill|rat
dog|kill|cat

But I couldn't get it to work. Any help would be much appreciated. Thanks.

Edit: kindly note that cat/dog could be any characters other than whitespace.


Solution

  • sed writes backreferences with a backslash, so use \1 instead of $1. Surround your expression with single quotes; unquoted, the shell strips the backslashes before sed ever sees them:

    sed 's/match/replace/g' < inputfile
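
    As a sketch of the \1 syntax applied to the question's input (the function name is made up for illustration, and the simpler command further down does not need a backreference at all), this captures each word and puts it back followed by a '|':

    ```shell
    # Hypothetical helper name; reads text on stdin.
    # \( \) captures a run of non-space characters, \1 puts it back.
    # "  *" (two spaces, then *) means one or more spaces in basic regex syntax.
    squeeze_with_backref() {
        sed 's/\([^ ][^ ]*\)  */\1|/g'
    }

    printf 'cat      kill    rat\n' | squeeze_with_backref   # prints cat|kill|rat
    ```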
    

    Man pages are the best invention in the Linux world: man sed

    Watch out for *: it means "zero or more", so it can match nothing at all (the empty string). If you want to replace each run of spaces with a '|', use this RE:

    sed -r 's/ +/|/g' < inputfile
    

    From man sed:

    -r, --regexp-extended
       use extended regular expressions in the script.
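
    To see why * is risky, compare it with a "one or more" pattern. The sketch below sticks to POSIX basic regular expressions, so it avoids -r, which is a GNU sed option that other sed implementations may lack (check man sed on your system). The function name is illustrative only.

    ```shell
    # " *" can match zero spaces, so with the g flag it also matches the
    # empty string and inserts '|' in places you did not intend:
    printf 'cat  rat\n' | sed 's/ */|/g'

    # "  *" (two spaces, then *) requires at least one space,
    # which is what " +" means in extended syntax.
    spaces_to_pipe() {
        sed 's/  */|/g'
    }

    printf 'cat      kill    rat\ndog      kill    cat\n' | spaces_to_pipe
    # prints:
    # cat|kill|rat
    # dog|kill|cat
    ```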
    

    You don't need any backreferences if you just want to replace runs of spaces. Replace the space in the pattern with \s (a GNU sed extension) if you want to match tabs too.
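
    Where \s is not supported, the POSIX character class [[:space:]] is a portable way to cover tabs as well. A minimal sketch, again with an illustrative function name:

    ```shell
    # [[:space:]] matches a space, a tab, and other whitespace characters.
    # Writing the class twice, the second time with *, means "one or more".
    whitespace_to_pipe() {
        sed 's/[[:space:]][[:space:]]*/|/g'
    }

    printf 'cat \t kill\t\trat\n' | whitespace_to_pipe   # prints cat|kill|rat
    ```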