Tags: regex, perl, iteration, newline, one-liner

Perl one-liner to split a line into multiple lines iteratively


I have a tricky problem and I'm wondering if there's a clever regex solution. I have input data that consists of two columns, but the first column needs to be split into multiple lines with the second column intact. For example, a file called test:

cat_;_dog_;_rat animal
chair_;_desk    object

The output needs to look like this:

cat animal
dog animal
rat animal
chair    object
desk    object

Each line can contain an arbitrary number of _;_ separators. There is probably a way to do this in a one-liner, which I'd prefer since I'm piping the data in and out. I tried this:

perl -pe 's/(\w+)_;_(\w+)\t(.+)/$1\t$3\n$2\t$3/g' test

The first column has words (\w+) delimited by _;_, then a tab, and then the second column. But this only splits off one word per line:

cat     animal
dog_;_rat       animal
chair   object
desk    object

I also tried the following, in case the /g global flag wasn't getting it right:

perl -pe 's/(\w+)(_;_(\w+))+\t(.+)/$1\t$4\n$3\t$4/g' test

It still only makes one pass. Does anyone have any ideas?


Solution

  • perl -lane 'print "$_ $F[1]" for split /_;_/, $F[0];'
    
    • -n reads the input line by line and runs the code for each line;
    • -l removes newlines from input and adds them to output;
    • -a splits each input line on whitespace into the @F array;
    • split splits the first column ($F[0]) on _;_, and for each resulting value, the code prints it ($_) followed by a space and the second column ($F[1]).