Tags: bash, shell, sed, posix, sh

Remove suffix of a delimited multiline string


While writing a script that handles filenames safely, including filenames that contain newlines, I came across a difficult test case.

Given the input

a.b.c
.d.staging

where this input represents a single filename, I want to strip the .staging suffix. I would normally use something akin to | rev | cut -d. -f2- | rev for this, but this fails:

echo -ne "a.b.c\\n.d.staging" | rev | cut -d. -f2- | rev

yields

a.b
.d

Besides the staging suffix, the c component has been lost as well, because rev and cut process each line separately. There's also a lone newline at the end that Markdown is hiding.

The best solution I've come up with so far is to use sed -e ':a' -e 'N' -e '$!ba' -e 's/\(.*\)\..*/\1/', which appears to work:

echo -ne "a.b.c\\n.d.staging" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\(.*\)\..*/\1/'

yields

a.b.c
.d

which is the correct output.
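
For readers unfamiliar with the idiom, the sed command above can be read step by step as in the sketch below. printf stands in for echo -ne here (my substitution, since echo -ne is not portable), and the variable name result is only illustrative.

```shell
# :a     define a label named "a"
# N      append the next input line (with its newline) to the pattern space
# $!ba   if this is not the last line, branch back to "a"
# s/...  with the whole input in the pattern space, the greedy \(.*\)
#        keeps everything up to the last dot and drops the rest
result=$(
  printf 'a.b.c\n.d.staging' |
  sed -e ':a' \
      -e 'N' \
      -e '$!ba' \
      -e 's/\(.*\)\..*/\1/'
)
printf '%s\n' "$result"
```

One caveat, as far as I can tell: on single-line input N has no next line to read, so sed stops before the substitution runs and a name without a newline would keep its suffix.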

This seems an inelegant solution, as it's hammering sed into handling newlines, which is something sed isn't great at doing.

Is there a more elegant solution? Ideally a POSIX-compatible one.


Solution

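  • If you have the name in a variable, the newline is not an issue: parameter expansion such as ${fname%.*} works on the whole string rather than line by line, and it is POSIX. Only the $'...' quoting used below to build the test value comes from bash.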

    $ fname=$'a.b.c\n.d.staging'
    $ echo "$fname"
    a.b.c
    .d.staging
    $ echo "${fname%.*}"
    a.b.c
    .d
    $
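
The same expansion can be written for plain sh. This is a minimal sketch, assuming the name is already held in a variable; nl, stripped and exact are illustrative names, and a literal newline is used to build the test value instead of $'...'.

```shell
#!/bin/sh
# A variable holding a single newline, written literally for portability.
nl='
'
fname="a.b.c${nl}.d.staging"

# Remove the shortest trailing match of ".*", i.e. the last dot-suffix.
stripped=${fname%.*}

# Remove only the exact ".staging" suffix; other names are left unchanged.
exact=${fname%.staging}

printf '%s\n' "$stripped"
```

Both stripped and exact end up as a.b.c, a newline, then .d. Assigning the expansion directly, rather than capturing the output of a command, avoids command substitution, which would strip any trailing newlines from the result.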