Matching opposite of [[:blank:]] character class in sed


I cannot find a way to represent the inverse of a character class in sed. In a Perl-like environment I would use [^\s], but in sed this appears to match anything that is not an 's' rather than anything that is not whitespace.

On a line of text (from gdrive) I need to capture the first non-whitespace token and ignore everything from the first whitespace onward.

Here's a fake but representative example of the input I am trying to parse:

19845fake-FaKeE-xbk534sWsbBQ              mydir                                    dir               2019-01-01 19:10:44

My original attempt at doing this was the line:

sed -rn 's/^([^\s]*).*$/\1/p'

At first it seemed to work, until I noticed that it cuts off at the first 's' rather than at the first whitespace.

I have since attempted various permutations like:

#matches up to the first 's'
 sed -rn 's/([^\\s]*).*$/\1/p'

#matches only the first character
 sed -rn 's/^([^[[:blank:]]]*).*$/\1/p'

#matches nothing at all
 sed -rn 's/^([[^:blank:]]*).*$/\1/p'



sed -rn 's/^\s*([^\s]*).*$/\1/p'

Expected: 19845fake-FaKeE-xbk534sWsbBQ

Actual: 19845fake-FaKeE-xbk534


Solution

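  • The character class is [:blank:], and it is only valid inside a bracket expression, so to match the opposite of it you just need [^[:blank:]]. Inside a bracket expression a backslash is taken literally, which is why [^\s] matches any character other than a backslash or 's'. This should work: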

    sed -rn 's/^([^[:blank:]]*).*$/\1/p'
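    As a quick check, here is a minimal sketch that runs the corrected command against the sample line from the question and, for comparison, the original attempt. The variable names (line, first_token, broken) are only illustrative, and -E is used as the equivalent of -r, assuming a sed that accepts it (GNU sed does):

```shell
# Sample line modeled on the fake gdrive output in the question.
line='19845fake-FaKeE-xbk534sWsbBQ              mydir                                    dir               2019-01-01 19:10:44'

# Negated POSIX class: [^[:blank:]] matches any character except space or tab,
# so the capture group stops at the first blank on the line.
first_token=$(printf '%s\n' "$line" | sed -En 's/^([^[:blank:]]*).*$/\1/p')
printf '%s\n' "$first_token"   # prints 19845fake-FaKeE-xbk534sWsbBQ

# The original attempt: inside brackets, \s is a literal backslash plus 's',
# so the capture stops at the first 's' (the question reports
# 19845fake-FaKeE-xbk534 as the result).
broken=$(printf '%s\n' "$line" | sed -En 's/^([^\s]*).*$/\1/p')
printf '%s\n' "$broken"
```

    If newlines and other whitespace should also end the token, [^[:space:]] can be swapped in the same way; [:blank:] covers only space and tab.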