Tags: regex, sed, terminal, windows-subsystem-for-linux, ksh

Regex to match a series of printable characters that may contain a space and ends in a /


Essentially, what I'm trying to do is homebrew an ls colorizer. ls --color=auto runs TERRIBLY slowly on my network drive. What I want to do is run ls --color=never --file-type -C, search for directories (flagged with a trailing /), and then wrap them in control characters to add colors.

Files/folders may have symbols, numbers, letters, and one or more spaces in the name, but never consecutive spaces. Delimiters between files/folders are some number of tabs and/or spaces. Filenames will never start with a space, but may start with a symbol, particularly an underscore.

What I have now kinda sorta works, but occasionally misses things that it should match and occasionally matches things that it should not.

/bin/ls --color=never --file-type -aFC |
sed -e 's/[\t ][[:print:]]*\//\e[34m&\e[37m/' |
sed -e 's/^[[:print:]]*\//\e[34m&\e[37m/'

In the below list:

./              02_Sidescan/   05_UHRS/          08_SVP/            SPL_Run/
../             03_SBP/        06_MAG/           09_Tide/           Tide_Run/
01_Navigation/  04_Multibeam/  07_DelayedHeave/  DelayedHeave_Run/

EVERYTHING should have been blue, but the 08_SVP, 09_Tide, SPL_Run, and Tide_Run were all missed. I have no clue why this is. Can anyone help? Also, I need the pattern to NOT match consecutive spaces.

This is on Ubuntu running in WSL using the latest release of ksh93 ( u+m 1.0.6 ).

Thanks!

Edited to add a text example:

Input text:

./              02_Sidescan/   05_UHRS/          08_SVP/            SPL_Run/
../             03_SBP/        06_MAG/           09_Tide/           Tide_Run/
01_Navigation/  04_Multibeam/  07_DelayedHeave/  DelayedHeave_Run/

sed command:

sed -e 's/^[[:print:]]*\//_&_/; s/[     ][[:print:]]*\//_&_/'

output from sed:

_./_    _       02_Sidescan/   05_UHRS/_                 08_SVP/            SPL_Run/
_../_   _       03_SBP/        06_MAG/_          09_Tide/           Tide_Run/
_01_Navigation/__       04_Multibeam/  07_DelayedHeave/  DelayedHeave_Run/_

Solution

    1. You probably don't need two cases: one expression with no ^ or leading-blank anchor covers both.
    2. You aren't using the g flag, so each expression changed only the first match on a line.
    /bin/ls --color=never --file-type -aFC | 
    sed $'s/[[:print:]]*\\//\e[34m&\e[37m/g'
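
To see the effect of the g flag in isolation, here is a minimal sketch that wraps matches in visible < and > markers instead of color codes. The one-row listing and the tab separators are assumptions made for illustration (the question's output suggests ls pads columns with tabs, which [[:print:]] does not match).

```shell
# Hypothetical one-row listing; the tab separators are an assumption.
tab=$(printf '\t')
row="./${tab}02_Sidescan/${tab}05_UHRS/"

# Without g, only the first match on the line is wrapped.
first_only=$(printf '%s\n' "$row" | sed 's/[[:print:]]*\//<&>/')

# With g, every match on the line is wrapped.
all_matches=$(printf '%s\n' "$row" | sed 's/[[:print:]]*\//<&>/g')

printf '%s\n' "$first_only" "$all_matches"
```

With this row, the first command should wrap only ./ while the second wraps all three names.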
    

    It is impossible to reliably distinguish a directory name that begins with a space from one that does not but follows the padding between columns.
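
The ambiguity can be made concrete with two hypothetical rows, built here with printf rather than taken from real ls output: a directory named " b" after a two-space gap, and a directory named "b" after a three-space gap. The names and gap widths are invented for illustration.

```shell
# Directory " b" (leading space) after a two-space column gap.
row_leading_space=$(printf 'a/  %s/' ' b')

# Directory "b" after a three-space column gap.
row_wider_gap=$(printf 'a/   %s/' 'b')

# Both rows are the same string, so no regex can tell them apart.
if [ "$row_leading_space" = "$row_wider_gap" ]; then
    echo 'identical text'
fi
```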

    If we assume directory names do not start with a space, an approach that will have fewer false positives is:

    /bin/ls --color=never --file-type -aFC | 
    sed -E $'s/([[:graph:]]+ ?)+\\//\e[34m&\e[37m/g'
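
As a rough check of the difference between the two patterns, the sketch below runs both on a hypothetical row containing a directory with a space in its name, a plain file, and another directory. It uses < and > as stand-ins for the color codes; the row contents are invented for illustration.

```shell
# Hypothetical row: directory with a space, a plain file, another directory.
row='01 Navigation/  notes.txt  SPL_Run/'

# [[:print:]] includes the space, so the match runs through the column gaps.
greedy=$(printf '%s\n' "$row" | sed 's/[[:print:]]*\//<&>/g')

# Runs of [[:graph:]] joined by at most one space cannot cross a two-space gap.
marked=$(printf '%s\n' "$row" | sed -E 's/([[:graph:]]+ ?)+\//<&>/g')

printf '%s\n' "$greedy" "$marked"
```

Here the [[:print:]] pattern should swallow the whole row as one match, plain file included, while the [[:graph:]] pattern marks only the two directory names.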