
How do I split a list of unquoted paths with embedded spaces into individual paths?


I'm not good at bash shell scripting in Ubuntu, so I need your help.

The problem is...

We use Perforce for SCM.

I'm trying to get the directories under the //Development/ branches.

Until now, everything worked fine.

But now I can't split branches by whitespace anymore, because of branches such as the following (note the embedded space):

//Development/graphic/release/Unity Provider

We need each branch on a separate line starting with //Development, but I always get the following result (note the unwanted line break):

//Development/graphic/release/Unity
Provider

How can I fix this?

Please help me. Thank you.

Below is a sample one-line string:

//Development/graphic/release/CM //Development/graphic/release/GManager //Development/graphic/release/Notification //Development/graphic/release/Core //Development/graphic/release/Provider //Development/graphic/release/WH //Development/graphic/release/Accessory //Development/graphic/release/Unity Provider //Development/graphic/release/tipManager

I want to get a result like the following (each branch name on its own line):

//Development/graphic/release/CM 
//Development/graphic/release/GManager 
//Development/graphic/release/Notification 
//Development/graphic/release/Core 
//Development/graphic/release/Provider 
//Development/graphic/release/WH 
//Development/graphic/release/Accessory 
//Development/graphic/release/Unity Provider 
//Development/graphic/release/tipManager

I also want to store the results in a list variable.

E.g., list[0] should contain //Development/graphic/release/CM.


Solution

  • I'm assuming that:

    • you want to split your input string into individual paths based on substrings starting with //[Development/], either at the start of the string or, if inside it, preceded by a single space,
    • regardless of whether the strings between //[Development/] instances contain spaces.

    str='//Development/graphic/release/CM //Development/graphic/release/GManager //Development/graphic/release/Notification //Development/graphic/release/Core //Development/graphic/release/Provider //Development/graphic/release/WH //Development/graphic/release/Accessory //Development/graphic/release/Unity Provider //Development/graphic/release/tipManager'
    
    echo "$str" | sed 's# \(//\)#\'$'\n''\1#g'
    

    The above should work with any POSIX-compatible sed implementation.

    To capture the output in a variable, use command substitution:

    result=$(echo "$str" | sed 's# \(//\)#\'$'\n''\1#g')
    

    If you then want to process the result line by line:

    while IFS= read -r path; do echo "$path"; done <<<"$result"
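
The question also asks for a list variable. A minimal sketch of one way to get there, assuming bash 4 or later (for the mapfile builtin) and using a shortened sample string; the names str, result, and list are only illustrative:

```bash
# Shortened sample input: two plain paths and one with an embedded space.
str='//Development/graphic/release/CM //Development/graphic/release/Unity Provider //Development/graphic/release/tipManager'

# Same sed command as above: start a new line before every " //".
result=$(echo "$str" | sed 's# \(//\)#\'$'\n''\1#g')

# Read each line into its own array element; -t strips the trailing newlines.
# mapfile (alias: readarray) is assumed to be available, i.e., bash 4+.
mapfile -t list <<<"$result"

printf '%s\n' "${list[0]}"    # first path
printf '%s\n' "${#list[@]}"   # number of paths
```

Always expand the elements double-quoted, e.g. "${list[1]}" or "${list[@]}", so that the embedded space in Unity Provider is preserved.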
    

    Explanation of the sed command:

    • # was chosen, arbitrarily, as the delimiter for sed's s (string substitution) command to make it easier to match / chars. (Customarily, / is used as the delimiter, which would require \-escaping every / in the regex and the replacement string.)
    •  \(//\) (note the leading space) matches // only when preceded by a space, i.e., inside the string.
    • \'$'\n'' splices an actual newline into the replacement string, using the shell's ANSI C quoting ($'\n'), preceded by the \ that sed requires before a newline in a replacement. This form is needed for compatibility with the BSD sed on OS X; with GNU sed, as found on Linux, just \n would do.
    • \1 inserts the 1st (and only) capture group from the regex, i.e., //
    • g ensures that matching is global, i.e., that all substrings that match the regex are replaced.
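
To illustrate the point about the delimiter, here is a small sketch that runs the same substitution twice on a made-up short string: once with # as the delimiter and once with the customary /, which forces every / to be escaped. The variable names and the sample paths are hypothetical.

```bash
# Made-up short input; the second path contains an embedded space.
short='//a/b //a/c d //a/e'

# '#' as the delimiter: the slashes in the regex need no escaping.
with_hash=$(echo "$short" | sed 's# \(//\)#\'$'\n''\1#g')

# '/' as the delimiter: each literal slash must be written as \/.
with_slash=$(echo "$short" | sed 's/ \(\/\/\)/\'$'\n''\1/g')

printf '%s\n' "$with_hash"
# //a/b
# //a/c d
# //a/e
```

Both variables end up holding the same three lines; the # form is simply easier to read.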

    Result:

    //Development/graphic/release/CM
    //Development/graphic/release/GManager
    //Development/graphic/release/Notification
    //Development/graphic/release/Core
    //Development/graphic/release/Provider
    //Development/graphic/release/WH
    //Development/graphic/release/Accessory
    //Development/graphic/release/Unity Provider
    //Development/graphic/release/tipManager