Regex to convert a string from one form to another


I have two strings of the following forms:

  1. ,ghjgc b: Int)
  2. ,ghjg, b: Int)

I want to remove the word before : unless that word is preceded by a comma (and a space), in which case nothing should be removed. The expected output for the strings above is:

  1. ,ghjgc : Int)
  2. ,ghjg, b: Int)

I wrote the regex [^,] [^:[:space:]]*: but it produces:

  1. ,ghjg : Int) //note that 'c' at the end of ghjg is also removed
  2. ,ghjg, b: Int) //this is as expected

This probably happens because the [^,] at the start of the regex consumes the character before the space, so that character becomes part of the match and is removed too. I need help fixing this.

I do much more processing on the string and thus my sed command looks like:

sed -e '
s/^.*func \{1,\}//
s/ *\->.*$//
s/:[^,)]\{1,\}/:/g
s/(?<!,) [^:[:space:]]+:/\1:/g
s/[, ]//g
' <<< "$string"

and I get an error at the 5th line (counting from 1).


Solution

  • You may use

    ((^|[^,]) +)[^:[:space:]]+:
    

    and replace with \1:.

    The point is to match either the start of the string or any char other than , followed by one or more spaces, capture that into group #1, and then restore it with the \1 backreference.

    SED demo:

    echo ",ghjgc b: Int)" | sed -E 's/((^|[^,]) +)[^:[:space:]]+:/\1:/g'
    

    or a BRE version (GNU sed, since \| is a GNU extension):

    echo ",ghjgc b: Int)" | sed 's/\(\(^\|[^,]\) \{1,\}\)[^:[:space:]]\{1,\}:/\1:/g'
    

    OSX sed note: BRE in the sed that ships with OSX does not support \|, so alternation is not available and a single regex cannot cover both the start of the string and a char other than ,. Use

    's/\([^,] \{1,\}\)[^:[:space:]]\{1,\}:/\1:/g'
    

    and then, if you also need matches of this kind at the start of the string, add:

    's/^\( *\)[^:[:space:]]\{1,\}:/\1:/'
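
Putting the portable expression back into the full script from the question could look like the sketch below. It is only a sketch under a few assumptions: the input is taken to be a Swift-style declaration, the sample signature and the function name selector_from_signature are made up for illustration, -> is written without the backslash, and printf with a pipe stands in for the here-string. The 5th line of the original script most likely fails because sed regexes have no lookbehind such as (?<!,), so no group is ever captured and \1 has nothing to refer to.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not from the original post):
# reduce a declaration such as "func foo(bar baz: Int, qux: String) -> Bool"
# to "foo(bar:qux:)" using only POSIX BRE constructs.
#
# Steps, in order:
#   1. drop everything up to and including "func "
#   2. drop the return type ("->" and what follows)
#   3. drop the type after each ":"
#   4. drop the word before ":" unless the preceding space follows a ","
#      (the non-comma char plus spaces is captured and restored with \1)
#   5. drop the remaining commas and spaces
selector_from_signature() {
    printf '%s\n' "$1" | sed -e '
        s/^.*func \{1,\}//
        s/ *->.*$//
        s/:[^,)]\{1,\}/:/g
        s/\([^,] \{1,\}\)[^:[:space:]]\{1,\}:/\1:/g
        s/[, ]//g
    '
}

# Assumed sample input; prints: foo(bar:qux:)
selector_from_signature 'func foo(bar baz: Int, qux: String) -> Bool'
```

Step 4 does not handle a word that sits at the very start of the string; if that case matters, the separate ^\( *\) expression shown above can be added as one more line.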