
RegEx to surround text that contains the string ";t@b;t@b;t@b;" with quotes


I'm trying to create a PCRE RegEx that surrounds strings containing ;t@b;t@b;t@b; with double quotes ("). The input consists of multiple lines of text, each containing one or more columns separated by tabs. The strings that have to be surrounded with quotes can be at the beginning of the line, at the end of the line, or in the middle of the line. A line can contain zero or more strings that have to be surrounded.
Example:

1   some text with spaces
1   some text with spaces   anotherValue
1   some text with spaces   anotherValue    3452.2
val_so  some text with spaces   anotherValue    3452.2
val_so space    some text with spaces   anotherValue    3452.2
some text with spaces   anotherValue    3   other text with spaces

1   some;t@b;t@b;t@b;text;t@b;t@b;t@b;with tabs
1   some;t@b;t@b;t@b;text;t@b;t@b;t@b;with tabs and spaces  anotherValue
1   some;t@b;t@b;t@b;text;t@b;t@b;t@b;with spaces   anotherValue    3452.2
val_so  some;t@b;t@b;t@b;text with spaces and;t@b;t@b;t@b; tabs anotherValue    3452.2
val_so space    some text with;t@b;t@b;t@b; spaces amd tabs anotherValue    3452.2
some text;t@b;t@b;t@b; with spaces  anotherValue    3   other text;t@b;t@b;t@b;with spaces;t@b;t@b;t@b;and tabs
;t@b;t@b;t@b; with spaces   anotherValue    3   other text;t@b;t@b;t@b;with spaces;t@b;t@b;t@b;and tabs
;t@b;t@b;t@b;;t@b;t@b;t@b; with spaces  anotherValue    3   other text;t@b;t@b;t@b;with spaces;t@b;t@b;t@b;and tabs

has to become

1   some text with spaces
1   some text with spaces   anotherValue
1   some text with spaces   anotherValue    3452.2
val_so  some text with spaces   anotherValue    3452.2
val_so space    some text with spaces   anotherValue    3452.2
some text with spaces   anotherValue    3   other text with spaces

1   "some;t@b;t@b;t@b;text;t@b;t@b;t@b;with tabs"
1   "some;t@b;t@b;t@b;text;t@b;t@b;t@b;with tabs and spaces"    anotherValue
1   "some;t@b;t@b;t@b;text;t@b;t@b;t@b;with spaces" anotherValue    3452.2
val_so  "some;t@b;t@b;t@b;text with spaces and;t@b;t@b;t@b; tabs"   anotherValue    3452.2
val_so space    "some text with;t@b;t@b;t@b; spaces amd tabs"   anotherValue    3452.2
"some text;t@b;t@b;t@b; with spaces"    anotherValue    3   "other text;t@b;t@b;t@b;with spaces;t@b;t@b;t@b;and tabs"
";t@b;t@b;t@b; with spaces" anotherValue    3   "other text;t@b;t@b;t@b;with spaces;t@b;t@b;t@b;and tabs"
";t@b;t@b;t@b;;t@b;t@b;t@b; with spaces"    anotherValue    3   "other text;t@b;t@b;t@b;with spaces;t@b;t@b;t@b;and tabs"

I tried the following PCRE RegEx with the g flag. Search RegEx: \b(([^\t]+);t@b;([^\t]+))\b Replace RegEx: "\2" However, it matches strings across multiple lines (the match doesn't stop at the end of the line), and I want every line to be matched separately.


Solution

  • Because the data is separated by tabs and a match must not cross a line break, add the newline characters \r and \n to the negated character class so that it excludes them along with the tab.

    You can omit the word boundaries. With \b in place the pattern cannot, for example, match the ; at the beginning and end of ;t@b;t@b;t@b; because there is no word boundary next to a ; at the edge of a field.

    You don't need the capturing groups either, because the replacement wraps the whole match in double quotes.

    [^\t\r\n]+;t@b;[^\t\r\n]+
    


    In the replacement, put the whole match between double quotes.

    "$0"
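
To try the pattern outside a regex tester, here is a minimal sketch in Python. It is an assumption that Python is acceptable here: the question asks about PCRE, and Python's re module is a different engine, but this particular pattern uses only syntax both share. The function name quote_marked_fields and the sample text are made up for illustration, and Python writes the whole match as \g<0> instead of $0.

```python
import re

# Same pattern as in the answer: a run of characters that are not tab, CR or LF,
# containing ;t@b; with at least one such character on each side.
FIELD_WITH_MARKER = re.compile(r'[^\t\r\n]+;t@b;[^\t\r\n]+')


def quote_marked_fields(text: str) -> str:
    """Wrap every tab-separated field that contains ;t@b; in double quotes.

    Hypothetical helper name, used only for this illustration.
    """
    # \g<0> is Python's notation for the whole match ($0 in the answer).
    return FIELD_WITH_MARKER.sub(r'"\g<0>"', text)


if __name__ == "__main__":
    # Made-up sample in the shape of the question's data (fields separated by tabs).
    sample = (
        "1\tsome text with spaces\tanotherValue\n"
        "1\tsome;t@b;t@b;t@b;text\tanotherValue\t3452.2\n"
        ";t@b;t@b;t@b; with spaces\t3\tother text;t@b;t@b;t@b;and tabs\n"
    )
    print(quote_marked_fields(sample))
```

Because the character class excludes \r and \n, no match can span two lines, so the substitution can run over the whole text at once and re.sub already replaces every occurrence without a separate g flag.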