I'm trying to remove tags that follow this pattern: the tag starts with
Size:
and, before the next comma (,), includes the -
character.
Example:
Size: XS-S-M-L-XL-2XL,
or
Size: XS-S-M,
etc. would be selected (including the trailing ,), but
Size_S,
would be ignored because it contains no -
I'm close with:
Size:(.*)-*(.?),
but the match still doesn't stop at the first , after the sizes.
Here is one line of tags:
Athletics, Fitted, Mesh, Feature_Moisture Wicking, Material_Polyester 100%, , Material_Polyester 100%, Material_Polyester Over 50%, School, Style_Short Sleeves, Size_2XL, Size_L, Size_M, Size_S, Size_XL, Size_XS, Size: XS-S-M-L-XL-2XL, Uniforms, Unisex, V-Neck, VisibleLogos, Youth
The goal is to remove all size 'range' tags from my cells and leave only the single-size tags.
Solution can be found here: regex101.com/r/VuTzba/1
In your pattern Size:(.*)-*(.?),
the greedy group (.*) first matches all the way to the end of the line.
After that, the hyphen -* (zero or more times) and the single character in the group (.?) are both optional, so the engine only backtracks as far as the last comma, as that is the only character that still has to be matched.
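To see the backtracking in practice, here is a minimal sketch in Python. Python's re module is an assumption on my part, since the question does not say which regex engine is used, and the variable names and the shortened tag line are only for illustration.

```python
import re

# Shortened version of the tag line from the question (illustrative only).
tags = "Size_XS, Size: XS-S-M-L-XL-2XL, Uniforms, Unisex, V-Neck, VisibleLogos, Youth"

# The original attempt: (.*) is greedy, -* and (.?) may both match nothing,
# so only the final comma in the pattern is mandatory.
m = re.search(r"Size:(.*)-*(.?),", tags)

# The match runs from "Size:" up to the LAST comma in the line,
# not the first comma after the sizes.
print(m.group(0))
# Size: XS-S-M-L-XL-2XL, Uniforms, Unisex, V-Neck, VisibleLogos,
```

Everything after the size range up to the last comma is swallowed, which is why the pattern appears not to stop at the comma.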
To get a more exact match, you could use a repeating pattern to match the sizes:
Size: (?:\d*X[SL]|L|M|S)(?:-(?:\d*X[LS]|L|M|S))*,
Explanation
Size:
Match Size: followed by a space
(?:
Non-capturing group
\d*X[SL]|L|M|S
Match one of the listed items in the alternation
)
Close group
(?:
Non-capturing group
-(?:\d*X[LS]|L|M|S)
Match a hyphen followed by any of the listed items
)*,
Close group, repeat it 0+ times, then match a comma

A broader pattern could use a character class that lists all the allowed characters:
Size: [XSML\d]+(?:-[XSML\d]+)*,
or match everything up to the first comma:
Size:[^,]+,
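As a rough comparison, the three patterns can be run side by side. This is only a sketch, assuming Python's re module (the question does not name an engine) and a shortened tag line. The dictionary keys and variable names are made up for the example.

```python
import re

# Shortened tag line based on the one in the question.
tags = "Size_S, Size_XL, Size_XS, Size: XS-S-M-L-XL-2XL, Uniforms, Unisex, V-Neck"

# The three patterns discussed above; the keys are hypothetical labels.
patterns = {
    "exact": r"Size: (?:\d*X[SL]|L|M|S)(?:-(?:\d*X[LS]|L|M|S))*,",
    "char_class": r"Size: [XSML\d]+(?:-[XSML\d]+)*,",
    "until_comma": r"Size:[^,]+,",
}

# Each pattern should find only the range tag, not Size_S or V-Neck.
for name, pattern in patterns.items():
    print(name, re.findall(pattern, tags))

# Remove the range tag together with any whitespace that follows it.
cleaned = re.sub(patterns["exact"] + r"\s*", "", tags)
print(cleaned)
# Size_S, Size_XL, Size_XS, Uniforms, Unisex, V-Neck
```

Note that the group after the first size is repeated 0+ times, so all three patterns would also match a tag without a hyphen, such as Size: M, if one were present. Changing the * quantifier to + in the first two patterns would require at least one hyphenated size.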
Edit
To also match tags such as
Size: 28W-30W-32W-34W-36W-38W-40W, Size: 28W-30W-32W-34W
you could extend the alternation by adding |\d+W
to it, and end the pattern by either matching a comma or asserting the end of the string with $
Size: (?:\d*X[SL]|L|M|S|\d+W)(?:-(?:\d*X[LS]|L|M|S|\d+W))*(?:,|$)
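A quick check of the extended pattern, again sketched in Python under the assumption that the re module behaves like the engine you are using. The sample line and variable names are hypothetical.

```python
import re

# Extended pattern: waist sizes such as 28W are allowed, and the tag may
# end with a comma or sit at the very end of the string.
pattern = re.compile(
    r"Size: (?:\d*X[SL]|L|M|S|\d+W)(?:-(?:\d*X[LS]|L|M|S|\d+W))*(?:,|$)"
)

# Hypothetical line where the last tag has no trailing comma.
line = "Youth, Size: 28W-30W-32W-34W-36W-38W-40W, Size: 28W-30W-32W-34W"

matches = pattern.findall(line)
print(matches)
# ['Size: 28W-30W-32W-34W-36W-38W-40W,', 'Size: 28W-30W-32W-34W']
```

The second tag is only found because of the (?:,|$) alternative. With a mandatory trailing comma it would be skipped.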