c# regex regular-language

Extract strings preceded by a number and a space using regex


I have the following string:

[1] weight | width | depth | 5.0 kg | 6.0 mm^3 | 10.12 cm^2

From that, I need to extract only the unit strings:

unit=kg
unit=mm^3
unit=cm^2

I tried the regex below:

(?<unit>[^ -+0-9\\.\|\[\}\]]+)

But it also returns the weight, width, and depth values. I also tried

(?<unit>[\D][^|]+)

but that did not work either. I think I need to extract the strings that are preceded by a number and a space.

Can you help me with this?


Solution

  • You could match the number and the following whitespace first, then use a character class to list the characters allowed in the unit:

    \b[0-9]+(?:\.[0-9]+)?[^\S\r\n]+(?<unit>[a-z^0-9]+)\b
    


    Or, more specifically, list the known units and allow an optional exponent such as ^3:

    \b[0-9]+(?:\.[0-9]+)?[^\S\r\n]+(?<unit>(?:kg|[mc]m)(?:\^[0-9]+)?)\b
    


    If the unit must be followed by either a pipe or the end of the string:

    \b[0-9]+(?:\.[0-9]+)?[^\S\r\n]+(?<unit>[^0-9\s]\S*)(?:[^\S\r\n]+\||$)
    

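As a minimal sketch of how the last pattern might be applied in C#, the snippet below reads the named group `unit` from each match. The class and method names (`UnitExtractor`, `Extract`) are illustrative assumptions rather than part of the original answer, and the input is assumed to be a single line like the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class UnitExtractor
{
    // A number with an optional decimal part, horizontal whitespace, then the unit:
    // one non-digit, non-whitespace character followed by any non-whitespace run.
    // The unit must be followed by whitespace and a pipe, or by the end of the string.
    private static readonly Regex UnitPattern = new Regex(
        @"\b[0-9]+(?:\.[0-9]+)?[^\S\r\n]+(?<unit>[^0-9\s]\S*)(?:[^\S\r\n]+\||$)");

    public static List<string> Extract(string input)
    {
        var units = new List<string>();
        foreach (Match match in UnitPattern.Matches(input))
        {
            // Read only the named group, not the whole match (which includes the number).
            units.Add(match.Groups["unit"].Value);
        }
        return units;
    }
}

public static class Program
{
    public static void Main()
    {
        const string input = "[1] weight | width | depth | 5.0 kg | 6.0 mm^3 | 10.12 cm^2";
        foreach (string unit in UnitExtractor.Extract(input))
        {
            Console.WriteLine($"unit={unit}");
        }
    }
}
```

With the sample string from the question, this should print `unit=kg`, `unit=mm^3` and `unit=cm^2`. The leading `[1]` is skipped because the digit is followed by `]` rather than whitespace, and `weight`, `width` and `depth` are skipped because no number precedes them.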