regex, regular-language

Regular expression for set of strings


I want to write three regular expressions:

  1. First is strings containing 2G
  2. Second is strings containing 3G
  3. Third is strings containing 4G

The set of strings is:

- 4 GB+ 2 GB Night 3G/2G Data    #matches the exp generated by 1 and 2
- 500 MB 4G/3G data              #matches the exp generated by 2 and 3
- 3GB 2G/3G/4G data              #matches all three
- 2GB 4G/3G/2G Data

The expression I made also captures '2GB', '3GB' and '4GB'. I want to exclude matches that are followed by 'B'. I am new to regular expressions. Please suggest a correct expression for the above.


Solution

  • To distinguish 3G from 3GB, you need to make certain the term ends right after the G. This can be done using a word boundary, which in regex is written \b.

    So \b[2-4]G\b makes sure there is no word character (letter, digit or underscore) directly before the 2, 3 or 4, and none directly after the G. For three separate expressions, use \b2G\b, \b3G\b and \b4G\b.

    Here's an illustration of what it matches, and some examples that it doesn't, at regex101.
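
As a rough check of the word-boundary approach, here is a minimal Python sketch using the standard re module and the four sample strings from the question. The names samples, patterns and generations are made up for this illustration; the question does not say which language or tool is being used.

```python
import re

# Sample strings copied from the question.
samples = [
    "4 GB+ 2 GB Night 3G/2G Data",
    "500 MB 4G/3G data",
    "3GB 2G/3G/4G data",
    "2GB 4G/3G/2G Data",
]

# One pattern per generation; \b on each side rejects "2GB", "3GB", "4GB".
patterns = {gen: re.compile(rf"\b{gen}\b") for gen in ("2G", "3G", "4G")}


def generations(text):
    """Return the generations (hypothetical helper) whose pattern is found in text."""
    return [gen for gen, pattern in patterns.items() if pattern.search(text)]


for sample in samples:
    print(sample, "->", generations(sample))

# The combined character-class form from the answer finds all three at once.
combined = re.findall(r"\b[2-4]G\b", "3GB 2G/3G/4G data")
```

In the third sample the leading "3GB" is skipped because there is no word boundary between the G and the B, so only the terms in "2G/3G/4G" are picked up.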