javascript regex string split capture

How can data be extracted from a string that may be split by an optional comma character?


I have strings like "Street number 1, Barcelona" or, if there is no street data, just "Barcelona". I'm trying to both express these patterns and capture the city data with a single regex.

I already came up with a pattern that matches any string that does not contain commas ...

^([^,]+)$

I also have a regex for capturing everything after the first comma ...

^.+?, (.*)$

Is it possible to merge both into a single regex and, if so, how?


Solution

  • just testing

    If the OP's use case is just testing, the following regex might help: /^[^,]*,*[^,]+$/. It reads as follows ...

    • /^ ... $/ between the start and the end of the string ...
    • [^,]* match zero or more characters that are not a comma ...
    • ,* followed by zero or more commas (so a run such as ",," is accepted as well) ...
    • [^,]+ followed by (or standing on its own as) at least one character that is not a comma.

    const regXOptionallySplittingComma = (/^[^,]*,*[^,]+$/);
    
    console.log(
      'test ... "" ?',
      regXOptionallySplittingComma.test("")
    );
    console.log(
      'test ... "," ?',
      regXOptionallySplittingComma.test(",")
    );
    console.log(
      'test ... " ," ?',
      regXOptionallySplittingComma.test(" ,")
    );
    console.log('\n');
    
    console.log(
      'test ... " , " ?',
      regXOptionallySplittingComma.test(" , ")
    );
    console.log(
      'test ... ",  " ?',
      regXOptionallySplittingComma.test(",  ")
    );
    console.log(
      'test ... "   " ?',
      regXOptionallySplittingComma.test("   ")
    );
    console.log('\n');
    
    console.log(
      'test ... "  Barcelona  " ?',
      regXOptionallySplittingComma.test("  Barcelona  ")
    );
    console.log(
      'test ... "Lloret de Mar" ?',
      regXOptionallySplittingComma.test("Lloret de Mar")
    );
    console.log('\n');
    
    console.log(
      'test ... ",  Barcelona  " ?',
      regXOptionallySplittingComma.test(",  Barcelona  ")
    );
    console.log(
      'test ... ",  Lloret de Mar  " ?',
      regXOptionallySplittingComma.test(",  Lloret de Mar  ")
    );
    console.log('\n');
    
    console.log(
      'test ... "Street number 1,  Barcelona" ?',
      regXOptionallySplittingComma.test("Street number 1,  Barcelona")
    );
    console.log(
      'test ... "Street number 1,  Lloret de Mar" ?',
      regXOptionallySplittingComma.test("Street number 1,  Lloret de Mar")
    );
    console.log('\n');
    
    console.log(
      'test ... "Street, number 1,  Barcelona" ?',
      regXOptionallySplittingComma.test("Street, number 1,  Barcelona")
    );
    console.log(
      'test ... "Street number 1,  Lloret de, Mar" ?',
      regXOptionallySplittingComma.test("Street number 1,  Lloret de, Mar")
    );
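
The `,*` in the pattern above accepts any run of consecutive commas. If at most one separating comma should pass the test, a stricter variant can be sketched as follows. The names `regXCommaRun`, `regXAtMostOneComma` and the sample list are illustrative assumptions and not part of the original answer; the only change to the pattern is `,?` in place of `,*`.

```javascript
// Pattern from the answer, repeated here for comparison.
const regXCommaRun = /^[^,]*,*[^,]+$/;
// Hypothetical stricter variant: `,?` allows at most one comma.
const regXAtMostOneComma = /^[^,]*,?[^,]+$/;

const samples = [
  'Barcelona',
  'Street number 1, Barcelona',
  'Street number 1,, Barcelona',
  'Street, number 1, Barcelona',
  ',',
  '',
];

// Log both test results side by side for every sample.
for (const sample of samples) {
  console.log(
    JSON.stringify(sample),
    '| ,* :', regXCommaRun.test(sample),
    '| ,? :', regXAtMostOneComma.test(sample)
  );
}
```

The two patterns only disagree on inputs with directly adjacent commas, such as "Street number 1,, Barcelona"; commas separated by other characters are rejected by both.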

  • extracting data

    If the OP's use case is extracting data, such as getting the right-hand side of a string that may be split by an optional comma, a possible solution does not necessarily need to be based on a regex ...

    function extractTrailingNonCommaChars(str) {
      const list = String(str).split(',');
    
      // - without checking for a split into at most two parts,
      //   always take the last (rightmost) part ...
      // return list[list.length - 1].trim();
    
      // - with checking for a split into at most two parts ...
      const count = list.length;
      return (count <= 2) && list[count - 1].trim() || '';
    }
    
    console.log(
      '"Street number 1,  Barcelona  " =>',
      `"${ extractTrailingNonCommaChars("Street number 1,  Barcelona  ") }"`
    );
    console.log(
      '",  Lloret de Mar  " =>',
      `"${ extractTrailingNonCommaChars(",  Lloret de Mar  ") }"`
    );
    console.log(
      '" Barcelona" =>',
      `"${ extractTrailingNonCommaChars(" Barcelona") }"`
    );
    console.log(
      '"Lloret, de Mar" =>',
      `"${ extractTrailingNonCommaChars("Lloret, de Mar") }"`
    );
    
    console.log(
      '",,," =>',
      `"${ extractTrailingNonCommaChars(",,,") }"`
    );
    console.log(
      '",  " =>',
      `"${ extractTrailingNonCommaChars(",  ") }"`
    );
    console.log(
      '" , " =>',
      `"${ extractTrailingNonCommaChars(" , ") }"`
    );
    console.log(
      '"   " =>',
      `"${ extractTrailingNonCommaChars("   ") }"`
    );
    console.log(
      '"," =>',
      `"${ extractTrailingNonCommaChars(",") }"`
    );
    console.log(
      '"" =>',
      `"${ extractTrailingNonCommaChars("") }"`
    );
    console.log(
      '"Street, number 1,  Barcelona  " =>',
      `"${ extractTrailingNonCommaChars("Street, number 1,  Barcelona  ") }"`
    );
    console.log(
      '",  Lloret de, Mar  " =>',
      `"${ extractTrailingNonCommaChars(",  Lloret de, Mar  ") }"`
    );
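
To come back to the original question of merging both patterns into one, a single capturing regex might look like the sketch below. It is an assumption on my part rather than the OP's final code: `regXTrailingCity` and `captureTrailingNonCommaChars` are hypothetical names, and the pattern deliberately mirrors the split-based function above (at most one comma, trimmed result, empty string if nothing usable is found) instead of the OP's "everything after the first comma" variant.

```javascript
// Hypothetical single-regex counterpart of the split-based approach.
// - (?:[^,]*,)?    optional prefix up to and including ONE comma
// - \s*            skip whitespace in front of the captured value
// - ([^,]*[^,\s])  capture non-comma chars, ending on a non-whitespace char
// - \s*$           allow trailing whitespace up to the end of the string
const regXTrailingCity = /^(?:[^,]*,)?\s*([^,]*[^,\s])\s*$/;

function captureTrailingNonCommaChars(str) {
  const match = regXTrailingCity.exec(String(str));
  // No match (e.g. more than one comma, or whitespace only) yields ''.
  return match ? match[1] : '';
}

console.log(
  '"Street number 1,  Barcelona  " =>',
  `"${ captureTrailingNonCommaChars("Street number 1,  Barcelona  ") }"`
);
console.log(
  '" Barcelona" =>',
  `"${ captureTrailingNonCommaChars(" Barcelona") }"`
);
console.log(
  '"Street, number 1,  Barcelona  " =>',
  `"${ captureTrailingNonCommaChars("Street, number 1,  Barcelona  ") }"`
);
```

Whether the regex or the `split` version is preferable is mostly a matter of taste; the regex keeps matching and capturing in one place, while the `split` version is arguably easier to read and to adapt.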