Tags: javascript, regex, split, jscript, wsh

Split string by multiple delimiters, keep them and ignore them in double quotes


I know this question has been asked before in numerous variations, but I can't seem to merge all the answers into a working solution.

Motivation

I have a JScript script running under WSH. However, this is essentially a plain JavaScript / regexp question.

I'm trying to parse a string. Requirements:

  1. Split it by multiple delimiters: -, =
  2. Ignore delimiters wrapped in double quotes
  3. Keep the delimiters in the result

Example

This is the string I've been working with. The double quotes are part of the string.

"C:\\Users\\u 1\\a-b\\f1.txt" -CONFIG="C:\\Users\\u 1\\c=d\\f2.xfg"-ARGS=/quite /v1

Expected results after split:

  1. "C:\\Users\\u 1\\a-b\\f1.txt"
  2. -
  3. CONFIG
  4. =
  5. "C:\\Users\\u 1\\c=d\\f2.xfg"
  6. -
  7. ARGS
  8. =
  9. /quite /v1

Failed attempt

var str = '"C:\\Users\\u 1\\a-b\\f1.txt" -CONFIG="C:\\Users\\u 1\\c=d\\f2.xfg"-ARGS=/quite /v1';
// Split on "-" only when an even number of quotes follows, i.e. outside quoted text.
var res = str.split(/-(?=(?:(?:[^"]*"){2})*[^"]*$)/);

Failed result:

  1. \"C:\\Users\\u 1\\a-b\\f1.txt
  2. CONFIG=\"C:\\Users\\u 1\\c=d\\f2.xfg\"
  3. ARGS=/quite /v1

Solution

  • It's a weird thing to need... but here goes...

    var str = '"C:\\Users\\u 1\\a-b\\f1.txt" -CONFIG="C:\\Users\\u 1\\c=d\\f2.xfg"-ARGS=/quite /v1';
    var res = [];
    str.replace(/".*?"|-|=|[^-="\s]+(?:\s[^-="]+)?/g, function(m) { res.push(m); });
    console.log(res);

    split is tricky to use here, because you have to describe what you don't want; finding all the matches is nicer. In JavaScript, replace with a callback is the easiest way to do that (the other being an exec loop). I'm picking out:

    • quoted strings
    • dashes
    • equals signs
    • strings of characters that do not fit the above

    The fourth pattern would be simpler if a lone " " were a valid token, but it has to work a bit harder to reject plain blank spaces: it must start with a non-space character, and only then may it continue past a space.
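
    For comparison, the exec loop mentioned above might look like the sketch below. It uses the same pattern; tokenize is a hypothetical helper name, not something from the original answer.

```javascript
// Sketch: same pattern as the replace version, collected with an exec loop.
// `tokenize` is a hypothetical helper name chosen for illustration.
function tokenize(str) {
    // A fresh regex per call, so lastIndex always starts at 0.
    var re = /".*?"|-|=|[^-="\s]+(?:\s[^-="]+)?/g;
    var res = [];
    var m;
    // With the g flag, each exec call resumes at re.lastIndex
    // and returns null once there are no more matches.
    while ((m = re.exec(str)) !== null) {
        res.push(m[0]);
    }
    return res;
}

var tokens = tokenize('"C:\\Users\\u 1\\a-b\\f1.txt" -CONFIG="C:\\Users\\u 1\\c=d\\f2.xfg"-ARGS=/quite /v1');
console.log(tokens);
```

    Every alternative in the pattern consumes at least one character, so the loop cannot get stuck on an empty match.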

    That said, you have only one example, so it might well break with other, untested input.
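
    Two inputs that behave differently from what you might expect, as a rough illustration of that caveat. The inputs are made up for this sketch, and tokens is a hypothetical wrapper around the replace call:

```javascript
// Sketch only: `tokens` is a hypothetical helper wrapping the replace call.
function tokens(str) {
    var res = [];
    str.replace(/".*?"|-|=|[^-="\s]+(?:\s[^-="]+)?/g, function (m) { res.push(m); });
    return res;
}

// An unquoted multi-word value followed by another option keeps the space before the dash.
var trailing = tokens('-A=x y -B=z');
// trailing is ['-', 'A', '=', 'x y ', '-', 'B', '=', 'z']

// A quote with no closing partner matches no alternative and is silently skipped.
var unbalanced = tokens('-A="unterminated');
// unbalanced is ['-', 'A', '=', 'unterminated']

console.log(trailing, unbalanced);
```

    Trimming each token, or rejecting input with an odd number of quotes before tokenizing, would be possible ways to handle these cases if they matter for your data.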

    EDIT: wording, and simplification of the regex

    EDIT2: for @ndn, a version that handles escaped double quotes (\") inside a quoted string:

    var str = '"C:\\Users\\u 1\\a-b\\f1.txt" -CONFIG="C:\\Users\\u\\"3\\"\\c=d\\f2.xfg"-ARGS=/quite /v1';
    var res = [];
    str.replace(/"(?:\\"|[^"])*"|-|=|[^-="\s]+(?:\s[^-="]+)?/g, function(m) { res.push(m); });
    console.log(res);