javascript, regex, string-parsing

Parse Values from String Using JavaScript


Trying to find the most efficient way to extract values from a large string.

EXT-X-DATERANGE:ID="PreRoll_Ident_Open",START-DATE="2016-12-14T120000.000z",DURATION=3,X-PlayHeadStart="0.000",X-AdID="AA-1QPN49M9H2112",X-TRANSACTION-VPRN-ID="1486060788",X-TrackingDefault="1",X-TrackingDefaultURI="http,//606ca.v.fwmrm.net/ad/l/1?s=g015&n=394953%3B394953&t=1485791181366184015&f=&r=394953&adid=15914070&reid=5469372&arid=0&auid=&cn=defaultImpression&et=i&_cc=15914070,5469372,,,1485791181,1&tpos=0&iw=&uxnw=394953&uxss=sg579054&uxct=4&metr=1031&init=1&vcid2=394953%3A466c5842-0cce-4a16-9f8b-a428e479b875&cr="s=0&iw=&uxnw=394953&uxss=sg579054&uxct=4&metr=1031&init=1&vcid2=394953%3A466c5842-0cce-4a16-9f8b-a428e479b875&cr="

I have the above as an example. The idea is to take the name before each = as an object key and everything between the quotes, up to the next comma, as its value, then iterate over the entire string until the object is built.

nonParsed.substring(nonParsed.lastIndexOf('="') + 2, nonParsed.lastIndexOf('",'));

I had this concept as a start, but some help iterating through this and making it more efficient would be appreciated.

Final output would be something like:

{
  'EXT-X-DATERANGE:ID': 'PreRoll_Ident_Open',
  'START-DATE': '2016-12-14T120000.000z',
  'DURATION': '3',
  ...
}

Solution

  • It looks like the only property that breaks the predictable pattern is DURATION, whose value is a bare number rather than a quoted string. Otherwise, you can rely on a naive pattern of alternating =" and ",.

    You could do something like:

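    function parseAttributes(str) {
        // Quote the bare DURATION number so every entry has the form KEY="value".
        str = str.replace(/DURATION=(\d+)/, 'DURATION="$1"');
        return str.split('",').reduce((acc, entry) => {
            // Split on the first =" only; values such as URLs may contain =" themselves.
            const sep = entry.indexOf('="');
            const key = entry.slice(0, sep);
            // Drop the closing quote left on the final entry.
            const value = entry.slice(sep + 2).replace(/"$/, '');
            acc[key] = value;
            return acc;
        }, {});
    }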
    

    Then add a bit of logic at the end to convert DURATION back to a number, if you need to.
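
As a rough sketch of that last step, the version below parses the string and then converts DURATION back to a number. The function name parseDateRange and the shortened sample string are made up for illustration, and the sketch assumes that no value contains the sequence ", and that DURATION is the only unquoted attribute.

```javascript
// Hypothetical helper; the name and structure are illustrative only.
function parseDateRange(str) {
    // Quote the bare DURATION number so every entry looks like KEY="value".
    const normalized = str.replace(/DURATION=(\d+)/, 'DURATION="$1"');
    const result = {};
    for (const entry of normalized.split('",')) {
        // Split on the first =" so values containing =" stay intact.
        const sep = entry.indexOf('="');
        if (sep === -1) continue; // skip anything that is not KEY="value"
        // The final entry still carries its closing quote, so strip it.
        result[entry.slice(0, sep)] = entry.slice(sep + 2).replace(/"$/, '');
    }
    // The extra logic: turn DURATION back into a number.
    if ('DURATION' in result) {
        result.DURATION = Number(result.DURATION);
    }
    return result;
}

// Shortened sample (assumption: first four attributes of the original string).
const sample = 'EXT-X-DATERANGE:ID="PreRoll_Ident_Open",START-DATE="2016-12-14T120000.000z",DURATION=3,X-PlayHeadStart="0.000"';
const parsed = parseDateRange(sample);
console.log(parsed);
```

With this sample, parsed.DURATION is the number 3, while the other values stay strings. If string values are what you want throughout, as in the question's expected output, the conversion step can simply be left out.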