javascript, function, google-tag-manager, ajax, form, google-datalayer

How do I capture the data within an object/array that is hidden as a textContent of a specific element?


<div id="ProductModal_10207555">
  <span class="ari-form">
    <span class="datasource hidden">  
      "{
         "itemid": "REF12345",
         "locationid": "54321",
         "leadValue": 15149,
         "productSku": "SKU12345",
         "itemYear": "2019",
         "itemMake": "Adidas",
         "itemModel": "Ultraboost",
         "itemPrice": "$139.00",
       }"
    </span>
  </span>
</div>

This is a hidden object from a form, and I would like to extract the data in it. When I try DOM scraping, I end up with a single string containing all the text inside that element:

"{"itemid": "REF12345","locationid": "54321","leadValue": 15149,"productSku": "SKU12345","itemYear": "2019","itemMake": "Adidas","itemModel": "Ultraboost","itemPrice": "$139.00",}"

What I would like to capture is "itemYear", "itemMake" and "itemModel", and ideally send the whole object to the dataLayer for Google Tag Manager, where I can read each of those variables and its value separately. Is there a way to do this in native JavaScript, since that is the language GTM supports?

There are other span class="datasource hidden" elements on the site, so I deliberately included the parent elements, since this is how we identify the right one.

Sorry if this may be a quick one. I am still learning. Thank you.


Solution

  • The text is JSON except for the " at the very beginning and end, and the , just before the }. Once you have the textContent of the element, you can strip the surrounding quotes and the trailing comma, then parse what is left:

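    // Read the raw text of the hidden span inside this specific modal
    const { textContent } = document.querySelector('#ProductModal_10207555 .datasource.hidden');
    // Keep everything from the first { up to the last comma, then close the object
    const json = textContent.match(/{[\s\S]+(?=,)/)[0] + '}';
    const obj = JSON.parse(json);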
    
    console.log(
      obj.itemYear,
      obj.itemMake,
      obj.itemModel
    );

    {[\s\S]+(?=,) means: match a {, then match as many characters as possible (newlines included), stopping just before the last , in the text.
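
To see what the pattern keeps and what it drops, here is a minimal sketch that runs it against a string literal instead of the DOM. The names extractJson and sample are made up for illustration, and the sample is a shortened stand-in for the element's real text.

```javascript
// Hypothetical helper: the same regex as above, applied to a plain string.
function extractJson(text) {
  // Greedy match from the first "{" up to (not including) the last ","
  var match = text.match(/{[\s\S]+(?=,)/);
  if (!match) {
    return null; // the text did not have the expected "{ ... ," shape
  }
  // Re-append the closing brace that the match stopped short of
  return JSON.parse(match[0] + '}');
}

// Shortened sample in the same shape as the element's textContent
var sample = '  "{\n  "itemYear": "2019",\n  "itemMake": "Adidas",\n }"\n';
var parsed = extractJson(sample);
console.log(parsed.itemYear, parsed.itemMake);
```

Note that this depends on the trailing comma being there. If the markup is ever changed to valid JSON with no trailing comma, the pattern would stop at the comma before the last property and silently drop that property.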

    It's quite an odd way of storing data. Plain JSON would make much more sense.

    Or, for the equivalent in ES5:

    var textContent = document.querySelector('#ProductModal_10207555 .datasource.hidden').textContent;
    var json = textContent.match(/{[\s\S]+(?=,)/)[0] + '}';
    var obj = JSON.parse(json);
    
    console.log(
      obj.itemYear,
      obj.itemMake,
      obj.itemModel
    );
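
The question also asks about getting the values into the dataLayer. A rough sketch of that step, assuming the standard window.dataLayer array that GTM reads from, might look like the following. The helper name pushItemToDataLayer and the event name 'productModalData' are assumptions for illustration, not anything GTM defines.

```javascript
// Hypothetical helper: parse the element text and push selected fields
// onto a dataLayer-style array. Returns false if the text cannot be matched.
function pushItemToDataLayer(text, dataLayer) {
  var match = text.match(/{[\s\S]+(?=,)/);
  if (!match) {
    return false; // text did not have the expected shape
  }
  var obj = JSON.parse(match[0] + '}');
  dataLayer.push({
    event: 'productModalData', // assumed event name, pick your own
    itemYear: obj.itemYear,
    itemMake: obj.itemMake,
    itemModel: obj.itemModel
  });
  return true;
}

// Browser usage; skipped when no DOM is available
if (typeof document !== 'undefined') {
  var el = document.querySelector('#ProductModal_10207555 .datasource.hidden');
  if (el) {
    window.dataLayer = window.dataLayer || [];
    pushItemToDataLayer(el.textContent, window.dataLayer);
  }
}
```

In GTM you could then create Data Layer Variables for itemYear, itemMake and itemModel, and fire tags from a Custom Event trigger that matches whatever event name you chose.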