<div id="ProductModal_10207555">
<span class="ari-form">
<span class="datasource hidden">
"{
"itemid": "REF12345",
"locationid": "54321",
"leadValue": 15149,
"productSku": "SKU12345",
"itemYear": "2019",
"itemMake": "Adidas",
"itemModel": "Ultraboost",
"itemPrice": "$139.00",
}"
</span>
</span>
</div>
This is a hidden object from a form, and I would like to extract the data in it. When I try DOM scraping, I end up with a single string value that contains all of the text in that element:
"{"itemid": "REF12345","locationid": "54321","leadValue": 15149,"productSku": "SKU12345","itemYear": "2019","itemMake": "Adidas","itemModel": "Ultraboost","itemPrice": "$139.00",}"
What I would like to capture are the "itemYear", "itemMake" and "itemModel" values from this, and ideally send the entire string into a data layer for Google Tag Manager, where I can capture each of those variables and its value separately. Is there a way to do this with native JavaScript, since that is the language GTM supports?
There are other span class="datasource hidden" elements on the site, so I purposely included the parent elements, since that is how we identify this one.
Sorry if this is a quick one. I am still learning. Thank you.
It's in JSON format except for the " characters at the very beginning and end, and the , just before the }, so once you get the textContent of the element, you can slice out the " characters and the last comma, and then parse it:
const { textContent } = document.querySelector('#ProductModal_10207555 .datasource.hidden');
const json = textContent.match(/{[\s\S]+(?=,)/)[0] + '}';
const obj = JSON.parse(json);
console.log(
obj.itemYear,
obj.itemMake,
obj.itemModel
);
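{[\s\S]+(?=,) means: match a {, then match as many characters as possible (including newlines), stopping just before the last , in the text. That leaves out the trailing comma, the closing } and the final ", which is why a '}' is appended before parsing.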
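To make the regex step concrete, here is a minimal sketch that runs it on a hard-coded string instead of the DOM. The sampleText value is an assumption, a shortened stand-in for what textContent returns for the span in the question.

```javascript
// Assumed sample: a shortened version of the span's textContent,
// including the stray wrapping quotes and the trailing comma.
const sampleText = '\n"{\n"itemid": "REF12345",\n"itemYear": "2019",\n"itemMake": "Adidas",\n"itemModel": "Ultraboost",\n"itemPrice": "$139.00",\n}"\n';

// Greedy match from the first "{" up to (but not including) the last ",".
const matched = sampleText.match(/{[\s\S]+(?=,)/)[0];

// The lookahead stopped before the trailing comma, so the closing
// brace has to be added back to get valid JSON.
const repaired = matched + '}';

const parsed = JSON.parse(repaired);
console.log(parsed.itemYear, parsed.itemMake, parsed.itemModel);
```

Note that this relies on the trailing comma being present. If the site ever emits the object without it, the match would stop at an earlier comma and drop the last property.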
It's quite an odd way of storing data. Plain JSON would make much more sense.
Or, for the equivalent in ES5:
var textContent = document.querySelector('#ProductModal_10207555 .datasource.hidden').textContent;
var json = textContent.match(/{[\s\S]+(?=,)/)[0] + '}';
var obj = JSON.parse(json);
console.log(
obj.itemYear,
obj.itemMake,
obj.itemModel
);
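As for the data layer part of the question, one possible sketch is to push the parsed values onto window.dataLayer, for example from a Custom HTML tag. The helper names parseDatasource and buildDataLayerEvent and the event name 'productModalData' are made up for illustration; pick whatever fits your container.

```javascript
// Hypothetical helper: same regex approach as above, but returns null
// instead of throwing when the text is missing or not parseable.
function parseDatasource(text) {
  var match = text.match(/{[\s\S]+(?=,)/);
  if (!match) { return null; }
  try {
    return JSON.parse(match[0] + '}');
  } catch (e) {
    return null;
  }
}

// Hypothetical helper: builds the object to push.
// 'productModalData' is an assumed event name, not something GTM defines.
function buildDataLayerEvent(obj) {
  return {
    event: 'productModalData',
    itemYear: obj.itemYear,
    itemMake: obj.itemMake,
    itemModel: obj.itemModel
  };
}

// Only touch the DOM when running in a browser.
if (typeof document !== 'undefined') {
  var el = document.querySelector('#ProductModal_10207555 .datasource.hidden');
  var data = el ? parseDatasource(el.textContent) : null;
  if (data) {
    window.dataLayer = window.dataLayer || [];
    window.dataLayer.push(buildDataLayerEvent(data));
  }
}
```

In GTM you could then create Data Layer Variables for itemYear, itemMake and itemModel, and fire tags from a Custom Event trigger that matches the event name you chose.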