Tags: xpath, scrapy, tags

Scrape data from a gtag function in a script tag using Scrapy


I am scraping a website whose script tag contains the following code:

<script type="text/javascript">
        window.dataLayer = window.dataLayer || [];
          function gtag(){dataLayer.push(arguments);}
          gtag('js', new Date());

          
          gtag('set', 'content_group1', 'World');
          gtag('set', 'content_group2', 'AFP');
          gtag('config', 'UA-40396753-1', {
            'custom_map': {"dimension6":"Id","dimension1":"Category","dimension3":"Author","dimension5":"PublishedDate"}
          });              
          gtag('event', 'custom', {"Id":"news\/1696246","Category":"World","Categories":"World","Author":"AFP-119","Authors":"AFP","PublishedDate":"2022-06-23 07:08:42"});
</script>

I need to scrape the value "PublishedDate":"2022-06-23 07:08:42". How can I do that using Scrapy? This is what I tried:

time = response.xpath('//script[@type="text/javascript"]/text()').re(r"gtag\('event', 'custom', ({.*})\);")
json_data = json.loads(time, strict=False)


print('dawn time::', json_data['PublishedDate'])

But I am not getting any result.


Solution

  • I solved this simply by:

    time = response.xpath('//meta[@property="article:published_time"]/@content')[0].extract()
    

    as the page had a meta tag containing the field I needed.
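
For pages that have no such meta tag, the script-tag approach from the question can probably still be made to work. One likely problem with the attempt is that `.re()` returns a list of strings, while `json.loads` expects a single string, so passing the list raises a `TypeError`. Below is a minimal sketch using only the standard library on the script text shown in the question. `SCRIPT_TEXT`, `EVENT_RE`, and `extract_event_payload` are names made up for this illustration, and the pattern assumes the payload sits on one line and contains no nested braces.

```python
import json
import re

# Sample script text copied from the question. In a Scrapy callback this text
# would come from the response instead (see the note after the block).
SCRIPT_TEXT = r"""
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('event', 'custom', {"Id":"news\/1696246","Category":"World","Categories":"World","Author":"AFP-119","Authors":"AFP","PublishedDate":"2022-06-23 07:08:42"});
"""

# Capture the JSON object passed as the third argument of
# gtag('event', 'custom', {...}); assumes a flat, single-line object.
EVENT_RE = re.compile(r"gtag\('event',\s*'custom',\s*(\{.*?\})\);")


def extract_event_payload(script_text):
    """Return the gtag custom-event payload as a dict, or None if not found."""
    match = EVENT_RE.search(script_text)
    if match is None:
        return None
    # match.group(1) is a single string, which is what json.loads expects.
    return json.loads(match.group(1))


payload = extract_event_payload(SCRIPT_TEXT)
print(payload["PublishedDate"])
```

Inside a spider, the same pattern could be passed to `response.xpath('//script[@type="text/javascript"]/text()').re_first(...)`, which returns the first captured group as a string (or `None`) rather than a list, and that string can then go to `json.loads`.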