Tags: javascript, web-scraping, cheerio

How to parse images posted by users in Google reviews?


I am working on a project that scrapes Google Maps reviews, but I got stuck when trying to parse the images posted by users. I tried the following, which only gives me the first image posted by each user:

    $('.gws-localreviews__google-review').each((i, el) => {
        images[i] = $(el)
            .find(".EDblX .JrO5Xe")
            .attr("style");
    });

I am scraping the reviews from this URL: https://www.google.com/async/reviewDialog?hl=en&async=feature_id:0x47e66e2964e34e2d:0x8ddca9ee380ef7e0,next_page_token:,sort_by:,start_index:,associated_topic:,_fmt:pc

Here is the output I get:

[
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipPgClEw3JwTLJOuf-DqC2xtZRodoavkpYVFBYqu=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipOs9TSNoyYmW1GL4SH9PlkAihvWsUbMTn-8O2Sj=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipMBRGdJb3zL1rME20osajG-bosdIV8U82VTYS1n=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipOBuGDXFDhJP69LNo6yI9cZWcjSVHpVfPBNoKyL=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipP7wBZt8Kilm8VF75T8amjMrZ7ZkOpmtb0nHChF=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipOixJabuSd4mSHnveU5JSQ1ZszHJ6Hn-pkeosiY=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipPyOXO1vnyTXVnlkPJNLlnoHYHEna36vYnrqwE=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipPReBboes7S7lNklRT21pwn096JUQVJbTX3VRRA=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipNPLvARJu1vDk03r_y4fp8f7aDDvzRX-7yJklW8=w100-h100-p-n-k-no)',
  'background-image:url(https://lh5.googleusercontent.com/p/AF1QipMW3jp20hjKwuvhogH9ZC8IeH8QhQTUESH_ycNX=w100-h100-p-n-k-no)'
 ]

What I want instead is one object containing all of the first user's images, then a second object containing all of the second user's images, and so on.

I want the result to look like this:

[
{
 "All image links posted by user 1"
},
{
 "All image links posted by user 2"
},
{
 "All image links posted by user 3"
},
{
 "All image links posted by user 4"
},
{
........
}
]

Solution

  • I found the answer. It was a bit tricky, but it can be solved like this:

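        const images = [];
        $(".gws-localreviews__google-review").each((i, el) => {
            images[i] = $(el)
                .find(".EDblX .JrO5Xe")
                .toArray()
                // read the style attribute of every matched element, not just the first
                .map((node) => $(node).attr("style"))
                // drop the leading "background-image:url(" (21 characters) and the closing ")"
                .map((style) => style.substring(21, style.lastIndexOf(")")));
        });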
    

    It returns a result like this:

        [
          [
            All images posted by user 1,
          ],
          [
            All images posted by user 2,
          ],
          .........
        ]
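
One fragile spot in the snippet above is the hard-coded `substring(21, ...)`, which only works while the style attribute starts with exactly `background-image:url(`. Below is a minimal sketch of a more tolerant extraction, assuming the style strings have already been collected per review (for example with cheerio, as in the answer). The helper names `extractBackgroundUrl` and `groupImageUrls` are hypothetical, and the `example.com` URLs are placeholders rather than real review images.

```javascript
// Hypothetical helper: pull the URL out of a CSS "url(...)" value.
// Tolerates optional whitespace and optional single or double quotes.
// Returns null when the input is not a string or contains no url(...).
function extractBackgroundUrl(style) {
  if (typeof style !== "string") {
    return null;
  }
  const match = style.match(/url\(\s*(['"]?)(.*?)\1\s*\)/i);
  return match ? match[2] : null;
}

// Hypothetical helper: turn one array of style strings per review
// into one array of image URLs per review, skipping unusable entries.
function groupImageUrls(stylesPerReview) {
  return stylesPerReview.map((styles) =>
    styles.map(extractBackgroundUrl).filter((url) => url !== null)
  );
}

// Placeholder input: three reviews with two, zero, and one usable image.
const stylesPerReview = [
  [
    "background-image:url(https://example.com/a.jpg)",
    "background-image: url('https://example.com/b.jpg')",
  ],
  [],
  [undefined, "background-image:url(https://example.com/c.jpg)"],
];

const grouped = groupImageUrls(stylesPerReview);
console.log(grouped);
```

A review without photos yields an empty array instead of being skipped, so the index of each inner array still matches the index of its review.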