python youtube beautifulsoup screen-scraping

Scraping a YouTube page returns an empty list


I've been trying to scrape links from this YouTube page, but the links variable comes up empty. Am I doing something wrong? I have attached the code with this question.


Solution

  • I think the problem lies with the way you are trying to find the links. When I curl the same URL as you:

      curl https://www.youtube.com/results\?search_query\=hello

    I don't get any a tags with those CSS classes on them. The markup returned seems to depend on the User-Agent header sent with the request.

    So there are a few options:

    1. Change how you look for the video link.

      This is what the video link markup looks like in my curl output:

      <a href="/watch?v=YQHsXMglC9A" class="yt-uix-tile-link yt-ui-ellipsis yt-ui-ellipsis-2 yt-uix-sessionlink spf-link " data-sessionlink="itct=CFcQ3DAYASITCLfbt4P439gCFQzYfgodkDYKVij0JFIFaGVsbG8" title="Adele - Hello" aria-describedby="description-id-484065" rel="spf-prefetch" dir="ltr">Adele - Hello</a>
      

      As you can see, those classes do not exist here.

      However, you can use a regular expression on the hrefs to find the ones that match the expected format:

      import re
      page.find_all("a", {"href": re.compile(r"/watch\?v=[A-Za-z0-9_-]+")})
      

      (You may have to tweak the regex; it isn't perfect.)

    2. Use the YouTube API

      Just guessing at what you are trying to do, I would say this is the preferred method. Specifically, look at the search API; they even have Python snippets.
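To make option 1 concrete, here is a minimal sketch of the href-matching idea. It is not the asker's code: it uses the standard library's HTMLParser instead of BeautifulSoup so it runs without extra installs, and the names WATCH_HREF, WatchLinkParser and extract_watch_links are made up for illustration. The sample markup is abridged from the curl output above plus one hypothetical non-video link, and the pattern assumes video IDs contain only letters, digits, "_" and "-".

```python
import re
from html.parser import HTMLParser

# "\?" matches a literal question mark (a bare "?" would make the "h" optional),
# and "+" consumes the whole ID instead of a single character.
# Assumption: video IDs contain only letters, digits, "_" and "-".
WATCH_HREF = re.compile(r"^/watch\?v=([A-Za-z0-9_-]+)")


class WatchLinkParser(HTMLParser):
    """Collects (video_id, title) pairs from <a> tags whose href is a watch link."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # An attribute without a value is reported as None, hence the "or".
        match = WATCH_HREF.search(attr_map.get("href") or "")
        if match:
            self.links.append((match.group(1), attr_map.get("title")))


def extract_watch_links(html):
    """Return every (video_id, title) pair found in the given HTML string."""
    parser = WatchLinkParser()
    parser.feed(html)
    return parser.links


# First anchor: abridged from the curl output above.
# Second anchor: a hypothetical non-video link that should be ignored.
sample_html = (
    '<a href="/watch?v=YQHsXMglC9A" class="yt-uix-tile-link" '
    'title="Adele - Hello">Adele - Hello</a>'
    '<a href="/results?search_query=hello&amp;page=2">Next</a>'
)

links = extract_watch_links(sample_html)
print(links)  # [('YQHsXMglC9A', 'Adele - Hello')]
```

With BeautifulSoup, the same compiled pattern can be passed as the href filter to find_all, as in the one-liner above. Matching on the href rather than on class names is the point of the sketch: the classes vary with the markup YouTube serves, while the /watch?v= link format is what actually identifies a video.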