python youtube beautifulsoup screen-scraping

Scraping a YouTube page returns an empty list

I've been trying to scrape links from this YouTube page, but the links variable comes up empty. Am I doing anything wrong?

Solution

I think the problem lies in the way you are trying to find the links. When I curl the same URL as you:

curl https://www.youtube.com/results\?search_query\=hello

I don't get any a tags with those CSS classes on them. The markup returned seems to depend on the User-Agent header sent with the request.
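One way to check the User-Agent theory is to send the request with an explicit header and compare the markup you get back. The following is only a sketch using the standard library; build_request is a hypothetical helper name and the User-Agent string is a placeholder, not a value known to produce any particular markup.

```python
from urllib.request import Request, urlopen


def build_request(url, user_agent):
    """Return a Request that sends the given User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


req = build_request(
    "https://www.youtube.com/results?search_query=hello",
    "Mozilla/5.0 (X11; Linux x86_64)",  # placeholder browser-like string (assumption)
)

# Uncomment to actually fetch the page and inspect what comes back:
# html = urlopen(req).read().decode("utf-8")
```

Fetching once with this header and once without it should show whether the classes you are searching for appear in either response.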

So there are a few options:

Change how you look for the video link.

This is what the video link markup looks like in my curl response:

<a href="/watch?v=YQHsXMglC9A" class="yt-uix-tile-link yt-ui-ellipsis yt-ui-ellipsis-2 yt-uix-sessionlink spf-link " data-sessionlink="itct=CFcQ3DAYASITCLfbt4P439gCFQzYfgodkDYKVij0JFIFaGVsbG8" title="Adele - Hello" aria-describedby="description-id-484065" rel="spf-prefetch" dir="ltr">Adele - Hello</a>

As you can see, the classes you are searching for do not exist here.

However, you can use a regular expression on the hrefs to find the ones that match the expected format:

import re
page.find_all("a", {"href": re.compile(r"/watch\?v=[A-Za-z0-9_-]+")})

(You may have to tweak the regex; it isn't perfect.)
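A quick way to check a candidate pattern is to run it against the href from the markup above before handing it to find_all. This sketch uses only the re module; the names BROKEN and FIXED are just for illustration. Two things to watch for: an unescaped ? makes the preceding h optional instead of matching a literal question mark, and the character class needs a + to consume the whole video ID.

```python
import re

# href copied from the sample markup above
href = "/watch?v=YQHsXMglC9A"

# Unescaped "?" means "optional h", so the literal "?" in the href never matches.
BROKEN = re.compile(r"/watch?v=[A-Za-z0-9_\-]")

# Escape the "?" and add "+" so the whole video ID is captured.
FIXED = re.compile(r"/watch\?v=([A-Za-z0-9_-]+)")

print(BROKEN.search(href))  # None

match = FIXED.search(href)
print(match.group(1))  # YQHsXMglC9A
```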

Use the YouTube API

I would say this is the preferred method, guessing at what you are trying to do. Look at the search API in particular; they even have Python snippets.
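As a rough sketch of what that could look like without the official client library, the snippet below builds a request URL for the YouTube Data API v3 search endpoint and pulls watch URLs out of a decoded JSON response. The helper names build_search_url and watch_urls are made up for illustration, YOUR_API_KEY is a placeholder, and the sample dictionary is hand-written to mimic the shape of a search response rather than real API output.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(query, api_key, max_results=10):
    """Build a search.list URL that asks for video results only."""
    params = {
        "part": "snippet",
        "q": query,
        "type": "video",
        "maxResults": max_results,
        "key": api_key,
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def watch_urls(response):
    """Extract watch URLs from a decoded search response dictionary."""
    return [
        "https://www.youtube.com/watch?v=" + item["id"]["videoId"]
        for item in response.get("items", [])
        if "videoId" in item.get("id", {})
    ]


url = build_search_url("hello", "YOUR_API_KEY")  # placeholder key

# Hand-written sample in the shape of a search response (not real output).
sample = {"items": [{"id": {"kind": "youtube#video", "videoId": "YQHsXMglC9A"}}]}
print(watch_urls(sample))
```

Fetching url with any HTTP client and passing the parsed JSON to watch_urls would replace the HTML parsing entirely, so changes to the page markup no longer matter.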