beautifulsoup, dot.js

How to crawl a website that uses JavaScript using beautifulsoup?


<div class="details">
                    <h2 class="title"><a href="{{=it.url}}">{{=it.title}}</a></h2>
                    <div class="author">
                        <span class="avatar">
                            <a href="{{=it.userProfileUrl}}"><img src="{{=it.userAvatarUrl}}" alt="{{=it.displayName}}" /></a>
                        </span>
                        <span class="name">By <a href="{{=it.userProfileUrl}}">{{=it.displayName}}</a></span>
                    </div>
                    <div class="meta-data">
                        <div class="fd-rating">
                            <div class="five-star">
                                <span class="fd-rating-percent" style="width:{{=it.percentRating}};"></span>
                            </div>
                            <span>({{=it.ratingCount}})</span>
                        </div>
                        <div class="cook-time"><i class='icon-fdc-clock'></i> {{=it.totalTime}}</div>
                    </div>

The above is part of the source of the site I'm trying to crawl. I would like to fetch the values in {{=it.url}}. I tried getting all the href values and searching for where the variable it.url is initialized, but every attempt returned an empty tuple. Is there a way I can fetch the URL value? Any advice would be a huge help.

Use this Link for complete code.


Solution

  • Solved the issue by using Selenium and PhantomJS. I used the following code to get the processed HTML:

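    import time
    from selenium import webdriver

    # url is the address of the page to crawl
    driver = webdriver.PhantomJS()
    driver.get(url)
    time.sleep(5)  # give the page's JavaScript time to fill in the templates
    result = driver.page_source  # HTML after the browser has rendered it
    driver.quit()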
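Once the rendered HTML is available (for example from driver.page_source), it can be handed to BeautifulSoup in the usual way. The following is a minimal sketch, not the original poster's code: the function name extract_title_links and the sample markup are hypothetical, and the selector assumes the rendered page keeps the h2 class="title" structure shown in the question. It also skips any href that still contains an unrendered {{...}} placeholder, which is what a plain HTTP fetch of the page would return.

```python
from bs4 import BeautifulSoup


def extract_title_links(html):
    """Return the href of every title link found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.select("h2.title a[href]"):
        href = anchor["href"]
        # Skip template placeholders such as {{=it.url}} that were never rendered.
        if "{{" in href:
            continue
        links.append(href)
    return links


# Hypothetical markup standing in for driver.page_source:
# one rendered entry and one entry whose template was left unrendered.
sample = """
<div class="details">
  <h2 class="title"><a href="/recipe/example-1">Example 1</a></h2>
  <h2 class="title"><a href="{{=it.url}}">{{=it.title}}</a></h2>
</div>
"""

print(extract_title_links(sample))
```

Filtering on "{{" is only a heuristic for spotting unrendered templates; if the function returns an empty list, the HTML was most likely fetched before the page's JavaScript had run.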