
Scraping Dynamic JavaScript with Qt5


I've run into a small problem. I'm trying to scrape an online auction site for a video game, and the site is built with JavaScript. Specifically, the data I'd like to scrape sits in an x-template script block. I can't get the actual data, only the template script in the page source.

Here's my code:

def render(source_url):

    import sys
    from PyQt5.QtWidgets import QApplication
    from PyQt5.QtCore import QUrl
    from PyQt5.QtWebEngineWidgets import QWebEngineView

    class Render(QWebEngineView):
        def __init__(self, url):
            self.html = None
            self.app = QApplication(sys.argv)
            QWebEngineView.__init__(self)
            self.loadFinished.connect(self._loadFinished)
            #self.setHtml(html)
            self.load(QUrl(url))
            self.app.exec_()

        def _loadFinished(self, result):
            # This is an async call, you need to wait for this
            # to be called before closing the app
            self.page().toHtml(self._callable)

        def _callable(self, data):
            self.html = data
            # Data has been stored, it's safe to quit the app
            self.app.quit()

    return Render(source_url).html

url = "https://www.pathofexile.com/trade/search/Bestiary/blkdmmofg"

with open("html_out.txt", "w", encoding="utf8") as f:
    f.write(str(render(url)))

When I manually look up the first item's currency-text on the site and then search for it in my output file, it isn't there, because that content is generated dynamically.

Here's how the start of the script looks in the html_out.txt file:

<script type="x-template" id="trade-exchange-item-template">

After that comes the data I'm looking for, in this form:

<span v-else class="currency-text">{{currencyText(priceInfo.currency)}}</span>

How can I make the site and its script load fully, and then get the HTML with the actual data?

Thanks in advance!


Solution

  • It seems I can't scrape it without an actual client. It worked fine with Selenium, though.
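
For anyone landing here later, a minimal sketch of what the Selenium approach might look like. This is not the exact code I used, and it rests on a few assumptions: Chrome and a matching driver are available, the rendered results use the same `currency-text` class that appears in the template, and 20 seconds is enough for the page to fill in. The function names `fetch_rendered_html` and `extract_currency_texts` are made up for this example. The idea is to wait until a real `span.currency-text` element exists in the DOM (the copy inside the x-template script block is plain text, not an element) and only then read the page source.

```python
from html.parser import HTMLParser


class _CurrencyTextParser(HTMLParser):
    """Collects the text of <span class="currency-text"> elements.

    HTMLParser treats <script> contents as raw text, so the unrendered
    x-template markup is not reported as tags and is skipped.
    """

    def __init__(self):
        super().__init__()
        self.texts = []
        self._buffer = None  # list of text chunks while inside a match

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "currency-text" in classes:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._buffer is not None:
            self.texts.append("".join(self._buffer).strip())
            self._buffer = None


def extract_currency_texts(html):
    """Return the text of every rendered currency-text span in `html`."""
    parser = _CurrencyTextParser()
    parser.feed(html)
    parser.close()
    return parser.texts


def fetch_rendered_html(url, timeout=20):
    """Load `url` in Chrome and return the HTML after the data has rendered."""
    # Imported here, like in the question's code, so the parsing helper
    # above works even where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are set up
    try:
        driver.get(url)
        # Assumed selector: rendered items carry the class seen in the template.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "span.currency-text"))
        )
        return driver.page_source
    finally:
        driver.quit()
```

With the URL from the question, `extract_currency_texts(fetch_rendered_html(url))` should then give the rendered currency labels instead of the `{{currencyText(...)}}` placeholder, provided the assumptions above hold for the site.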