ScrapySplash cannot find elements with ":" in classname


I'm using Scrapy with Splash to crawl a website built with a Java framework named IFaces. This framework uses values like "_id35:_id48" for element IDs and class names.

When I crawl the site with Splash and try to select an element with such a value, I get a DOM Exception 12, probably because of the ":" character in the value. I already tried escaping the value (i.e. "_id35\3a_id48" and "_id35\3a _id48"), but I still get the same error.

Is there any other way for me to select those elements (like XPath)?


Solution

  • You can use a CSS selector with splash:select inside a Lua script. For example, to select by class name:

    splash:select('.element')
    

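    or by id. An unescaped ":" in an ID selector is parsed as the start of a pseudo-class, which makes the selector invalid (DOM Exception 12 is a syntax error), so match the id as a quoted attribute value instead:

    splash:select('[id="_id35:_id48"]')

    A backslash-escaped colon is also valid CSS, but inside a Lua string literal the backslash itself has to be doubled ('#_id35\\:_id48'); a single backslash is likely consumed by the string escape rules before the selector reaches the browser.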
    

    Take a look at the CSS selector documentation; you should find a way to achieve what you want.

    See also this question for some examples of using JavaScript with Splash.
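
As a rough sketch of how this could look from the Scrapy side, the selector can be built in Python and handed to the Lua script through the request arguments, so it never has to be written inside a Lua string literal. The helper names (attr_selector, xpath_by_id, build_splash_args) and the LUA_SCRIPT constant are hypothetical and only illustrate the idea; the XPath helper addresses the original question, since XPath attribute values can contain ":" without any escaping.

```python
# Lua script for Splash's "execute" endpoint. The selector arrives through
# args, so it is never written inside a Lua string literal.
LUA_SCRIPT = """
function main(splash, args)
  assert(splash:go(args.url))
  assert(splash:wait(0.5))
  local element = splash:select(args.selector)
  return {html = splash:html(), found = element ~= nil}
end
"""


def attr_selector(attr, value):
    """Return a CSS attribute selector such as [id="_id35:_id48"]."""
    # Inside a double-quoted CSS string, only backslashes and double quotes
    # need escaping; a colon is an ordinary character there.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '[{}="{}"]'.format(attr, escaped)


def xpath_by_id(value):
    """Return an XPath expression matching any element with this exact id."""
    # XPath 1.0 string literals have no escape syntax, so this sketch simply
    # rejects values that contain a double quote.
    if '"' in value:
        raise ValueError("value must not contain a double quote")
    return '//*[@id="{}"]'.format(value)


def build_splash_args(element_id):
    """Build the args dict a SplashRequest could carry (hypothetical helper)."""
    return {
        "lua_source": LUA_SCRIPT,
        "selector": attr_selector("id", element_id),
    }


if __name__ == "__main__":
    print(attr_selector("id", "_id35:_id48"))
    print(xpath_by_id("_id35:_id48"))
```

In a spider, this would presumably be used as SplashRequest(url, self.parse, endpoint="execute", args=build_splash_args("_id35:_id48")), with scrapy-splash adding the url to the arguments. The XPath expression from xpath_by_id can then be applied to the rendered HTML that Splash returns.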