Error while trying to use a CSS selector with requests-HTML


I have the following code:

from requests_html import HTMLSession

url = 'https://finance.yahoo.com/quote/COMP'
s = HTMLSession()

r = s.get(url)

# Both of these give the same error:

name = r.html.find('h1.D(ib) Fz(18px)', first=True).text
name = r.html.find('h1.D(ib).Fz(18px)', first=True).text

print(name)

which results in the following error:

cssselect.parser.SelectorSyntaxError: Expected selector, got <DELIM '(' at 4>

Replacing the space in the full class name with a dot sometimes works (see the second version of name), but not here. I don't think I am misreading the documentation; it seems to me that the parentheses in these class names are the problem. I can get around this with BeautifulSoup, but I would really like to understand how to fix it within Requests-HTML.


Solution

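  • You need to escape the parentheses with backslashes. In a CSS selector, an unescaped ( is parsed as selector syntax rather than as part of the class name, which is what the SelectorSyntaxError is reporting: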

    from requests_html import HTMLSession
    url = 'https://finance.yahoo.com/quote/COMP'
    s = HTMLSession()
    
    r = s.get(url)
    
    # Raw string: keeps the backslashes literal and avoids Python's invalid-escape-sequence warning.
    name = r.html.find(r'h1.D\(ib\)', first=True).text
    
    print(name)
    

    Output:

    Compass, Inc. (COMP)
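
If several class names need escaping, doing it by hand gets error-prone. Below is a minimal sketch of a helper that builds the escaped selector string. The names escape_css_identifier and class_selector are hypothetical and are not part of requests-html or cssselect. It assumes that backslash-escaping every character other than a word character or hyphen is enough, which does not cover class names that start with a digit.

```python
import re


def escape_css_identifier(name: str) -> str:
    """Backslash-escape characters that cannot appear bare in a CSS identifier.

    Assumption: the name does not start with a digit (that case needs a
    hex escape, which this sketch does not handle).
    """
    # Keep word characters and hyphens as-is; prefix everything else with "\".
    return re.sub(r"([^\w-])", r"\\\1", name)


def class_selector(tag: str, *classes: str) -> str:
    """Build a selector such as tag.class1.class2 with each class escaped."""
    return tag + "".join("." + escape_css_identifier(c) for c in classes)


selector = class_selector("h1", "D(ib)", "Fz(18px)")
print(selector)  # h1.D\(ib\).Fz\(18px\)
```

The resulting string can then be passed to r.html.find(selector, first=True). Another option that should avoid escaping altogether is an attribute selector with a quoted value, for example h1[class~="D(ib)"], since parentheses inside a quoted string are not treated as selector syntax.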