Tags: python, python-3.x, web-scraping, beautifulsoup, strip

How do I strip off "Results for " in string "Results for 27th July 2019" using bs4?


I need to strip off the "Results for " text so that I can later convert the rest to a specific date format.

Problem is

When I run the code without .strip, I get:

'Results for 27th July 2019'

When I try to strip off the text, I get this error:

TypeError: a bytes-like object is required, not 'str'

python3:

date = res.parent.find("span", {"class": "standard-headline"}).text.encode('utf8').strip("Results for ")
TypeError: a bytes-like object is required, not 'str'

Is there a workaround? I've been looking into regex, but it doesn't seem to solve my problem when there is no separator present.

Best regards


Solution

  • After encode('utf-8') you get a bytes object, so its strip method expects a bytes argument as well (to be more exact, a set of byte values to remove rather than a literal prefix). You can use either

    text.encode('utf-8').decode().strip("Results for ")
    

    or

    text.encode('utf-8').strip(b"Results for ")
    

    Bear in mind that strip is not the best choice for removing a particular piece of text from the head of a string: it treats its argument as a set of characters and removes any of them from both ends, so it also strips R's, e's, s's, spaces and so on from the tail.
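
To illustrate the caveat about strip, here is a minimal sketch that removes the prefix explicitly and then parses the date, working on the str directly with no encode step. The helper names remove_prefix and parse_headline_date are hypothetical, and the sketch assumes the headline always has the form "Results for <day><suffix> <month name> <year>" with English month names.

```python
import re
from datetime import datetime

PREFIX = "Results for "


def remove_prefix(text, prefix):
    """Return text without a leading prefix; leave it unchanged if the prefix is absent."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


def parse_headline_date(headline):
    """Turn e.g. 'Results for 27th July 2019' into a datetime.date (assumed format)."""
    date_text = remove_prefix(headline.strip(), PREFIX)
    # Drop the ordinal suffix (st, nd, rd, th) that directly follows the day number.
    date_text = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", date_text)
    # %B matches the full month name in the current locale (English assumed here).
    return datetime.strptime(date_text, "%d %B %Y").date()


headline = "Results for 27th July 2019"
print(remove_prefix(headline, PREFIX))            # 27th July 2019
print(parse_headline_date(headline).isoformat())  # 2019-07-27

# strip() treats its argument as a set of characters, so it can eat real data:
print("Results for seats".strip("Results for "))  # a
```

On Python 3.9 and later, the built-in headline.removeprefix("Results for ") does the same job as the hypothetical remove_prefix helper.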