Tags: python, html, web-scraping, beautifulsoup, html-parsing

How to scrape texts in order / merge texts?


I'm trying to merge the text element in the rlg-item__paint class with the text element in the rlg-trade__itemshas class, like so:

url = "https://rocket-league.com/trade/465ec00f-2f5c-48e2-831e-2e294683ad56"
response = requests.get(f"{url}")
soup = BeautifulSoup(response.text, "html.parser")
for has in soup.findAll('div', attrs={'class': 'rlg-trade__itemshas'}):
    for div in soup.findAll('div', attrs={'class': 'rlg-item-links'}):
        div.extract()
    for color in soup.findAll('div', attrs={'class': 'rlg-item__paint'}):
        color.replaceWith('\n', color)
    items = (has.get_text(f"\n"' ', strip=True))
    print(items)

But it doesn't work. The output is:

Magma
Pink
Light Show
Cristiano
Anodized Pearl 

Pink is the text element from the rlg-item__paint class. I want to merge it like this:

Magma
Pink Light Show
Cristiano
Anodized Pearl

So I want the paint text merged into the line of text below it.


Solution

  • Note: In newer code, avoid the old findAll() syntax; use find_all() or select() instead. For more, take a minute to check the docs.
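
    As a small illustration of the note above, here is a sketch on made-up markup (the HTML string is an assumption, not the real page) showing that find_all() with class_ and select() with a CSS selector find the same element:

```python
from bs4 import BeautifulSoup

# Hypothetical two-element snippet; the real page markup may differ.
html = (
    '<div class="rlg-item__paint">Pink</div>'
    '<div class="rlg-item__name">Light Show</div>'
)
soup = BeautifulSoup(html, "html.parser")

# Modern method name, filtering on the class attribute.
via_find_all = [d.get_text() for d in soup.find_all("div", class_="rlg-item__paint")]

# Equivalent lookup written as a CSS selector.
via_select = [d.get_text() for d in soup.select("div.rlg-item__paint")]

print(via_find_all)
print(via_select)
```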


    If the pattern is always the same, you could select the elements more specifically, extract the text with .stripped_strings, and slice off the trailing <a> texts:

    for e in soup.select('.rlg-trade__itemshas .--hover'):
        print(' '.join(list(e.stripped_strings)[:-2]))
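
    To try the slicing idea without hitting the site, here is a minimal sketch on hypothetical markup. The HTML, the link texts "Details" and "Trades", and the helper name item_labels are all assumptions; the only thing taken from the page is that each item ends with two link strings.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed to mirror the trade page: an optional paint
# label, the item name, then two link texts inside .rlg-item-links.
SAMPLE_HTML = """
<div class="rlg-trade__itemshas">
  <div class="rlg-item --hover">
    <div class="rlg-item__name">Magma</div>
    <div class="rlg-item-links"><a href="#">Details</a><a href="#">Trades</a></div>
  </div>
  <div class="rlg-item --hover">
    <div class="rlg-item__paint">Pink</div>
    <div class="rlg-item__name">Light Show</div>
    <div class="rlg-item-links"><a href="#">Details</a><a href="#">Trades</a></div>
  </div>
</div>
"""


def item_labels(html):
    """Return one label per item, with paint and name joined by a space."""
    soup = BeautifulSoup(html, "html.parser")
    labels = []
    for e in soup.select(".rlg-trade__itemshas .--hover"):
        parts = list(e.stripped_strings)  # every non-empty string, in document order
        labels.append(" ".join(parts[:-2]))  # drop the two trailing link texts
    return labels


print(item_labels(SAMPLE_HTML))
```

    Note that the [:-2] slice only holds while every item ends with exactly two link strings; if that count varies, removing the links element is the safer route.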
    

    Or you could use .decompose() to remove the links from the tree before reading the text:

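    for e in soup.select('.rlg-trade__itemshas .--hover'):
        links = e.select_one('.rlg-item-links')
        if links is not None:  # skip items that have no links block
            links.decompose()
        # join the remaining strings with a space so paint and name stay separated
        print(e.get_text(' ', strip=True))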
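
    A sketch of the .decompose() route on made-up markup (the HTML and the helper name labels_without_links are assumptions). It passes ' ' as the separator to get_text(), because with no separator the paint and name strings are joined with nothing between them, and it guards against an item that has no links block:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the last item deliberately has no .rlg-item-links block.
SAMPLE_HTML = """
<div class="rlg-trade__itemshas">
  <div class="rlg-item --hover">
    <div class="rlg-item__paint">Pink</div>
    <div class="rlg-item__name">Light Show</div>
    <div class="rlg-item-links"><a href="#">Details</a><a href="#">Trades</a></div>
  </div>
  <div class="rlg-item --hover">
    <div class="rlg-item__name">Cristiano</div>
  </div>
</div>
"""


def labels_without_links(html):
    """Remove each item's links block, then read the remaining text."""
    soup = BeautifulSoup(html, "html.parser")
    labels = []
    for e in soup.select(".rlg-trade__itemshas .--hover"):
        links = e.select_one(".rlg-item-links")
        if links is not None:  # select_one returns None when nothing matches
            links.decompose()  # delete the links block from the tree
        labels.append(e.get_text(" ", strip=True))
    return labels


print(labels_without_links(SAMPLE_HTML))
```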
    

    Example

    from bs4 import BeautifulSoup
    import requests
    
    url = "https://rocket-league.com/trade/465ec00f-2f5c-48e2-831e-2e294683ad56"
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    
    for e in soup.select('.rlg-trade__itemshas .--hover'):
        print(' '.join(list(e.stripped_strings)[:-2]))
    

    Output

    Magma
    Pink Light Show
    Cristiano
    Anodized Pearl