python string beautifulsoup data-cleaning

Generate a dataframe from a string


Inspired by this solution, I have been using the following code to clean up some data that I obtain using Beautiful Soup:

nfl = soup.findAll('li', "player")
lines = ("{}. {}\n".format(ind,span.get_text(strip=True).rstrip("+"))
         for ind, span in enumerate(nfl,1))
print("".join(lines))

The problem is that the output is a single string, and I would like to store each of its lines as a separate row in a dataframe. I tried putting the code in a loop, but that did not work: the best I could manage was to store the same string n times in my dataframe. Could you help me out?


Solution

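  • Try building a list with one dict per player and passing it to pd.DataFrame, so that each player becomes its own row: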

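    import pandas as pd

    # Each <li class="player"> element holds one player.
    nfl = soup.find_all("li", "player")

    # One dict per player, so each one becomes its own row.
    all_data = []
    for span in nfl:
        all_data.append({"player": span.get_text(strip=True).rstrip("+")})

    df = pd.DataFrame(all_data)
    print(df)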
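
If all you have left is the already-joined string (the output of the question's print), you could also split it back into rows. This is only a minimal sketch: the `text` value and the player names are made-up placeholders, the `rank` column name is my own choice, and it assumes every line looks like "N. name".

```python
import pandas as pd

# Hypothetical sample of the joined string; the names are placeholders.
text = "1. Player A\n2. Player B\n3. Player C\n"

rows = []
for line in text.splitlines():
    # Split "1. Player A" at the first ". " only, in case a name contains ". ".
    rank, name = line.split(". ", 1)
    rows.append({"rank": int(rank), "player": name})

df = pd.DataFrame(rows)
print(df)
```

Building the rows straight from the Beautiful Soup elements, as above, is usually simpler, since it avoids formatting the text and then parsing it again.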