Python requests / urllib / selenium not parsing entire webpage HTML


I've been trying to parse the HTML from https://www.teamrankings.com/nba/team/oklahoma-city-thunder but can't get the full page to parse. I've tried requests, urllib, and selenium with BeautifulSoup, and none of them parse the full HTML. The closest I got was with urllib (code below). I've tried many different user agents and every parser I could find.

If I print webpage before passing it to BeautifulSoup, I can see all the content. Once I use BeautifulSoup, most of it is cut out. I've tried html.parser, lxml, and html5lib.

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

url = 'https://www.teamrankings.com/nba/team/oklahoma-city-thunder'

req = Request(url, headers={'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 5_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B179 Safari/7534.48.3'})

webpage = urlopen(req).read()

print(webpage)

basketball = BeautifulSoup(webpage, 'html.parser')

print(basketball)

Thanks in advance!


Solution

  • I'm not sure what you mean by not getting all the content. Have you tried just using pandas? pandas.read_html parses <table> tags for you (it uses lxml or BeautifulSoup under the hood), and it returns the full table for me.

    EDIT

    In the future, please be more specific in your question; it wasn't until your comments that you explained what was missing. The content is all there, you just need to iterate through it.

    import requests
    import pandas as pd
    from bs4 import BeautifulSoup
    
    url = 'https://www.teamrankings.com/nba/team/oklahoma-city-thunder'
    response = requests.get(url)
    
    soup = BeautifulSoup(response.text, 'html.parser')
    table = soup.find_all('table')[1]  # second <table> on the page
    
    # Header cells become the column names
    cols = [each.text for each in table.find_all('th')]
    
    # One list of cell texts per <tr>; the header row has no <td>, so skip it
    data = []
    for row in table.find_all('tr'):
        cells = [each.text for each in row.find_all('td')]
        if cells:
            data.append(cells)
    
    # DataFrame.append was removed in pandas 2.0, so build the frame in one go;
    # the index starts at 1 to match the output below
    df = pd.DataFrame(data, columns=cols, index=range(1, len(data) + 1))
    

    Output:

        print (df)
         Date      Opponent     Result Location    W/L  Div Spread     Total Money
    1   10/23          Utah   L 95-100     Away    0-1  0-1   +9.0  Un 221.0  +339
    2   10/25    Washington    L 85-97     Home    0-2  0-1   -8.5  Un 218.5  -399
    3   10/27  Golden State   W 120-92     Home    1-2  0-1   -1.0  Un 223.5  -117
    4   10/28       Houston  L 112-116     Away    1-3  0-1  +10.0  Ov 227.5  +433
    5   10/30      Portland   L 99-102     Home    1-4  0-2   +1.5  Un 221.5  +104
    6   11/02   New Orleans  W 115-104     Home    2-4  0-2   -2.0  Un 228.5  -124
    7   11/05       Orlando   W 102-94     Home    3-4  0-2   -3.0  Un 201.5  -142
    8   11/07   San Antonio  L 112-121     Away    3-5  0-2   +5.0  Ov 211.5  +172
    9   11/09  Golden State  W 114-108     Home    4-5  0-2  -12.5  Ov 216.5  -770
    10  11/10     Milwaukee  L 119-121     Home    4-6  0-2   +8.5  Ov 220.0  +329
    11  11/12       Indiana   L 85-111     Away    4-7  0-2   +1.0  Un 213.0  -101
    12  11/15  Philadelphia  W 127-119     Home    5-7  0-2   +3.5  Ov 214.0  +148
    13  11/18   LA Clippers    L 88-90     Away    5-8  0-2   +7.5  Un 222.0  +297
    14  11/19     LA Lakers  L 107-112     Away    5-9  0-2  +11.0  Ov 209.5  +469
    15  11/22     LA Lakers  L 127-130     Home   5-10  0-2   +4.5  Ov 209.5  +186
    16  11/25  Golden State   W 100-97     Away   6-10  0-2   -7.5  Un 213.5  -297
    17  11/27      Portland  L 119-136     Away   6-11  0-3   +3.0  Ov 219.0  +137
    18  11/29   New Orleans  W 109-104     Home   7-11  0-3   -4.5  Un 229.0  -195
    19  12/01   New Orleans  W 107-104     Away   8-11  0-3   +2.5  Un 226.5  +124
    20  12/04       Indiana  L 100-107     Home   8-12  0-3   +1.5  Un 208.5  +102
    21  12/06     Minnesota  W 139-127     Home   9-12  1-3   -3.5  Ov 218.0  -160
    22  12/08      Portland   W 108-96     Away  10-12  2-3   +3.5  Un 223.0  +154
    23  12/09          Utah   W 104-90     Away  11-12  3-3   +8.5  Un 206.5  +311
    24  12/11    Sacramento    L 93-94     Away  11-13  3-3   +1.5  Un 207.5  +117
    25  12/14        Denver  L 102-110     Away  11-14  3-4   +5.5  Ov 204.0  +211
    26  12/16       Chicago  W 109-106     Home  12-14  3-4   -5.0  Ov 208.5  -211
    27  12/18       Memphis  W 126-122     Home  13-14  3-4   -6.5  Ov 219.5  -254
    28  12/20       Phoenix  W 126-108     Home  14-14  3-4   -3.0  Ov 224.5  -147
    29  12/22   LA Clippers  W 118-112     Home  15-14  3-4   -1.0  Ov 223.5  -111
    30  12/26       Memphis   L 97-110     Home  15-15  3-4   -5.5  Un 224.0  -242
    ..    ...           ...        ...      ...    ...  ...    ...       ...   ...
    53  02/09        Boston    3:30 pm     Home                 --        --    --
    54  02/11   San Antonio    8:00 pm     Home                 --        --    --
    55  02/13   New Orleans    8:00 pm     Away                 --        --    --
    56  02/21        Denver    8:00 pm     Home                 --        --    --
    57  02/23   San Antonio    7:00 pm     Home                 --        --    --
    58  02/25       Chicago    8:00 pm     Away                 --        --    --
    59  02/27    Sacramento    8:00 pm     Home                 --        --    --
    60  02/28     Milwaukee    8:00 pm     Away                 --        --    --
    61  03/03   LA Clippers    8:00 pm     Home                 --        --    --
    62  03/04       Detroit    7:00 pm     Away                 --        --    --
    63  03/06      New York    7:30 pm     Away                 --        --    --
    64  03/08        Boston    6:00 pm     Away                 --        --    --
    65  03/11          Utah    8:00 pm     Home                 --        --    --
    66  03/13     Minnesota    8:00 pm     Home                 --        --    --
    67  03/15    Washington    6:00 pm     Away                 --        --    --
    68  03/17       Memphis    8:00 pm     Away                 --        --    --
    69  03/18       Atlanta    7:30 pm     Away                 --        --    --
    70  03/20        Denver    8:00 pm     Home                 --        --    --
    71  03/23         Miami    7:30 pm     Away                 --        --    --
    72  03/26     Charlotte    8:00 pm     Home                 --        --    --
    73  03/28  Golden State    8:30 pm     Away                 --        --    --
    74  03/30        Denver    9:00 pm     Away                 --        --    --
    75  04/01       Phoenix    8:00 pm     Home                 --        --    --
    76  04/04   LA Clippers    3:30 pm     Away                 --        --    --
    77  04/05     LA Lakers    9:30 pm     Away                 --        --    --
    78  04/07      Brooklyn    8:00 pm     Home                 --        --    --
    79  04/10      New York    8:00 pm     Home                 --        --    --
    80  04/11       Memphis    8:00 pm     Away                 --        --    --
    81  04/13          Utah    8:00 pm     Home                 --        --    --
    82  04/15        Dallas    7:30 pm     Away                 --        --    --
    
    [82 rows x 9 columns]
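
The answer mentions letting pandas parse the <table> tags directly, but no snippet for that survives above. The following is a minimal sketch of that idea, not the answerer's original code. It assumes lxml (or BeautifulSoup plus html5lib) is installed, since pandas.read_html needs one of them. To stay runnable offline it parses a small made-up SAMPLE_HTML string whose first table is a placeholder and whose second table reuses two rows from the output above; read_all_tables is a hypothetical helper name. For the live page you would pass response.text instead.

```python
from io import StringIO

import pandas as pd

# Made-up stand-in for the downloaded page (assumption for illustration).
# On the real site, use response.text from requests.get(url) instead.
SAMPLE_HTML = """
<table>
  <tr><th>Stat</th><th>Value</th></tr>
  <tr><td>placeholder</td><td>0</td></tr>
</table>
<table>
  <tr><th>Date</th><th>Opponent</th><th>Result</th></tr>
  <tr><td>10/23</td><td>Utah</td><td>L 95-100</td></tr>
  <tr><td>10/25</td><td>Washington</td><td>L 85-97</td></tr>
</table>
"""


def read_all_tables(html):
    """Return a list with one DataFrame per <table> found in the HTML text."""
    # Wrapping the text in StringIO avoids the deprecation warning that
    # newer pandas versions emit for literal HTML strings.
    return pd.read_html(StringIO(html))


tables = read_all_tables(SAMPLE_HTML)

# Iterate over every table to see what the page contains
for i, table in enumerate(tables):
    print(f"table {i}: {table.shape[0]} rows x {table.shape[1]} columns")

schedule = tables[1]  # same index as find_all('table')[1] in the code above
print(schedule)
```

Because read_html returns every table on the page as a list, looping over that list is a quick way to check whether the content you expected is really missing or just sits in a different table.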