r, web-scraping, rvest

Webscraping Covers.com


When I inspect the webpage, I cannot find where the data is stored. In the code below, the selector passed to html_node() is not the right one:

    library(rvest)

    url <- paste0("https://www.covers.com/sport/basketball/nba/teams/main/boston-celtics/2022-2023")

    covers <- url %>% read_html %>% html_node('#div_schedule') %>% html_table()

I have also tried each of the following selectors: #div_tab_content, #div_past_results, #div_tp_schedule.

None of these work, and I can't see any other IDs. When I run the code, I get the following error:

    Error in UseMethod("html_table") : no applicable method for
    'html_table' applied to an object of class "xml_missing"

Solution

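  • html_node() returns an "xml_missing" object when the selector matches nothing in the HTML that read_html() downloaded, which is why html_table() fails. Instead, call html_table() on the whole parsed document: it returns a list with one tibble for every table it finds.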

    library(rvest)
    
    url <-
      paste0(
        "https://www.covers.com/sport/basketball/nba/teams/main/boston-celtics/2022-2023"
      )
    
    covers_tables <-
      url %>%
      read_html() %>%
      html_table()
    
    covers_tables
    #> [[1]]
    #> # A tibble: 21 × 5
    #>    Playoffs Playoffs Playoffs  Playoffs Playoffs
    #>    <chr>    <chr>    <chr>     <chr>    <chr>   
    #>  1 Date     VS       Score     ATS      O/U     
    #>  2 May 29   MIA      L 84-103  L -7.5   U 204   
    #>  3 May 27   @ MIA    W 104-103 L -2     U 209   
    #>  4 May 25   MIA      W 110-97  W -8.5   U 214.5 
    #>  5 May 23   @ MIA    W 116-99  W 1      U 216   
    #>  6 May 21   @ MIA    L 102-128 L -4.5   O 214   
    #>  7 May 19   MIA      L 105-111 L -10    O 214.5 
    #>  8 May 17   MIA      L 116-123 L -8.5   O 212   
    #>  9 May 14   PHI      W 112-88  W -6     U 202.5 
    #> 10 May 11   @ PHI    W 95-86   W -2.5   U 211   
    #> # ℹ 11 more rows
    #> 
    #> [[2]]
    #> # A tibble: 83 × 5
    #>    `Regular Season` `Regular Season` `Regular Season` `Regular Season`
    #>    <chr>            <chr>            <chr>            <chr>           
    #>  1 Date             VS               Score            ATS             
    #>  2 Apr 9            ATL              W 120-114        W -4.5          
    #>  3 Apr 7            TOR              W 121-102        W 1.5           
    #>  4 Apr 5            TOR              W 97-93          W 2             
    #>  5 Apr 4            @ PHI            L 101-103        W 3.5           
    #>  6 Mar 31           UTA              W 122-114        L -13           
    #>  7 Mar 30           @ MIL            W 140-99         W 2             
    #>  8 Mar 28           @ WAS            L 111-130        L -10.5         
    #>  9 Mar 26           SA               W 137-93         W -16.5         
    #> 10 Mar 24           IND              W 120-95         W -11           
    #> # ℹ 73 more rows
    #> # ℹ 1 more variable: `Regular Season` <chr>
    #> 
    #> [[3]]
    #> # A tibble: 5 × 5
    #>   `Pre Season` `Pre Season`               `Pre Season` `Pre Season` `Pre Season`
    #>   <chr>        <chr>                      <chr>        <chr>        <chr>       
    #> 1 Date         "VS"                       Score        ATS          O/U         
    #> 2 Oct 14       "@ TOR\n            \n   … L 134-137 (… L -5         O 219       
    #> 3 Oct 7        "@ CHA\n            \n   … W 112-103    W -4.5       U 216       
    #> 4 Oct 5        "TOR"                      L 119-125 (… L -4         O 220.5     
    #> 5 Oct 2        "CHA"                      W 134-93     W -6         O 214.5     
    #> 
    #> [[4]]
    #> # A tibble: 0 × 4
    #> # ℹ 4 variables: Rank <lgl>, Leader <lgl>, L10 <lgl>, Units <lgl>
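
In the output above, every column is named after the section title ("Playoffs", "Regular Season", "Pre Season") and the real header (Date, VS, Score, ATS, O/U) ends up as the first data row. A minimal sketch of one way to fix that is shown below. The helper name promote_header is hypothetical (it is not part of rvest), and the small data frame is a hand-built stand-in for covers_tables[[1]], copied from the first rows printed above so the example runs without network access.

```r
# Hypothetical helper (not part of rvest): treat the first data row as the
# real header, since the scraped column names only repeat the section title.
promote_header <- function(tbl) {
  # Nothing to promote in an empty table, such as the fourth one above.
  if (nrow(tbl) < 1) {
    return(tbl)
  }
  # Take the first value of every column as that column's new name.
  header <- vapply(tbl, function(col) as.character(col[1]), character(1),
                   USE.NAMES = FALSE)
  # Drop the header row from the data and apply the new names.
  out <- tbl[-1, , drop = FALSE]
  names(out) <- header
  out
}

# Stand-in for covers_tables[[1]] (assumption: same shape as the printed
# output, with the section title repeated as every column name).
playoffs_raw <- data.frame(
  c("Date", "May 29", "May 27"),
  c("VS", "MIA", "@ MIA"),
  c("Score", "L 84-103", "W 104-103"),
  c("ATS", "L -7.5", "L -2"),
  c("O/U", "U 204", "U 209"),
  stringsAsFactors = FALSE
)
names(playoffs_raw) <- rep("Playoffs", 5)

playoffs <- promote_header(playoffs_raw)
names(playoffs)
```

On the real data, something like lapply(covers_tables[1:3], promote_header) should apply the same step to the three schedule tables, assuming the page still has the layout shown above. Cells with embedded line breaks, such as the VS column of the pre-season table, would still need separate cleaning.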