Scraping data from a submitted form from SIPRI

I am trying to get data from a website that requires submitting a form. There are radio buttons and drop-down boxes where you can select a time period (years), countries, and a download method. I am aware that the data can be downloaded manually, but I would like to programmatically download the import data for all countries between 1990 and 2000.

I have tried two different approaches based on answers on SO (see below for code), but I am having trouble getting either of them to actually produce results. Ideally, I would like a data frame similar to the one in the downloaded Excel file. Any help or guidance would be greatly appreciated.

Approach 1

The first approach is based on Python code for the same site: Scrape a php webpage that needs a submitted form

df = httr::POST("",
                encode = "form",
                body = list('import_or_export' = 'export',
                            'country_code' = 'All',
                            'from' = 1990,
                            'to' = 2000,
                            'summarize' = 'country',
                            'filetype' = 'excel',
                            'Action' = 'Download'))

Approach 2

The second approach I tried is similar to the one in this question: How to retrieve response by using POST in R

headers = c('Content-Type' = 'application/json; charset=UTF-8')
data = "{'country_code':'All','low_year':'1990','high_year':'2000','import_or_export':'import','summarize':'country','filetype':'html','Action':'Download'}"
r <- httr::POST(url = "", 
                httr::add_headers(.headers=headers), body = data)


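  • I leave the parsing and cleaning to you, but here's a suggestion for the request:

    library(httr2)
    library(rvest)

    "" %>% 
      request() %>%
      # Send the form fields as a form-encoded body (this makes the request a POST)
      req_body_form(
        'import_or_export' = 'export',
        'country_code' = '',
        'low_year' = 1990,
        'high_year' = 2000,
        'summarize' = 'country',
        'filetype' = 'html',
        'Action' = 'Download'
      ) %>%
      req_perform() %>%
      resp_body_html() %>%
      html_table() %>%
      # Keep the second table on the page, which holds the values
      getElement(2)

    # A tibble: 89 x 14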
       X1        X2    X3    X4    X5    X6    X7    X8    X9    X10   X11   X12   X13   X14  
       <chr>     <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
     1 &nbsp     1990  1991  1992  1993  1994  1995  1996  1997  1998  1999  2000  Total NA   
     2 Angola    &nbsp &nbsp 8     &nbsp &nbsp &nbsp &nbsp &nbsp &nbsp &nbsp &nbsp 8     NA   
     3 Argentina 6     0     &nbsp 13    5     5     &nbsp &nbsp &nbsp &nbsp 2     31    NA   
     4 Aruba     &nbsp &nbsp &nbsp &nbsp &nbsp &nbsp 18    &nbsp &nbsp &nbsp &nbsp 18    NA   
     5 Australia 168   90    &nbsp 30    36    36    16    20    4     &nbsp &nbsp 400   NA   
     6 Austria   30    20    20    10    17    &nbsp 18    1     29    23    24    191   NA   
     7 Belarus   &nbsp &nbsp &nbsp 8     &nbsp 7     129   398   63    452   293   1349  NA   
     8 Belgium   1     1     &nbsp &nbsp 33    158   57    93    46    45    26    458   NA   
     9 Brazil    106   127   98    40    54    38    27    27    18    &nbsp &nbsp 535   NA   
    10 Bulgaria  6     42    16    28    55    1     21    6     39    167   2     381   NA   
    # ... with 79 more rows
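
For the parsing and cleaning step, a minimal sketch in base R might look like the following. It assumes the scraped table has the shape printed above: the real header sits in the first row, empty cells show up as the literal string "&nbsp", and there is a trailing column that is entirely NA. The helper name `clean_sipri_table` and the sample `raw` (a trimmed copy of a few rows and columns from that output) are illustrative assumptions, not part of any package; with the live data you would pass in the result of the request instead.

```r
# Illustrative sample: a trimmed copy of the printed output (three year
# columns plus Total, and the empty trailing column). Not the full table.
raw <- data.frame(
  X1 = c("&nbsp", "Angola", "Argentina", "Austria"),
  X2 = c("1990", "&nbsp", "6", "30"),
  X3 = c("1991", "&nbsp", "0", "20"),
  X4 = c("1992", "8", "&nbsp", "20"),
  X5 = c("Total", "8", "31", "191"),
  X6 = NA_character_,
  stringsAsFactors = FALSE
)

# Hypothetical helper: tidy a table shaped like the one above.
clean_sipri_table <- function(raw) {
  # Drop columns that contain nothing but NA (X14 in the full output).
  keep <- vapply(raw, function(col) !all(is.na(col)), logical(1))
  tbl <- as.data.frame(raw[, keep, drop = FALSE], stringsAsFactors = FALSE)

  # Promote the first row to column names; its first cell is just "&nbsp".
  header <- unlist(tbl[1, ], use.names = FALSE)
  header[1] <- "country"
  tbl <- tbl[-1, , drop = FALSE]
  names(tbl) <- header
  rownames(tbl) <- NULL

  # Turn "&nbsp" placeholders into NA and convert value columns to numeric.
  value_cols <- setdiff(names(tbl), "country")
  tbl[value_cols] <- lapply(tbl[value_cols], function(col) {
    col[col %in% "&nbsp"] <- NA
    as.numeric(col)
  })
  tbl
}

cleaned <- clean_sipri_table(raw)
print(cleaned)
```

The remaining rows of the real table were not inspected here, so any summary or footer rows near the bottom may still need to be filtered out by hand.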