Tags: r, web-scraping, tidyverse, httr

How to download data from the Reptile Database using R


I am using R to try to download images from the Reptile Database by filling in its search form to find images of a specific species. To do that, I am following earlier suggestions for submitting an online form from R, such as:

    library(httr)
    library(tidyverse)

    res <- POST(
      url = "http://reptile-database.reptarium.cz/advanced_search",
      encode = "json",
      body = list(
        genus = "Chamaeleo",
        species = "dilepis"
      )
    )

    out <- content(res)[1]

The request runs without errors, but I cannot identify the link with the correct species name in the resulting out object.

This object should contain the following page: https://reptile-database.reptarium.cz/species?genus=Chamaeleo&species=dilepis&search_param=%28%28genus%3D%27Chamaeleo%27%29%28species%3D%27dilepis%27%29%29

That page lists species names with links. I would like to identify the link that leads to the page with the correct species' table. However, I am unable to find the link, or even the name of the species, in the generated out object.


Solution

  • Here I only extract the links to the pictures, skipping the form and requesting the species page directly. To download the files, map or apply a function that calls download.file() over the resulting pics vector.

    library(tidyverse)
    library(rvest)
    
    genus <- "Chamaeleo"
    species <- "dilepis"
    
    pics <- paste0(
      "http://reptile-database.reptarium.cz/species?genus=", genus,
      "&species=", species) %>%
      read_html() %>% 
      html_elements("#gallery img") %>%
      html_attr("src")
    
     [1] "https://www.reptarium.cz/content/photo_rd_05/Chamaeleo-dilepis-03000034021_01_t.jpg"
     [2] "https://www.reptarium.cz/content/photo_rd_05/Chamaeleo-dilepis-03000033342_01_t.jpg"
     [3] "https://www.reptarium.cz/content/photo_rd_02/Chamaeleo-dilepis-03000029987_01_t.jpg"
     [4] "https://www.reptarium.cz/content/photo_rd_02/Chamaeleo-dilepis-03000029988_01_t.jpg"
     [5] "https://www.reptarium.cz/content/photo_rd_05/Chamaeleo-dilepis-03000035130_01_t.jpg"
     [6] "https://www.reptarium.cz/content/photo_rd_05/Chamaeleo-dilepis-03000035131_01_t.jpg"
     [7] "https://www.reptarium.cz/content/photo_rd_05/Chamaeleo-dilepis-03000035132_01_t.jpg"
     [8] "https://www.reptarium.cz/content/photo_rd_05/Chamaeleo-dilepis-03000035133_01_t.jpg"
     [9] "https://www.reptarium.cz/content/photo_rd_06/Chamaeleo-dilepis-03000036237_01_t.jpg"
    [10] "https://www.reptarium.cz/content/photo_rd_06/Chamaeleo-dilepis-03000036238_01_t.jpg"
    [11] "https://www.reptarium.cz/content/photo_rd_06/Chamaeleo-dilepis-03000036239_01_t.jpg"
    [12] "https://www.reptarium.cz/content/photo_rd_11/Chamaeleo-dilepis-03000041048_01_t.jpg"
    [13] "https://www.reptarium.cz/content/photo_rd_11/Chamaeleo-dilepis-03000041049_01_t.jpg"
    [14] "https://www.reptarium.cz/content/photo_rd_11/Chamaeleo-dilepis-03000041050_01_t.jpg"
    [15] "https://www.reptarium.cz/content/photo_rd_11/Chamaeleo-dilepis-03000041051_01_t.jpg"
    [16] "https://www.reptarium.cz/content/photo_rd_12/Chamaeleo-dilepis-03000042287_01_t.jpg"
    [17] "https://www.reptarium.cz/content/photo_rd_12/Chamaeleo-dilepis-03000042288_01_t.jpg"
    [18] "https://calphotos.berkeley.edu/imgs/128x192/9121_3261/2921/0070.jpeg"               
    [19] "https://calphotos.berkeley.edu/imgs/128x192/1338_3161/0662/0074.jpeg"               
    [20] "https://calphotos.berkeley.edu/imgs/128x192/9121_3261/2921/0082.jpeg"               
    [21] "https://calphotos.berkeley.edu/imgs/128x192/1338_3152/3386/0125.jpeg"               
    [22] "https://calphotos.berkeley.edu/imgs/128x192/6666_6666/1009/0136.jpeg"               
    [23] "https://calphotos.berkeley.edu/imgs/128x192/6666_6666/0210/0057.jpeg"
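
The download step mentioned above could look roughly like the sketch below. It is only an illustration: pic_paths() and download_pics() are helper names made up here, the "reptile_pics" folder is an arbitrary choice, and the numeric prefix is added on the assumption that file names from different hosts could otherwise collide. Judging by the `_t` suffix, the reptarium.cz links appear to be thumbnails, so this would save the thumbnails as listed rather than any full-size images.

```r
# Hypothetical helper: build a local file path for each image URL.
# A running index is prefixed so that identical base names do not overwrite each other.
pic_paths <- function(urls, dir = "reptile_pics") {
  file.path(dir, sprintf("%02d_%s", seq_along(urls), basename(urls)))
}

# Hypothetical helper: download every URL in `urls` into `dir`.
# Failed downloads are reported and skipped instead of stopping the loop.
download_pics <- function(urls, dir = "reptile_pics") {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  dests <- pic_paths(urls, dir)
  for (i in seq_along(urls)) {
    tryCatch(
      # mode = "wb" writes the file as binary, which matters for images on Windows
      download.file(urls[i], dests[i], mode = "wb", quiet = TRUE),
      error = function(e) message("Skipping ", urls[i], ": ", conditionMessage(e))
    )
  }
  # Returns the intended paths, including those of downloads that failed
  invisible(dests)
}

# Usage, with `pics` being the vector extracted above:
# download_pics(pics)
```

The loop could equally be written with purrr::walk2() or mapply(); a plain for loop is used here only to keep the error handling easy to read.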