Search code examples
rgeospatialtiffshapefile

How do I extract data from a .tiff file in R?


I am completely new to geospatial analysis and am at a lost as to how to start. I would like to know what the sex ratio of Vietnam was in 2000, broken down by grid square per https://hub.worldpop.org/geodata/summary?id=11943.

How do I download and read this data in R, and extract the male and female population per grid square of .tiff files?

Furthermore, how would I be able to aggregate the data to find the sex ratio of 20 year holds per district?

Thank you very much.


Solution

  • Using terra, download the ages you want to compare. This will be slow as server appears throttled. Using 2000, female/male

    library(terra)
    vnm_f_2k_20 <- rast('~/Downloads/vnm_f_20_2000.tif')
    vnm_m_2k_20 <- rast('~/Downloads/vnm_m_20_2000.tif')
    # plot(vnm_m_2k_20) you could but not much to look at 
    vnm_20_ratio <- vnm_f_2k_20/vnm_m_2k_20 # female/male
    writeRaster(vnm_20_ratio, '~/Downloads/vnm_20_2k_fm_ratio.tif')
    # source again
    vnm_20_ration <- rast('~/Downloads/vnm_20_2k_fm_ratio.tif')
    plot(vnm_20_ratio)
    # import some (bad) shape files
    viet_shp_vect <- vect('~/viet/Boundary.shp')
    viet_shp_vect
     class       : SpatVector 
     geometry    : lines 
     dimensions  : 4588, 3  (geometries, attributes)
     extent      : 102.1421, 116.9473, 6.95331, 23.39391  (xmin, xmax, ymin, ymax)
     source      : Boundary.shp
     coord. ref. : lon/lat WGS 84 (EPSG:4326) 
     names       :   gid  Code      Type
     type        : <int> <chr>     <chr>
     values      :     1 BA010 Coastline
                      24 BA010 Coastline
                      52 BA010 Coastline
    # we want polygons
    viet_shp_vect_poly <- as.polygons(viet_shp_vect)
    viet_shp_vect_poly
     class       : SpatVector 
     geometry    : polygons 
     dimensions  : 882, 3  (geometries, attributes)
     extent      : 103.418, 116.9473, 6.95331, 21.51663  (xmin, xmax, ymin, ymax)
     coord. ref. : lon/lat WGS 84 (EPSG:4326) 
     names       :   gid  Code      Type
     type        : <int> <chr>     <chr>
     values      :     1 BA010 Coastline
                      24 BA010 Coastline
                      52 BA010 Coastline
    # okay, fewer
    # which biggest? to see on vnm_20_ratio
    viet_poly_expanse = expanse(viet_terra_poly, unit = 'km') # km^2
    viet_terra_poly[which(viet_poly_expanse > 125)]
     class       : SpatVector 
     geometry    : polygons 
     dimensions  : 3, 3  (geometries, attributes)
     extent      : 103.8319, 107.6092, 10.00636, 21.27627  (xmin, xmax, ymin, ymax)
     coord. ref. : lon/lat WGS 84 (EPSG:4326) 
     names       :   gid  Code              Type
     type        : <int> <chr>             <chr>
     values      :   843 BA010         Coastline
                    4020  AC02 District boundary
                     287 BA010         Coastline
    
    # get their extent(s)
    ext(viet_terra_poly[which(viet_poly_expanse > 125)][1])
    SpatExtent : 107.377707546894, 107.609243224099, 21.0418956379434, 21.2762675716601 (xmin, xmax, ymin, ymax)
    ext(viet_terra_poly[which(viet_poly_expanse > 125)][2])
    SpatExtent : 103.831937267764, 104.079663940524, 10.0063610568506, 10.4522918854626 (xmin, xmax, ymin, ymax)
    ext(viet_terra_poly[which(viet_poly_expanse > 125)][3])
    SpatExtent : 106.915154049455, 107.108447974102, 20.7130312182932, 20.8800654844975 (xmin, xmax, ymin, ymax)
    first_unk_viet_visible = crop(vnm_20_ratio,ext(viet_terra_poly[which(viet_poly_expanse > 125)][1]) 
    
    plot(first_unk_viet_visible)
    

    enter image description here

    enter image description here

    You have shapefiles, I have geojson and we both need to figure out which level of detail, at the provincial level, or district level and all the little Spratlies that that entails.

    For your consideration of which level to apply your analysis, you really should look back to the worldpop data and what level it was collected at. This is the concept of 'support' in spatial analysis. If worldpop is provincial, you should use provincial, if district, then district. Hopefully your gadm is better organized than mine where I was scratching around for something that would be visible, having say names of recognizable places. The quick tour.

    enter image description here

    If you just need values by geographic area

    median(extract(vnm_20_ratio, viet_terra_poly[which(viet_poly_expanse > 125)][1])[, 2], na.rm = TRUE)
    [1] 0.8670955