I am rather new with spatial data and I am not sure how to proceed with my code. I have occurrence data from a lot of species and a .tif map with the occurrence of cocoa plantations in Ghana and Ivory Coast. I want to calculate the overlap of each (buffered) occurrence point with the distribution of cocoa.
Here is my map:
Here is a link for it: https://ufile.io/lnmqkkal
Here is a sample of my df:
sdm_data <- structure(list(scientificName = c("Centropus_leucogaster", "Charadrius_hiaticula",
"Cinnyris_cupreus", "Milvus_migrans", "Indicator_minor", "Vanellus_senegallus",
"Campethera_punctuligera", "Phylloscopus_trochilus", "Spermestes_cucullata",
"Ceratogymna_atrata", "Columba_guinea", "Bubulcus_ibis", "Stiphrornis_erythrothorax",
"Acrocephalus_schoenobaenus", "Prionops_plumatus", "Centropus_monachus",
"Himantornis_haematopus", "Actitis_hypoleucos", "Deleornis_fraseri",
"Corvus_albus", "Accipiter_tachiro", "Scopus_umbretta", "Thescelocichla_leucopleura",
"Illadopsis_rufescens", "Columba_unicincta", "Ploceus_cucullatus",
"Columba_livia", "Lybius_dubius", "Spermestes_cucullata", "Hylia_prasina",
"Laniarius_barbarus", "Pandion_haliaetus", "Actophilornis_africanus",
"Pycnonotus_barbatus", "Eidolon_helvum", "Spilopelia_senegalensis",
"Criniger_calurus", "Tauraco_macrorhynchus", "Phoeniculus_purpureus",
"Lamprotornis_caudatus", "Cyanomitra_olivacea", "Corythaeola_cristata",
"Apaloderma_narina", "Lophoceros_nasutus", "Chrysococcyx_cupreus",
"Uraeginthus_bengalus", "Pycnonotus_barbatus", "Vidua_macroura",
"Ketupa_poensis", "Hybomys_trivirgatus", "Spermestes_cucullata",
"Vidua_macroura", "Circus_aeruginosus", "Spilopelia_senegalensis",
"Horizocerus_albocristatus", "Prinia_subflava", "Bubulcus_ibis",
"Anthus_leucophrys", "Pholidornis_rushiae", "Necrosyrtes_monachus",
"Apus_batesi", "Tricholaema_hirsuta", "Estrilda_melpoda", "Camaroptera_brachyura",
"Streptopelia_semitorquata", "Passer_griseus", "Horizocerus_albocristatus",
"Necrosyrtes_monachus", "Eremomela_pusilla", "Nicator_chloris",
"Necrosyrtes_monachus", "Turtur_tympanistria", "Turtur_abyssinicus",
"Apalis_nigriceps", "Criniger_olivaceus", "Gymnobucco_calvus",
"Corvus_albus", "Buteo_auguralis", "Gypohierax_angolensis", "Pogoniulus_scolopaceus",
"Chloropicus_pyrrhogaster", "Merops_gularis", "Estrilda_melpoda",
"Passer_griseus", "Eurystomus_gularis", "Nigrita_canicapillus",
"Stephanoaetus_coronatus", "Dryoscopus_gambensis", "Ceuthmochares_aereus",
"Ploceus_brachypterus", "Ploceus_brachypterus", "Cypsiurus_parvus",
"Apus_affinis", "Merops_pusillus", "Platysteira_blissetti", "Lagonosticta_larvata",
"Pycnonotus_barbatus", "Chrysococcyx_caprius", "Eurillas_latirostris",
"Sarothrura_pulchra"), class = c("Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Mammalia", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Mammalia", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves", "Aves"
), countryCode = c("GH", "GH", "GH", "GH", "GH", "GH", "GH",
"GH", "GH", "CI", "GH", "CI", "GH", "CI", "GH", "GH", "GH", "GH",
"GH", "GH", "GH", "GH", "CI", "GH", "CI", "GH", "GH", "GH", "GH",
"CI", "GH", "GH", "GH", "CI", "GH", "GH", "GH", "GH", "GH", "GH",
"GH", "CI", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH",
"GH", "GH", "GH", "GH", "CI", "CI", "GH", "GH", "GH", "GH", "GH",
"GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH",
"GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH",
"GH", "GH", "CI", "GH", "GH", "GH", "GH", "GH", "GH", "GH", "GH",
"GH", "GH", "GH", "CI", "GH"), lat = c(5.294725, 5.538762, 5.069007,
4.9631, 9.259442, 9.259442, 9.351597, 9.296375, 5.326462, 5.833334,
10.686944, 5.354483, 5.244878, 5.2, 10.396633, 5.350061, 5.265666,
5.375306, 6.229311, 5.935704, 5.350061, 9.296741, 5.85, 5.294725,
5.416667, 5.282214, 5.4306, 10.396633, 6.239851, 5.330343, 9.400862,
10.67723, 5.179573, 5.388454, 6.133333, 10.520621, 5.354316,
6.241391, 5.61215, 5.589215, 6.241391, 5.85, 6.68671, 7.4145,
6.241391, 9.259285, 5.935704, 9.281975, 5.216111, 5.77, 4.866574,
9.249128, 6.767449, 5.265666, 5.4306, 5.199609, 5.362665, 5.38216,
5.593093, 9.296375, 5.369245, 5.318909, 9.296375, 5.11396, 5.592857,
6.180255, 5.4306, 5.582121, 9.296375, 5.32459, 5.885418, 5.350061,
9.402673, 5.369245, 5.289211, 5.29904, 6.241391, 9.259442, 9.351597,
6.241391, 5.354316, 5.350061, 9.402673, 5.32459, 6.68671, 5.354316,
5.690694, 9.259442, 6.68671, 6.233479, 5.0854, 6.68671, 5.0506,
9.296375, 5.34425, 9.416667, 5.0854, 5.406347, 5.85, 5.183314
), lon = c(-2.594147, -0.287876, -1.42497, -2.4084, -1.855252,
-1.855252, -1.866111, -1.771631, -1.381812, -7.3, -0.806111,
-3.988783, -2.640255, -3.733333, -2.062541, -1.38196, -2.581787,
-1.629985, -0.527119, 0.093384, -1.38196, -1.850207, -7.383333,
-2.594147, -7.3, -1.356295, -1.4145, -2.062541, 0.095351, -4.132249,
-1.90428, -2.90983, -1.309481, -4.01923, -0.85, -0.362171, -1.383588,
-0.553651, -0.190772, -0.181737, -0.553651, -7.383333, -1.34421,
-1.8894, -0.553651, -1.855416, 0.093384, -1.86376, -2.651111,
-0.22, -1.757812, -1.861067, 0.109927, -2.581787, -1.4145, -3.737939,
-3.888946, -1.394263, -1.375694, -1.771631, -1.360245, -1.416425,
-1.771631, -1.28007, -0.159177, -0.345341, -1.4145, -0.183042,
-1.771631, -1.40445, 0.040941, -1.38196, -1.894798, -1.360245,
-2.640378, -1.644067, -0.553651, -1.855252, -1.866111, -0.553651,
-1.383588, -1.38196, -1.894798, -1.40445, -1.34421, -1.383588,
-7.050476, -1.855252, -1.34421, 0.09053, -1.43393, -1.34421,
-2.6949, -1.771631, -1.37552, -1.966667, -1.43393, -1.447472,
-7.383333, -1.88866)), row.names = c(NA, -100L), class = c("data.table",
"data.frame"))
And here is my code:
library(sf)
library(terra)
library(exactextractr)
#Read cocoa map
cocoa.map <- rast("cocoa_plantation_area.tif")
#Convert data to sf
sdm_data_sf <- sdm_data %>%
st_as_sf(coords = c("lon", "lat"), crs="+proj=longlat +datum=WGS84", agr="constant", remove=FALSE)
#convert sf object to SpatVector for a faster buffering using terra:buffer
sdm_data <- vect(sdm_data_sf)
#Add a buffer of 100 meters
bufferpoints <- buffer(sdm_data, width=100)
#Match CRS of my map and my df
crs(bufferpoints) <- crs(cocoa.map)
#crop the map to the extent of my buffered points
curr.map <- terra::crop(cocoa.map, terra::ext(bufferpoints))
#Convert SpatVector back to sf to be able to use exactextractr::exact_extract, because terra:extract is slower and my data set is big (555000 rows)
bufferpoints_sf <- sf::st_as_sf(bufferpoints)
#I think with this I am extracting the coverage of each point with the map? I am not sure
extract <- exactextractr::exact_extract(curr.map, bufferpoints_sf)
Now, the problem is that
exactextractr::exact_extract
What I get is a Large list with approx. the same amount of rows per entry, which does not make sense to me because each specie has a different number of entries. Maybe I am getting a value for each cocoa cell? And what should I do next with this huge list?I have tried multiple options throughout the day but I am not finding a solution. Converting my SpatVector back to sf seemed to be a ok-ish step because otherwise I was getting this warning all the time:
Warning message:
In class(object) <- "environment" :
Setting class(x) to "environment" sets attribute to NULL; result will no longer be an S4 object
Edit: Uploaded the map here: https://ufile.io/lnmqkkal
Here is a simple solution with just terra
library(terra)
# create a SpatVector and buffer
v <- vect(as.data.frame(sdm_data), crs="+proj=longlat +datum=WGS84")
b <- buffer(v, width=100)
# replace NA with zeros in the cocoa data, that helps a lot later
# save to a new file as it takes a while to process
cocoaraw <- rast("cocoa_plantation_area.tif")
cocoa <- subst(cocoaraw, NA, 0, filename="cocoa_area_01.tif")
Now extract and compute the mean value for each point. That avoids dealing with the list. Since the values are all zero or one, the mean gives you the fraction of cocoa land within the buffer. If you have points that are on the beach you might want to set the ocean to NA
(with mask) and use na.rm=TRUE
to be more precise; but even if you had a few points like that, this would probably be pedantic in this case.
e1 <- extract(cocoa, b, mean)
head(e1)
# ID plantation_area
#1 1 0
#2 2 0
#3 3 0
#4 4 0
#5 5 0
#6 6 0
And you can do the last step with exactexractr instead:
e2 <- exactextractr::exact_extract(cocoa, sf::st_as_sf(b), "mean")
(BTW: I am curious what the source of the cocoa data is)