How do you convert a GPX file directly into a SpatVector of lines while preserving attributes?


I'm trying to teach myself coding skills for spatial data analysis. I've been using Robert Hijmans' document, "Spatial Data in R," and so far, it's been great. To test my skills, I'm messing around with a GPX file I got from my smartwatch during a run, but I'm having issues getting my data into a SpatVector of lines (or a line, more specifically). I haven't been able to find anything online on this topic.

As you can see below with a data sample, the SpatVector "run" has point geometries even though type="lines" was specified. From Hijmans' example of SpatVectors with lines, I gathered that adding "id" and "part" columns, both equal to 1, is what allows the data to be converted to a SpatVector with line geometries. Accordingly, the geometry of the SpatVector "run2" is lines.

My questions are: 1) Is adding the "id" and "part" columns necessary? 2) What do they actually do, i.e. why are these columns needed? 3) Is there a way to go directly from the original data to a SpatVector of lines? In the process I used to get "run2," I lost all the attributes from the original data, and I don't want to lose them.

Thanks!

library(plotKML)
library(terra)
library(sf)
library(lubridate)
library(XML)
library(raster)

#reproducible example
GPX <- structure(list(lon = c(-83.9626053348184, -83.9625438954681, 
-83.962496034801, -83.9624336734414, -83.9623791072518, -83.9622404705733, 
-83.9621777739376, -83.9620685577393, -83.9620059449226, -83.9619112294167, 
-83.9618398994207, -83.9617654681206, -83.9617583435029, -83.9617464412004, 
-83.9617786277086, -83.9617909491062, -83.9618581719697), lat = c(42.4169608857483, 
42.416949570179, 42.4169420264661, 42.4169377516955, 42.4169291183352, 
42.4169017933309, 42.4168863706291, 42.4168564472347, 42.4168310500681, 
42.4167814292014, 42.4167292937636, 42.4166279565543, 42.4166054092348, 
42.4164886493236, 42.4163396190852, 42.4162954464555, 42.4161833804101
), ele = c("267.600006103515625", "268.20001220703125", "268.79998779296875", 
"268.600006103515625", "268.600006103515625", "268.399993896484375", 
"268.600006103515625", "268.79998779296875", "268.79998779296875", 
"269", "269", "269.20001220703125", "269.20001220703125", "269.20001220703125", 
"268.79998779296875", "268.79998779296875", "269"), time = c("2020-10-25T11:30:32.000Z", 
"2020-10-25T11:30:34.000Z", "2020-10-25T11:30:36.000Z", "2020-10-25T11:30:38.000Z", 
"2020-10-25T11:30:40.000Z", "2020-10-25T11:30:45.000Z", "2020-10-25T11:30:47.000Z", 
"2020-10-25T11:30:51.000Z", "2020-10-25T11:30:53.000Z", "2020-10-25T11:30:57.000Z", 
"2020-10-25T11:31:00.000Z", "2020-10-25T11:31:05.000Z", "2020-10-25T11:31:06.000Z", 
"2020-10-25T11:31:12.000Z", "2020-10-25T11:31:19.000Z", "2020-10-25T11:31:21.000Z", 
"2020-10-25T11:31:27.000Z"), extensions = c("18.011677", "18.011977", 
"18.012176", "18.012678", "18.013078", "18.013277", "18.013578", 
"18.013877", "17.013977", "17.014278", "17.014478", "17.014677", 
"17.014676", "17.014677", "16.014477", "16.014477", "16.014576"
)), row.names = c(NA, 17L), class = "data.frame")



crdref <- "+proj=longlat +datum=WGS84"
run <- vect(GPX, type="lines", crs=crdref)
run


data <- cbind(id=1, part=1, GPX$lon, GPX$lat)
run2 <- vect(data, type="lines", crs=crdref)
run2

Solution

  • There is a vect method for a matrix and one for a data.frame. The data.frame method can only make points; it has no type argument, so type="lines" is silently ignored. I will change that into an informative error and clarify this in the manual.

    So to make a line, you could do

    library(terra)
    g <- as.matrix(GPX[,1:2])   
    v <- vect(g, "lines")
    
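    If the original .gpx file is still at hand, it may also be possible to skip the data.frame step. terra reads vector files through GDAL, and GDAL's GPX driver exposes a "tracks" layer (lines) and a "track_points" layer (points carrying per-point fields such as ele and time). A minimal sketch, assuming your GDAL build includes the GPX driver; the tiny file written here is made up for illustration and stands in for the watch export:

    ```r
    library(terra)

    # Hypothetical three-point track, written to a temporary file
    gpx <- c(
      '<?xml version="1.0" encoding="UTF-8"?>',
      '<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">',
      '<trk><name>run</name><trkseg>',
      '<trkpt lat="42.41696" lon="-83.96261"><ele>267.6</ele><time>2020-10-25T11:30:32Z</time></trkpt>',
      '<trkpt lat="42.41695" lon="-83.96254"><ele>268.2</ele><time>2020-10-25T11:30:34Z</time></trkpt>',
      '<trkpt lat="42.41694" lon="-83.96250"><ele>268.8</ele><time>2020-10-25T11:30:36Z</time></trkpt>',
      '</trkseg></trk>',
      '</gpx>'
    )
    f <- tempfile(fileext = ".gpx")
    writeLines(gpx, f)

    trk <- vect(f, layer = "tracks")        # one line geometry per <trk>
    pts <- vect(f, layer = "track_points")  # one point per <trkpt>, with its fields
    geomtype(trk)
    geomtype(pts)
    names(pts)
    ```

    The line in trk only carries track-level fields (such as the name), so the per-point values in pts would still have to be summarized before they can be attached to the line.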

    To add attributes you would first need to determine what they are. You have one line but 17 rows in GPX that need to be reduced to one row. You could just take the first row

    att <- GPX[1, -c(1:2)] 
    

    But you may prefer to take the average instead

    GPX$ele <- as.numeric(GPX$ele)
    GPX$extensions <- as.numeric(GPX$extensions)
    GPX$time <- as.POSIXct(GPX$time)
    att <- as.data.frame(lapply(GPX[, -c(1:2)], mean))
    #       ele       time extensions
    #1 268.7412 2020-10-25    17.3078
    
    values(v) <- att
    

    Or in one step

     v <- vect(g, "lines", atts=att)
     v
     #class       : SpatVector 
     #geometry    : lines 
     #dimensions  : 1, 3  (geometries, attributes)
     #extent      : -83.96261, -83.96175, 42.41618, 42.41696  (xmin, xmax, ymin, ymax)
     #coord. ref. :  
     #names       :   ele       time extensions
     #type        : <num>      <chr>      <num>
     #values      : 268.7 2020-10-25      17.31
    
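    A side note on the time column: depending on the R version, as.POSIXct() without an explicit format may keep only the date part of ISO 8601 strings like these, which would explain the date-only value in the output above. A sketch that spells out the format, assuming the trailing "Z" means the timestamps are in UTC:

    ```r
    # Three of the timestamps from the question
    tm <- c("2020-10-25T11:30:32.000Z",
            "2020-10-25T11:30:34.000Z",
            "2020-10-25T11:30:36.000Z")

    # "T" is matched literally and %OS accepts fractional seconds;
    # the trailing "Z" is left unparsed, so the time zone is set explicitly
    tt <- as.POSIXct(tm, format = "%Y-%m-%dT%H:%M:%OS", tz = "UTC")

    # The mean of a POSIXct vector is again a POSIXct value
    mean(tt)
    ```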

    The id and part columns are not necessary if you make a single line. But you need them when you wish to create multiple lines and/or line parts (in a "multi-line"): id identifies the line each vertex belongs to, and part identifies the part within that line.

    gg <- cbind(id=rep(1:3, each=6)[-1], part=1, g)
    vv <- vect(gg, "lines")
    plot(vv, col=rainbow(5), lwd=8)
    lines(v)
    points(v, cex=2, pch=1)
    

    And with multiple lines, you would use id as the grouping variable in aggregate to compute one row of attributes for each line.
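
    A minimal sketch of that last step, using made-up coordinates and elevations rather than the data from the question (the names d, id and att are only for illustration). It assumes the attribute rows come out in the same order as the line ids, which holds here because aggregate sorts its groups:

    ```r
    library(terra)

    # Made-up data: six points forming two lines of three points each
    d <- data.frame(
      lon = c(-83.9626, -83.9625, -83.9624, -83.9622, -83.9621, -83.9620),
      lat = c(42.4170, 42.4169, 42.4168, 42.4166, 42.4165, 42.4164),
      ele = c(267.6, 268.2, 268.8, 269.0, 269.2, 268.8)
    )
    id <- rep(1:2, each = 3)

    # Geometry matrix with columns: line id, part, x, y
    gg <- cbind(id = id, part = 1, as.matrix(d[, c("lon", "lat")]))

    # One attribute row per line: mean elevation grouped by id
    att <- aggregate(d["ele"], by = list(id = id), FUN = mean)

    vv <- vect(gg, "lines", atts = att, crs = "+proj=longlat +datum=WGS84")
    vv
    ```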