r, gps, partitioning

How to split GPS coordinate data in R


I have a large field with latitude and longitude GPS coordinates.

My data looks like this:

> head(df2015, 10)
     X    Yield   Latitude Longitude
1   97 40.85889  0.8848444  120.8712
2   98 43.54383  2.1551468  120.8833
3   99 42.33718  3.4424795  120.8776
4  100 39.21862  4.7188642  120.8685
5  101 38.24887  6.0019946  120.8820
6  102 36.95594  7.2819180  120.8943
7  103 34.00766  8.5942431  120.8902
8  104 34.58568  9.8706278  120.8970
9  105 34.47788 11.1475653  120.8912
10 106 34.20532 12.4183101  120.8910

It is a rectangular plot (field). The actual data is here:

df2015 <- read.table("https://raw.githubusercontent.com/yamunadhungana/data/master/home.2015.csv", header = TRUE, sep = ",")

plot(df2015$Latitude, df2015$Longitude)

[Image: plot of the field divided into four sub-fields labeled A, B, C, D]

I would like to know how to split this 600 m by 400 m plot into 4 sub-fields of equal size and record each row's sub-field in my data frame df2015. For example, I would like to group the rows into sub-plots A, B, C, D as shown above and store those levels as a new column in the original data frame.


Solution

  • Here is an approach with findInterval from base R:

    df2015 <- read.table("https://raw.githubusercontent.com/yamunadhungana/data/master/home.2015.csv", header = TRUE, sep = ",")
    pos.matrix <- matrix(LETTERS[c(2,3,1,4)],byrow = TRUE, nrow = 2)
    pos.matrix
    #     [,1] [,2]
    #[1,] "B"  "C" 
    #[2,] "A"  "D" 
    
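    # Column index from Latitude (1 = left half), row index from Longitude,
    # flipped with 3 - ... so that row 1 of pos.matrix is the upper half
    df2015$grid <- apply(cbind(findInterval(df2015[, "Latitude"], seq(0, 400, by = 200)),
                               3 - findInterval(df2015[, "Longitude"], seq(0, 600, by = 300))),
                         1, function(x) pos.matrix[x[2], x[1]])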
    df2015[1:10,]
    #     X    Yield   Latitude Longitude grid
    #1   97 40.85889  0.8848444  120.8712    A
    #2   98 43.54383  2.1551468  120.8833    A
    #3   99 42.33718  3.4424795  120.8776    A
    #4  100 39.21862  4.7188642  120.8685    A
    #5  101 38.24887  6.0019946  120.8820    A
    #6  102 36.95594  7.2819180  120.8943    A
    #7  103 34.00766  8.5942431  120.8902    A
    #8  104 34.58568  9.8706278  120.8970    A
    #9  105 34.47788 11.1475653  120.8912    A
    #10 106 34.20532 12.4183101  120.8910    A
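
    To see how the lookup works on a few points, here is a minimal sketch with made-up coordinates (the `pts` data frame and the `row.idx` / `col.idx` names are hypothetical, not part of the field data). It also shows a vectorized variant that indexes `pos.matrix` with a two-column matrix instead of calling `apply` row by row. `rightmost.closed = TRUE` is an added assumption so that a point lying exactly on the far edge (400 or 600) still falls in the last interval rather than producing an index of 3.

    ```r
    pos.matrix <- matrix(LETTERS[c(2, 3, 1, 4)], byrow = TRUE, nrow = 2)

    # Hypothetical points: one per quadrant, plus one on the far corner
    pts <- data.frame(Latitude  = c(50, 50, 350, 350, 400),
                      Longitude = c(100, 500, 500, 100, 600))

    # Column index: 1 = left half (Latitude < 200), 2 = right half
    col.idx <- findInterval(pts$Latitude, seq(0, 400, by = 200),
                            rightmost.closed = TRUE)
    # Row index: flipped so that row 1 of pos.matrix is the upper half
    row.idx <- 3 - findInterval(pts$Longitude, seq(0, 600, by = 300),
                                rightmost.closed = TRUE)

    # A two-column (row, col) index matrix picks one cell per point
    pts$grid <- pos.matrix[cbind(row.idx, col.idx)]
    pts$grid
    ```

    The same two index vectors could be computed from df2015 directly. Whether the boundary handling matters depends on whether any coordinate sits exactly at 400 or 600.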
    

    The grid position is now a new column in df2015. You could use split to break the data.frame into a list of data frames, one per grid position.
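
    As a minimal sketch of that `split` step, using a small made-up data frame in place of df2015 (the `toy` values and the `by_grid` name are illustrative assumptions):

    ```r
    # Toy stand-in for df2015 with the grid column already assigned
    toy <- data.frame(
      Yield     = c(40.9, 43.5, 42.3, 39.2, 38.2, 37.0),
      Latitude  = c(10, 250, 30, 390, 120, 310),
      Longitude = c(120, 100, 450, 580, 310, 20),
      grid      = c("A", "D", "B", "C", "B", "D")
    )

    # One data frame per sub-field, named by grid letter
    by_grid <- split(toy, toy$grid)
    names(by_grid)

    # Per-sub-field summaries, e.g. row counts and mean yield
    sapply(by_grid, nrow)
    sapply(by_grid, function(d) mean(d$Yield))
    ```

    With the real data, `split(df2015, df2015$grid)` would follow the same pattern.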

    Here is a visualization that confirms the proper assignment:

    set.seed(3)
    mysamp <- sample(seq_len(nrow(df2015)),250)
    plot(NA, xlim = c(0,400), ylim = c(0,600), xlab = "Latitude", ylab = "Longitude")
    text(df2015[mysamp,c("Latitude","Longitude")],
         labels = df2015[mysamp,"grid"], cex = 0.4)
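
    A numeric check can complement the plot. The sketch below runs on a small made-up data frame (`toy` is hypothetical, not the field data) and verifies that each sub-field stays on one side of the split lines at Latitude 200 and Longitude 300:

    ```r
    # Toy stand-in for df2015 with the grid column already assigned
    toy <- data.frame(
      Latitude  = c(10, 250, 30, 390, 120, 310),
      Longitude = c(120, 100, 450, 580, 310, 20),
      grid      = c("A", "D", "B", "C", "B", "D")
    )

    # Min and max of each coordinate within each sub-field (2 x 4 matrices)
    lat.range <- sapply(split(toy$Latitude, toy$grid), range)
    lon.range <- sapply(split(toy$Longitude, toy$grid), range)

    # A and B form the left half; A and D form the lower half
    left  <- toy$grid %in% c("A", "B")
    lower <- toy$grid %in% c("A", "D")
    stopifnot(all(toy$Latitude[left] < 200), all(toy$Latitude[!left] >= 200))
    stopifnot(all(toy$Longitude[lower] < 300), all(toy$Longitude[!lower] >= 300))
    ```

    If any assertion fails, `stopifnot` raises an error naming the failed condition.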
    
