I am trying to figure out the best way to split a string of values. Each string is a series of xy pixel coordinates that ultimately form a polygon. But I can't seem to find a solution that splits the string into two subsets: one with all the x coordinates and one with all the y coordinates for each polygon.
This is the current format:
polygonID | Points |
---|---|
1 | [x1,y1,x2,y2,x3,y3...] |
2 | [x1,y1,x2,y2,x3,y3...] |
Example of values: [[1057.97, 338.98, 1069.53, 322.73, ...]], which corresponds to [[x1, y1, x2, y2, ...]]
So you can see the first two values create an xy pair. I would therefore need to pull the first value and then every other value after it to subset all the x coordinates, and do the same starting from the second value for all the y coordinates, to create two columns of points.
(side note: the length of coordinate points per polygon varies)
Ultimately, what I want is two lists, which would look like this:
polygonID | X_coords | Y_coords |
---|---|---|
1 | [x1,x2,x3,...] | [y1,y2,y3,...] |
2 | [x1,x2,x3,...] | [y1,y2,y3,...] |
I have looked at options with stringr and dplyr, but I have not found a good solution (I also don't have any code worked out just yet as I am trying to gain any insight first). Any and all help is appreciated. Thanks :)
Ok so I'm by no means an expert and I might be complicating things, but at least I think I have a working answer. If I understand correctly, the column "Points" of your data frame (which I will call df) is a character column.
Then:
library(dplyr)  # mutate(), select(), %>%
library(purrr)  # map()

df %>%
  # strip the brackets, then split each string on commas
  mutate(Points = strsplit(gsub("\\[|\\]", "", Points), ","),
         xcoord = paste0("[", sapply(map(Points, ~.x[c(TRUE, FALSE)]), paste, collapse = ","), "]"),
         ycoord = paste0("[", sapply(map(Points, ~.x[c(FALSE, TRUE)]), paste, collapse = ","), "]")) %>%
  select(-Points)
I start by removing the brackets in your column "Points" with gsub(), then split the strings with strsplit(). If you want to keep the Points column, just give that result a different name.
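To make the bracket-stripping and splitting step concrete, here is a minimal base-R sketch on a single string. The sample string is made up from the values in the question, and the names pts, no_brackets, pieces and values are only for illustration. Note that as.numeric() and trimws() are additions of mine: the answer's code keeps the pieces as character strings.

```r
# Hypothetical sample string in the format described in the question
pts <- "[[1057.97, 338.98, 1069.53, 322.73]]"

# Remove every "[" and "]" character
no_brackets <- gsub("\\[|\\]", "", pts)

# Split on commas; strsplit() returns a list with one element per input string
pieces <- strsplit(no_brackets, ",")[[1]]

# Trim the space left after each comma and convert to numbers (my addition)
values <- as.numeric(trimws(pieces))
print(values)
```

If the strings contain a space after each comma, as in the example values, the character pieces keep that leading space unless you trim it or split on ",\\s*" instead.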
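Then, in order:
1. c(TRUE, FALSE) to select every other element: the logical index is recycled along the vector, so it keeps positions 1, 3, 5, ... for x, and c(FALSE, TRUE) keeps positions 2, 4, 6, ... for y (inspired from this post: Select every other element from a vector).
2. sapply to paste together the list from point 1 with "," as a separator.
3. paste0 to paste brackets before and after the result from point 2.
4. select() to remove the Points column (if not needed anymore).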
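If you would rather end up with actual lists of numbers than bracketed strings, the same every-other-element indexing can be sketched in base R as below. This is a variation of mine, not the code above: the toy data frame and its values are made up, the column names X_coords and Y_coords follow the table in the question, and it assumes every Points string holds an even number of comma-separated numbers.

```r
# Toy data in the shape described in the question (values are made up)
df <- data.frame(
  polygonID = c(1, 2),
  Points = c("[[1057.97, 338.98, 1069.53, 322.73]]",
             "[[10, 20, 30, 40, 50, 60]]"),
  stringsAsFactors = FALSE
)

# Strip brackets, split on commas, and convert each polygon to a numeric vector
coords <- lapply(strsplit(gsub("\\[|\\]", "", df$Points), ","), as.numeric)

# c(TRUE, FALSE) is recycled along each vector, keeping positions 1, 3, 5, ...
df$X_coords <- lapply(coords, function(v) v[c(TRUE, FALSE)])
# c(FALSE, TRUE) keeps positions 2, 4, 6, ...
df$Y_coords <- lapply(coords, function(v) v[c(FALSE, TRUE)])

# Drop the original column if it is no longer needed
df$Points <- NULL
str(df)
```

Because the index is recycled, this works for polygons with different numbers of points, which matches the side note in the question.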