I get a field from a SQL database in which various information is stored as a serialized object. I would like to extract the values from this string in R and save them as separate columns.
Here is a simplified example:
df <- data.frame(ID = c(1,2,3),
information = c("{s:4:\"name\";s:3:\"max\";s:3:\"age\";s:1:\"8\";}",
"{s:4:\"name\";s:5:\"peter\";s:3:\"age\";s:2:\"10\";}",
"{s:4:\"name\";s:4:\"susy\";s:3:\"age\";s:3:\"100\";}")
)
df <- df %>%
mutate(
name = substring(information, gregexpr('"', information)[[1]][3] +1, gregexpr('"', information)[[1]][4]-1)
)
df
I tried to select the value between the third and fourth quotes. Unfortunately, this selection only works for the first value. For the other selections, the positions are obviously taken from the first value.
[1] "max" "pet" "sus"
Thank you for your ideas
Split on delimiter and read as delimited text, keep columns we need:
read.delim(text = gsub(";", ":", df$information, fixed = TRUE),
sep = ":", header = FALSE)[, c(3, 6, 9, 12)]
# V3 V6 V9 V12
# 1 name max age 8
# 2 name peter age 10
# 3 name susy age 100
Note: there is a pattern seq(3, 12, 3)
to automate the column selection.