I have a data frame that I'm looking to output to JSON in a particular format, with a small sample below:
dat <- structure(list(unit = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L), .Label = c("A", "B"), class = "factor"), type = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("X", "Y"), class = "factor"),
date = structure(c(1357963687, 1357963869, 1357964048, 1357964230,
1357963687, 1357963942, 1357963942, 1357964123), class = c("POSIXct",
"POSIXt"), tzone = ""), latitude = c(-21.21, -21.22, -21.23,
-21.24, -21.23, -21.23, -21.23, -21.23), longitude = c(116.78,
116.77, 116.76, 116.75, 116.74, 116.75, 116.75, 116.76)), .Names = c("unit",
"type", "date", "latitude", "longitude"), row.names = c(NA, -8L
), class = "data.frame")
The JSON format that I'm going to need looks like this:
[{"unit":"A","type":"X","latitude":[["2013-01-12 12:08:07",-21.21],["2013-01-12 12:11:09",-21.22],["2013-01-12 12:14:08",-21.23],["2013-01-12 12:17:10",-21.24]],
"longitude":[["2013-01-12 12:08:07",116.78],["2013-01-12 12:11:09",116.77],["2013-01-12 12:14:08",116.76],["2013-01-12 12:17:10",116.75]]
},
{"unit":"B","type":"X", "latitude":[["2013-01-12 12:08:07",-21.23],["2013-01-12 12:12:22",-21.23],["2013-01-12 12:12:22",-21.23],["2013-01-12 12:15:23",-21.23]],
"longitude":[["2013-01-12 12:08:07",116.74],["2013-01-12 12:12:22",116.75],["2013-01-12 12:12:22",116.75],["2013-01-12 12:15:23",116.76]]
}]
I've been so far unable to manipulate the RJSONIO::toJSON
function to do anything like that, and I'm not finding the examples in the documentation too helpful.
What do I need to do to get the right output?
Note: There will only ever be a single type
for each unit
.
PS: Is there a tool out there to make these kinds of things easy? Perhaps something drag and drop?
You can get close by first turning the data frame into a list of lists. For example:
> a=list(unit="A",type="X",latitude=c(1,2,3),longitude=c(4,5,6))
> b=list(unit="B",type="Y",latitude=c(11,22,33),longitude=c(43,54,65))
> dlist = list(a,b)
> cat(toJSON(dlist))
[
{
"unit": "A",
"type": "X",
"latitude": [ 1, 2, 3 ],
"longitude": [ 4, 5, 6 ]
},
{
"unit": "B",
"type": "Y",
"latitude": [ 11, 22, 33 ],
"longitude": [ 43, 54, 65 ]
}
]
The problem is really how to manipulate a dataframe into the right format.
However your JSON output has mixed types -character and numeric - in vectors: ["2013-01-12 12:08:07",-21.23]
and I don't see how to get that out of R which insists on vectors being a single type. Would ["2013-01-12 12:08:07","-21.23"]
be acceptable? If so then read on...
The plyr
package has lots of code for splitting and manipulating data frames and lists. For example:
dlply(dat,~unit)
will split the dataframe by the unit
variable. You can apply a function to each of these sections and return a list. This function:
make1 <- function(d){
list(
unit=d$unit[1],
type=d$type[1],
latitude=cbind(as.character(d$date),d$latitude),
longitude=cbind(as.character(d$date),d$longitude))
}
Should convert one section into the right list format. So tell dlply
to do that to every section, and return a list of lists. That list has names, which makes toJSON
output as a named array - we need to remove the names to get a JS list.
> L = dlply(dat,~unit,make1)
> names(L)=NULL
> cat(toJSON(L))
[
{
"unit": "A",
"type": "X",
"latitude": [ [ "2013-01-12 04:08:07", "-21.21" ],
[ "2013-01-12 04:11:09", "-21.22" ],
[ "2013-01-12 04:14:08", "-21.23" ],
[ "2013-01-12 04:17:10", "-21.24" ] ],
"longitude": [ [ "2013-01-12 04:08:07", "116.78" ],
[ "2013-01-12 04:11:09", "116.77" ],
[ "2013-01-12 04:14:08", "116.76" ],
[ "2013-01-12 04:17:10", "116.75" ] ]
},
{
"unit": "B",
"type": "Y",
"latitude": [ [ "2013-01-12 04:08:07", "-21.23" ],
[ "2013-01-12 04:12:22", "-21.23" ],
[ "2013-01-12 04:12:22", "-21.23" ],
[ "2013-01-12 04:15:23", "-21.23" ] ],
"longitude": [ [ "2013-01-12 04:08:07", "116.74" ],
[ "2013-01-12 04:12:22", "116.75" ],
[ "2013-01-12 04:12:22", "116.75" ],
[ "2013-01-12 04:15:23", "116.76" ] ]
}
]
Fun eh?