I have some data structured a bit like this:
x01 <- c("94633X94644Y95423X96130", "124240X124494Y124571X124714", "135654X135660Y136226X136786")
I later turn this into a list of IRanges objects through steps that look like this:
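library(IRanges)

# Split each string on "Y" (one piece per range), then on "X" (start/end),
# giving one 2-column character matrix per input string.
x02 <- sapply(x01,
              function(x) do.call(rbind,
                                  strsplit(strsplit(x,
                                                    split = "Y",
                                                    fixed = TRUE)[[1]],
                                           split = "X",
                                           fixed = TRUE)),
              simplify = FALSE,
              USE.NAMES = FALSE)
# Build one IRanges object per matrix.
x03 <- sapply(x02,
              function(x) IRanges(start = as.integer(x[, 1L]),
                                  end = as.integer(x[, 2L])),
              simplify = FALSE,
              USE.NAMES = FALSE)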
> x03
[[1]]
IRanges object with 2 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]     94633     94644        12
  [2]     95423     96130       708

[[2]]
IRanges object with 2 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]    124240    124494       255
  [2]    124571    124714       144

[[3]]
IRanges object with 2 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]    135654    135660         7
  [2]    136226    136786       561
Now I would like to store x03 as a column in a data.frame alongside some associated information, with something simple like:
> x04 <- data.frame("col1" = 1:3,
                    "col2" = x01,
                    "col3" = x03)
This unsurprisingly tells me that I have a differing number of rows. However, I feel like I've seen JSON imports into R mimic the kind of structure I want, where a ragged list inhabits a column of a data.frame. Is this possible?
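It's a very good question. I have seen this done with other data.frame-like objects. I think the above does not work because data.frame() converts its arguments with as.data.frame(): as long as there is an as.data.frame method that can be applied to the matrix or the IRanges, each list element gets expanded into its own columns, which messes up the dimensions instead of embedding the list as a single column (I might very well be wrong).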
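A small base R sketch of that behaviour, using a plain list of integer vectors as a stand-in for the list of IRanges so it runs without Bioconductor (the names `ranges_list` and `err` are made up for illustration). It only shows that `data.frame()` spreads a list over several columns; I have not traced the exact path taken for IRanges objects.

```r
# Stand-in for x03: three list elements, each of length 2 (hypothetical data).
ranges_list <- list(c(94633L, 94644L), c(124240L, 124494L), c(135654L, 135660L))

# data.frame() converts the list via as.data.frame(), so each element becomes
# its own 2-row column rather than one entry of a 3-element list column.
# The 2-row result then clashes with the 3-row col1, and an error is raised.
err <- tryCatch(
  data.frame(col1 = 1:3, col3 = ranges_list),
  error = function(e) conditionMessage(e)
)
print(err)
```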
One option is to use a tibble:
x04 = tibble::tibble(x01=x01,x02=x02,x03=x03)
# A tibble: 3 x 3
  x01                         x02               x03
  <chr>                       <list>            <list>
1 94633X94644Y95423X96130     <chr[,2] [2 x 2]> <IRanges>
2 124240X124494Y124571X124714 <chr[,2] [2 x 2]> <IRanges>
3 135654X135660Y136226X136786 <chr[,2] [2 x 2]> <IRanges>
x04$x03
[[1]]
IRanges object with 2 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]     94633     94644        12
  [2]     95423     96130       708

[[2]]
IRanges object with 2 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]    124240    124494       255
  [2]    124571    124714       144

[[3]]
IRanges object with 2 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]    135654    135660         7
  [2]    136226    136786       561
Another option is a DataFrame from S4Vectors:
library(S4Vectors)
library(IRanges)  # provides IRangesList()
DataFrame(x01 = x01, x02 = List(x02), x03 = IRangesList(x03))
                          x01                             x02
                  <character>                          <List>
1     94633X94644Y95423X96130     94633:94644,95423:96130,...
2 124240X124494Y124571X124714 124240:124494,124571:124714,...
3 135654X135660Y136226X136786 135654:135660,136226:136786,...
                          x03
                <IRangesList>
1     94633-94644,95423-96130
2 124240-124494,124571-124714
3 135654-135660,136226-136786
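For completeness, a base R data.frame can also hold a list column if the list is wrapped in `I()` or assigned after the data.frame is built. The sketch below uses integer matrices as a stand-in for the IRanges objects so that it runs without Bioconductor. I would expect the same pattern to work for a list of IRanges, but that is an assumption I have not checked, and printing such a data.frame may be less tidy than the tibble or DataFrame output. The names `ranges_list`, `df_asis`, and `df_assigned` are made up for illustration.

```r
x01 <- c("94633X94644Y95423X96130",
         "124240X124494Y124571X124714",
         "135654X135660Y136226X136786")

# Stand-in for x03: one 2-column integer matrix (start, end) per string.
ranges_list <- lapply(strsplit(x01, "[XY]"),
                      function(p) matrix(as.integer(p), ncol = 2, byrow = TRUE))

# Option A: I() marks the list "as is", so data.frame() keeps it as one column.
df_asis <- data.frame(col1 = 1:3, col2 = x01, col3 = I(ranges_list))

# Option B: build the data.frame first, then assign the list as a column.
df_assigned <- data.frame(col1 = 1:3, col2 = x01)
df_assigned$col3 <- ranges_list

dim(df_asis)            # 3 rows, 3 columns
df_assigned$col3[[2]]   # the start/end matrix stored in the second row
```

Either way each row carries its own element, which can be pulled out with `[[` as in the last line.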