To put it simply, I have a list of DFMs (LD1) created with the quanteda package. Each DFM contains different texts of different lengths.
Now I want to calculate and compare lexical diversity for each text within each DFM and across DFMs.
lex.div <-lapply(LD1, function(x) {textstat_lexdiv(x,measure = "all")})
This leaves me with a list of S3 objects, each of which contains the different lexical diversity measures as components that can be accessed with $.
lex.div[[1]]$TTR
[1] 0.2940000 0.2285000 0.2110000 0.1912500 0.1802000 0.1671667 0.1531429 0.1483750 0.1392222
[10] 0.1269000
lex.div[[2]]$TTR
[1] 0.3840000 0.2895000 0.2273333 0.2047500 0.1922000 0.1808333 0.1677143 0.1616250 0.1530000
[10] 0.1439000 0.1352727 0.1279167 0.1197692 0.1125000 0.1069333
Here comes the problem. I need all the TTR values in one matrix: lex.div[[1]]$TTR should be the first row of the matrix, lex.div[[2]]$TTR the second, and so on. Note that the vectors differ in length: length(lex.div[[1]]$TTR) ≠ length(lex.div[[2]]$TTR).
Here is what I've done so far:
# Turn each TTR vector into a one-row matrix and pad it with NA up to 30 columns
m1 <- matrix(lex.div[[1]]$TTR, nrow = 1)
m.sup <- if (ncol(m1) < 30) matrix(NA, nrow = 1, ncol = 30 - ncol(m1))
m1 <- cbind(m1, m.sup)
m2 <- matrix(lex.div[[2]]$TTR, nrow = 1)
m.sup <- if (ncol(m2) < 30) matrix(NA, nrow = 1, ncol = 30 - ncol(m2))
m2 <- cbind(m2, m.sup)
m3 <- matrix(lex.div[[3]]$TTR, nrow = 1)
m.sup <- if (ncol(m3) < 30) matrix(NA, nrow = 1, ncol = 30 - ncol(m3))
m3 <- cbind(m3, m.sup)
...
m.total <-rbind (m1,m2,m3...)
But I cannot keep doing it this way for every element. Can you help me write a for loop or something similar to get it done more easily and quickly?
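You can try the code below. It extracts the TTR vector from each element of lex.div, pads every vector with NA up to the length of the longest one, and stacks the padded vectors as the rows of a matrix: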
TTRs <- lapply(lex.div, `[[`, "TTR")
m <- t(sapply(TTRs, `length<-`, max(lengths(TTRs))))
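
To see what the two lines do, here is a minimal sketch that runs without quanteda. The lex.div list below is a made-up stand-in (two elements with shortened TTR vectors taken from the question), and n.max and m.alt are names introduced only for this illustration. The last step shows a do.call(rbind, ...) variant, which is not part of the code above but should give the same matrix and does not rely on sapply simplifying its result.

```r
# Hypothetical stand-in for lex.div: each element only carries a TTR vector
lex.div <- list(
  list(TTR = c(0.2940, 0.2285, 0.2110)),
  list(TTR = c(0.3840, 0.2895, 0.2273, 0.2048, 0.1922))
)

# Pull the TTR vector out of every list element
TTRs <- lapply(lex.div, `[[`, "TTR")

# Length of the longest vector; shorter ones get padded with NA up to this
n.max <- max(lengths(TTRs))

# `length<-`(x, n.max) extends x to n.max elements, filling with NA.
# sapply returns one column per element, so transpose to get one row each.
m <- t(sapply(TTRs, `length<-`, n.max))

# Alternative (an assumption, not from the answer above): bind the padded
# vectors row by row, which avoids depending on sapply's simplification
m.alt <- do.call(rbind, lapply(TTRs, `length<-`, n.max))

print(m)
```

With this toy input, m has two rows and five columns, and the first row ends in two NA values because its TTR vector has only three entries.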