Simple question... given for example:
data("crude")
which is a corpus with 20 text documents, how do I get something like:
1 4
2 6
3 5
4 3
etc...
where the second column is the number of rows of each document in the corpus "crude"? Or even a vector of row numbers would work.
NROW/nrow don't seem to work.
Thanks for looking!
Hi, you can count the line feeds (LF) with:
library(stringr)
str_count(string = crude[[1]], pattern = "\\n")
# [1] 11
crude[[1]] has 12 rows on my computer (11 line feeds, plus one for the last row), so for the whole corpus you can do this:
sapply(crude, FUN = function(x) str_count(string = x, pattern = "\\n") + 1)
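If you want the two-column layout from the question (document number, row count), here is a minimal base-R sketch of the same "line feeds plus one" idea. The `docs` list is a made-up stand-in for the corpus, and `count_lines` and `line_counts` are names chosen here for illustration. With the real corpus you would probably pass each document's text instead, for example via `as.character()`, but that depends on your tm version, so treat it as an assumption.

```r
# Hypothetical stand-in for a corpus: each document is one string with
# embedded line feeds (this is NOT the real "crude" data).
docs <- list(
  "first line\nsecond line\nthird line",
  "only one line",
  "line a\nline b"
)

# Count the line feeds in a document's text and add 1 for the last line.
count_lines <- function(x) {
  # Join multi-element input so a vector of lines is counted the same way.
  text <- paste(as.character(x), collapse = "\n")
  lengths(regmatches(text, gregexpr("\n", text, fixed = TRUE))) + 1L
}

# One row per document: its index and its number of lines.
line_counts <- data.frame(
  doc   = seq_along(docs),
  lines = vapply(docs, count_lines, integer(1))
)
print(line_counts)
```

Using `vapply` with `integer(1)` rather than `sapply` guarantees a plain integer vector, which is also the "vector of row numbers" mentioned in the question if you take `line_counts$lines` on its own.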