
"Argument is of length zero" error with grepl loop


I am having trouble with a loop involving grepl. I am trying to print the index of the list element that contains the string "Taxable Revenue by Area", but I keep getting the error "argument is of length zero". I have tried it several ways and get an error each time. When I check the length of the grepl result, it is 1, not zero. I'm really stuck!

nevlists is a list of data frames. Each data frame is named by number, 1-48, so the length of nevlists is 48. When I run the grepl statement on its own with the page I want, grepl("Taxable Revenue by Area", nevlists$'48'[3,]), it evaluates to TRUE, which is what I'm looking for. I just can't adapt this to the loop.

library(readr)
library(stringr)
library(magrittr)
library(dplyr)
library(tidyr)
library(pdftools)

nvsr65_05 <- pdf_text("https://gaming.nv.gov/modules/showdocument.aspx?documentid=13542")

getstats <- function(nvsr65_05) {
  listofdfs <- list() # Create a list in which to save the data frames.

  for (i in 1:length(nvsr65_05)) {
    table_data2 <- nvsr65_05[[i]] %>%
      str_split(pattern = "\n")
    table_data2 <- data.frame(matrix(unlist(table_data2)))
    listofdfs[[i]] <- table_data2
  }

  return(listofdfs)
}


nevlists <- getstats(nvsr65_05)
names(nevlists) <- c(1:48)

for (i in 1:length(nevlists)) {
  if (grepl("Taxable Revenue by Area", nevlists$'i'[3,]) == TRUE) {
    print(i)
  }
}

# Try 2

for (i in 1:length(nevlists)) {
  if (as.numeric(grepl("Taxable Revenue by Area", nevlists$'i'[3,])) > 0) {
    print(i)
  }
}

Solution

  • I think this is due to the way you're indexing: nevlists$'i' looks for a list element literally named "i" instead of using the value of i, so it returns NULL. Try:

    for (i in names(nevlists)) {
      # Get the index as a character instead of numeric
      # in case your names are something other than pure numbers as 
      # in this example
      ix = paste(i)
      if (grepl("Taxable Revenue by Area", nevlists[[ix]][3, ]) == TRUE) {
        print(ix)
      }
    }
    

    This prints only "48" for me. Is that what you expect?
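
    To see why the original loop fails, here is a minimal sketch. It assumes a small stand-in list, demo (a hypothetical name, not from the question), in place of nevlists. With $'i', R looks for an element literally named "i", finds none, and returns NULL, so grepl returns a zero-length result and if() stops with "argument is of length zero".

    ```r
    # Hypothetical stand-in for nevlists: two one-column data frames named "1" and "2"
    demo <- list(
      data.frame(x = c("a", "b", "Taxable Revenue by Area"), stringsAsFactors = FALSE),
      data.frame(x = c("a", "b", "c"), stringsAsFactors = FALSE)
    )
    names(demo) <- 1:2

    i <- 1

    # $ does not evaluate i; it looks for an element literally named "i"
    by_dollar <- demo$'i'
    is.null(by_dollar)   # TRUE: there is no such element

    # Subsetting NULL gives NULL, and grepl() on NULL gives logical(0)
    res <- grepl("Taxable Revenue by Area", by_dollar[3, ])
    length(res)          # 0, so if (res) fails with "argument is of length zero"

    # [[ ]] evaluates i, so this is the first data frame
    by_bracket <- demo[[i]]
    grepl("Taxable Revenue by Area", by_bracket[3, ])  # TRUE
    ```

    Checking the length with nevlists$'48' gives 1 because "48" is a real name in the list, whereas "i" is not.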

    If you don't actually care about the names of the list items, you can skip naming them altogether and just do:

    for (i in 1:length(nevlists)) {
      if (grepl("Taxable Revenue by Area", nevlists[[i]][3, ]) == TRUE) {
        print(i)
      }
    }
    

    to output a numeric index value, which may be more useful depending on what you're after.
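
    If you want all matching indices at once rather than printing inside a loop, one possible sketch uses vapply and which. The helper find_pages and the list demo are hypothetical names introduced here for illustration. The sketch also assumes you want a literal (fixed) string match on a single row.

    ```r
    # Hypothetical helper (not from the original post): returns the positions of
    # all list elements whose given row contains `pattern` as a literal string.
    find_pages <- function(dfs, pattern, row = 3) {
      hits <- vapply(
        dfs,
        function(df) {
          # A page with fewer than `row` lines yields NA, which grepl() reports as FALSE
          isTRUE(any(grepl(pattern, df[row, ], fixed = TRUE)))
        },
        logical(1)
      )
      unname(which(hits))
    }

    # Small stand-in for nevlists
    demo <- list(
      data.frame(x = c("a", "b", "c"), stringsAsFactors = FALSE),
      data.frame(x = c("a", "b", "Taxable Revenue by Area"), stringsAsFactors = FALSE),
      data.frame(x = "only one line", stringsAsFactors = FALSE)
    )

    find_pages(demo, "Taxable Revenue by Area")  # 2
    ```

    Because vapply always returns a logical vector with one entry per list element, there is no if() on a zero-length value, and an empty list simply gives integer(0).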