How to delete the first row from every data frame in a list of data frames?


New to R! If you answer, I'd appreciate any explanation of my mistakes!

I have a list of data frames (tibbles, actually) and I'm trying to remove the first row in all of them. Here's one of the data frames:

> head(dfs_list[[1]][[1]])
# A tibble: 6 × 5
  Day        `Day length` `Solar noon` `Astronomical t…  `Astronomical t… 
  <chr>      <chr>        <chr>        <chr>             <chr>
1 Day        Day length   Solar noon   Start             End
2 Jan 1      09:31:23     12:22:29 pm  6:02 am           6:42 pm
3 Jan 2      09:32:06     12:22:57 pm  6:02 am           6:43 pm
4 Jan 3      09:32:52     12:23:25 pm  6:02 am           6:44 pm
5 Jan 4      09:33:42     12:23:52 pm  6:02 am           6:44 pm
6 Jan 5      09:34:36     12:24:19 pm  6:03 am           6:45 pm

Seems like the task should be straightforward enough, but I'm having a hard time of it. I've tried two approaches, resulting in the following errors:

dfs_edited <- lapply(dfs_list, dfs_list[-1,])
Error in dfs_list[-1, ] : incorrect number of dimensions

for(i in dfs_list) {
  tmp <- get(i)
  tmp <- tmp[-1,]
  assign(i, tmp)
}
Error in get(i) : invalid first argument

Solution

  • It looks like you have a list of lists of data frames (double-nested). Perhaps this reproduces that structure:

    set.seed(42)
    dfs_list <- replicate(2, replicate(3, mtcars[sample(32,3),], simplify=FALSE), simplify=FALSE)
    
    str(dfs_list, max.level=2)
    # List of 2
    #  $ :List of 3
    #   ..$ :'data.frame':  3 obs. of  11 variables:
    #   ..$ :'data.frame':  3 obs. of  11 variables:
    #   ..$ :'data.frame':  3 obs. of  11 variables:
    #  $ :List of 3
    #   ..$ :'data.frame':  3 obs. of  11 variables:
    #   ..$ :'data.frame':  3 obs. of  11 variables:
    #   ..$ :'data.frame':  3 obs. of  11 variables:
    
    dfs_list[[1]][1:2]
    # [[1]]
    #                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
    # Chrysler Imperial 14.7   8  440 230 3.23 5.345 17.42  0  0    3    4
    # Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
    # Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    # [[2]]
    #                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    # Pontiac Firebird 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
    # Merc 280         19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
    # Hornet 4 Drive   21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
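
    To confirm that your own list is nested this way, a quick check along these lines may help. This is only a sketch: `top_is_df` and `inner_is_df` are illustrative names, and `dfs_list` is the stand-in list built above, not your real data.

    ```r
    # Rebuild the stand-in list so this block runs on its own
    set.seed(42)
    dfs_list <- replicate(2, replicate(3, mtcars[sample(32, 3), ], simplify = FALSE), simplify = FALSE)

    # Is each top-level element itself a data frame?
    top_is_df <- vapply(dfs_list, is.data.frame, logical(1))

    # For each top-level element, are all of ITS elements data frames?
    inner_is_df <- vapply(
      dfs_list,
      function(z) all(vapply(z, is.data.frame, logical(1))),
      logical(1)
    )

    top_is_df    # FALSE FALSE: the top level holds lists, not data frames
    inner_is_df  # TRUE TRUE: the data frames sit one level down
    ```

    If `top_is_df` were all `TRUE` instead, the list would be flat and a single `lapply` would be enough.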
    

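    From here, a nested lapply will work. The outer call walks the top-level list, and the inner call drops the first row of each data frame: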

    dfs_list2 <- lapply(dfs_list, function(z) lapply(z, function(y) y[-1,]))
    
    str(dfs_list2, max.level=2)
    # List of 2
    #  $ :List of 3
    #   ..$ :'data.frame':  2 obs. of  11 variables:
    #   ..$ :'data.frame':  2 obs. of  11 variables:
    #   ..$ :'data.frame':  2 obs. of  11 variables:
    #  $ :List of 3
    #   ..$ :'data.frame':  2 obs. of  11 variables:
    #   ..$ :'data.frame':  2 obs. of  11 variables:
    #   ..$ :'data.frame':  2 obs. of  11 variables:
    
    dfs_list2[[1]][1:2]
    # [[1]]
    #                    mpg cyl disp  hp drat   wt  qsec vs am gear carb
    # Hornet Sportabout 18.7   8  360 175 3.15 3.44 17.02  0  0    3    2
    # Mazda RX4         21.0   6  160 110 3.90 2.62 16.46  0  1    4    4
    # [[2]]
    #                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    # Merc 280       19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
    # Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
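
    As for the two attempts in the question, here is a rough sketch of why they fail. `lapply()` needs a function as its second argument, but `dfs_list[-1,]` is evaluated as an expression first, and a list has no rows or columns, hence "incorrect number of dimensions". In the `for` loop, `i` is each list element itself rather than a name, so `get(i)` has no character string to look up, and `assign(i, tmp)` has the same problem. The example uses a hypothetical flat list, `flat_list`, which is not from the question, to show the single-level versions of both approaches.

    ```r
    # Hypothetical flat (single-level) list of data frames, for illustration only
    flat_list <- list(a = head(mtcars, 3), b = tail(mtcars, 3))

    # lapply: pass a function; it is called once per element
    flat_edited <- lapply(flat_list, function(df) df[-1, ])

    # for loop: iterate over positions and write back into the list,
    # so get() and assign() are not needed at all
    flat_loop <- flat_list
    for (i in seq_along(flat_loop)) {
      flat_loop[[i]] <- flat_loop[[i]][-1, ]
    }

    sapply(flat_edited, nrow)  # 2 rows left in each data frame
    ```

    For a double-nested list like yours, the same function simply goes inside a second `lapply`, as shown above. One caveat: on a plain one-column data.frame, `df[-1, ]` returns a vector unless you write `df[-1, , drop = FALSE]`; tibbles keep their shape either way.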