Creating a dataframe by using row numbers of other data frames in R

I would like to know that whether there is a way in R for using datasets with similar names. For example:

I have eight datasets named cars_1995, cars_new_1995, cars_1996, cars_new_1996, cars_1997, cars_new_1997, cars_1998 and cars_new_1998 and they have different dimensions but only thing that I want to use is their row numbers.

So I want to create a new data frame and rownames will be the years (1995,1996,1997,1998) and colnames will be the groups (cars, cars_new). the dimension of the data frame will be 4:2. The values of the dataframe will be the rownumbers such as:

       Cars  Cars_new
  1995  25  76 
  1996  38  35 
  1997  87  49 
  1998  34  50

First question is, in the R, is there a way to use the names of dataframes like I need to take row numbers of all data frames from 1995 to 1999 (one column for cars_YEAR, the other column for cars_YEAR_new). By the way only year indicator is the names of the data frames.

I'm OK with how to build a data frame or how to get the rownumbers (nrow) However, I haven't figured out how to pull all datasets by just using embed years of the dataframes' names consecutively.So it would be great how to get all cars_YEAR datasets row numbers by using the year extension in the datasets names.

Solution

Solution using `stringr` + `dplyr` + `tidyr`:

library(stringr)
library(dplyr)
library(tidyr)

ls() %>%
  str_extract("^cars_\\d{4}.*") %>%
  na.omit() %>%
  mget(envir = globalenv()) %>%
  lapply(nrow) %>%
  data.frame() %>%
  gather(key, value) %>%
  mutate(key = str_replace(key, "(\\d{4})_(.+)", "\\2_\\1")) %>%
  separate(key, c("key", "year"), "_(?=\\d)") %>%
  spread(key, value)

Result:

  year cars cars_new
1 1995  114      102
2 1996  113       61
3 1997  168      186
4 1998  196       68

Notes:

ls() is handy for this example as it can extract all object names in the global environment.
Using str_extract, I extracted only object names that starts with cars_[4 digits] then fed the result into mget
mget takes a vector of character object names and returns the values from a specified environment in a named list
lapply is used to extract row numbers in each element (dataset) of the list. This is then converted to a data.frame with each column corresponding to a row number.
gather reshapes the data.frame from wide to long format.
The regular expression in mutate and separate are respectively to change names from cars_1995_news to cars_new_1995 and separate into a column of cars_new and year
The final spread reshapes the data.frame to the desired format.

Note about data:

I used rnorm and sample to create the toy datasets and did not set a seed for each of them. This means you will not get the same resulting data.frame as what I have here. This is ok as what the row numbers actually are is not important to solve OP's problem.

Data:

# Create sample datasets
cars_1995 = data.frame(A = rnorm(sample(50:200, 1)))
cars_1995_new = data.frame(A = rnorm(sample(50:200, 1)))
cars_1996 = data.frame(A = rnorm(sample(50:200, 1)))
cars_1996_new = data.frame(A = rnorm(sample(50:200, 1)))
cars_1997 = data.frame(A = rnorm(sample(50:200, 1)))
cars_1997_new = data.frame(A = rnorm(sample(50:200, 1)))
cars_1998 = data.frame(A = rnorm(sample(50:200, 1)))
cars_1998_new = data.frame(A = rnorm(sample(50:200, 1)))

Edits (dataset names from `cars_YEAR_new` to `cars_new_YEAR`):

ls() %>%
  str_extract("^cars_(new_)*\\d{4}") %>%
  na.omit() %>%
  mget(envir = globalenv()) %>%
  lapply(nrow) %>%
  data.frame() %>%
  gather(key, value) %>%
  separate(key, c("key", "year"), "_(?=\\d)") %>%
  spread(key, value)

With the new dataset names, the mutate + str_replace step can be removed, and the str_extract step also has to be changed, since the pattern is now different.

New Data:

# Create sample datasets
cars_1995 = data.frame(A = rnorm(sample(50:200, 1)))
cars_new_1995 = data.frame(A = rnorm(sample(50:200, 1)))
cars_1996 = data.frame(A = rnorm(sample(50:200, 1)))
cars_new_1996 = data.frame(A = rnorm(sample(50:200, 1)))
cars_1997 = data.frame(A = rnorm(sample(50:200, 1)))
cars_new_1997 = data.frame(A = rnorm(sample(50:200, 1)))
cars_1998 = data.frame(A = rnorm(sample(50:200, 1)))
cars_new_1998 = data.frame(A = rnorm(sample(50:200, 1)))

Creating a dataframe by using row numbers of other data frames in R

Solution using stringr + dplyr + tidyr:

Edits (dataset names from cars_YEAR_new to cars_new_YEAR):

Solution using `stringr` + `dplyr` + `tidyr`:

Edits (dataset names from `cars_YEAR_new` to `cars_new_YEAR`):