Tibble equivalent of data frame creation in R


In R, when I use the following code to generate a table with the data.frame() command:

nn <- 5
df <- data.frame(Completers = rep(c(1, 0), each = nn),
                 Gender = c(1, 0))

I get this result:

   Completers Gender
1           1      1
2           1      0
3           1      1
4           1      0
5           1      1
6           0      0
7           0      1
8           0      0
9           0      1
10          0      0

However, when I try to do the same with tibble::tibble():

tb <- tibble::tibble(Completers = rep(c(1, 0), each = nn),
                     Gender = c(1, 0))

I get the following error:

Error:
! Tibble columns must have compatible sizes.
• Size 10: Existing data.
• Size 2: Column `Gender`.
ℹ Only values of size one are recycled.
Run `rlang::last_error()` to see where the error occurred.

Needless to say, running rlang::last_error() does not help (me, at least).

Of course, I could simply run tb <- tibble::as_tibble(df) and get on with my life, but still...

Therefore, my question is:

  • What is the tibble() equivalent of the data.frame() call above that gives the same result?

sessioninfo::session_info() extract:

 setting  value
 version  R version 4.2.1 (2022-06-23)
 os       macOS Monterey 12.6
 system   x86_64, darwin17.0
 rstudio  2022.07.1+554 Spotted Wakerobin (desktop)
-------------------------------------------------------
package              * version    date (UTC) lib source
tibble                 3.1.8      2022-07-22 [1] CRAN (R 4.2.0)

Solution

  • This is covered in the ?tibble documentation (the outputs below were produced with nn <- 10):

    tibble() builds columns sequentially. When defining a column, you can refer to columns created earlier in the call. Only columns of length one are recycled.

    > tibble::tibble(Completers = rep(c(1, 0), each = nn), Gender = 1)
    # A tibble: 20 × 2
       Completers Gender
            <dbl>  <dbl>
     1          1      1
     2          1      1
     3          1      1
     4          1      1
     5          1      1
     6          1      1
     7          1      1
     8          1      1
     9          1      1
    10          1      1
    11          0      1
    12          0      1
    13          0      1
    14          0      1
    15          0      1
    16          0      1
    17          0      1
    18          0      1
    19          0      1
    20          0      1
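
The two rules quoted from the documentation can be seen in a small sketch. The column names (x, y, flag, g) are made up for illustration and are not part of the question; the tibble package is assumed to be installed.

```r
library(tibble)

# Hypothetical column names, chosen only for illustration.
tb <- tibble(
  x    = 1:4,
  y    = x * 2,   # refers to `x`, which was defined earlier in the same call
  flag = "a"      # length one, so it is recycled to all four rows
)

# A length-two column is not recycled against four rows: tibble() raises
# an error, which is caught here and replaced by a marker string.
res <- tryCatch(
  tibble(x = 1:4, g = c(1, 0)),
  error = function(e) "error"
)
```

Here tb has four rows, while res holds the marker string because the second call fails in the same way as the call in the question.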
    

    To get the same output as data.frame(), recycle Gender explicitly with rep() and its length.out argument:

    tibble::tibble(Completers = rep(c(1, 0), each = nn),
                   Gender = rep(c(1, 0), length.out = length(Completers)))
    # A tibble: 20 × 2
       Completers Gender
            <dbl>  <dbl>
     1          1      1
     2          1      0
     3          1      1
     4          1      0
     5          1      1
     6          1      0
     7          1      1
     8          1      0
     9          1      1
    10          1      0
    11          0      1
    12          0      0
    13          0      1
    14          0      0
    15          0      1
    16          0      0
    17          0      1
    18          0      0
    19          0      1
    20          0      0
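
As a hedged sketch of a few equivalent spellings, the following compares the as_tibble() route mentioned in the question with two other base R ways of recycling explicitly, rep_len() and rep() with times. It assumes nn <- 10, matching the 20-row output above; the object names ref, tb1 and tb2 are made up for illustration.

```r
library(tibble)

nn <- 10  # assumed value; gives 2 * nn = 20 rows

# Reference: let data.frame() do the recycling, then convert to a tibble.
ref <- as_tibble(data.frame(Completers = rep(c(1, 0), each = nn),
                            Gender = c(1, 0)))

# Option 1: rep_len() recycles a vector to an explicit target length.
tb1 <- tibble(Completers = rep(c(1, 0), each = nn),
              Gender = rep_len(c(1, 0), length(Completers)))

# Option 2: repeat the pair nn times, since there are 2 * nn rows.
tb2 <- tibble(Completers = rep(c(1, 0), each = nn),
              Gender = rep(c(1, 0), times = nn))

# All three approaches yield the same Gender column.
same_as_ref <- identical(tb1$Gender, ref$Gender)
same_as_tb2 <- identical(tb1$Gender, tb2$Gender)
```

Which spelling to prefer is a matter of taste. Deriving the target length from an earlier column, as in option 1, avoids having to keep a separate row count in sync.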