
Read whitespace-delimited Stack Overflow data with row numbers directly into R


Stack Overflow R questions often share sample data as printed data.frame output rather than as dput() output:

      id cate  result
 1     1 yes       1
 2     1 yes      NA
 3     1 no       NA
 4     2 no       NA
 5     2 yes       1
 6     2 yes      NA
 7     2 no       NA
 8     3 no       NA
 9     3 yes      NA
10     3 no       NA
11     3 yes       1
12     3 yes      NA
13     3 no       NA
14     3 yes      NA
15     4 yes       1
16     4 yes      NA
17     4 yes      NA
18     4 no       NA
19     4 no       NA 

One way I have found to read this into R when answering questions is to add a row_num column name to the header manually, read the text with read_table, and then drop that column with select(-row_num).

readr::read_table("   row_num   id cate  result
 1     1 yes       1
 2     1 yes      NA
 3     1 no       NA
 4     2 no       NA
 5     2 yes       1
 6     2 yes      NA
 7     2 no       NA
 8     3 no       NA
 9     3 yes      NA
10     3 no       NA
11     3 yes       1
12     3 yes      NA
13     3 no       NA
14     3 yes      NA
15     4 yes       1
16     4 yes      NA
17     4 yes      NA
18     4 no       NA
19     4 no       NA ") |>
  dplyr::select(-row_num)

# # A tibble: 19 × 3
#       id cate  result
#    <dbl> <chr>  <dbl>
#  1     1 yes        1
#  2     1 yes       NA
#  3     1 no        NA
#  4     2 no        NA
#  5     2 yes        1
#  6     2 yes       NA
#  7     2 no        NA
#  8     3 no        NA
#  9     3 yes       NA
# 10     3 no        NA
# 11     3 yes        1
# 12     3 yes       NA
# 13     3 no        NA
# 14     3 yes       NA
# 15     4 yes        1
# 16     4 yes       NA
# 17     4 yes       NA
# 18     4 no        NA
# 19     4 no        NA

Are there any simpler packages/tricks to read data.frame or tibble output in just one step?


Solution

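  • Base R's read.table handles this in one step. When the header line has one fewer field than the data rows, read.table uses the first column as row names, so the printed row numbers never become a data column: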

    df <- read.table(text = "      id cate  result
     1     1 yes       1
     2     1 yes      NA
     3     1 no       NA
     4     2 no       NA
     5     2 yes       1
     6     2 yes      NA
     7     2 no       NA
     8     3 no       NA
     9     3 yes      NA
    10     3 no       NA
    11     3 yes       1
    12     3 yes      NA
    13     3 no       NA
    14     3 yes      NA
    15     4 yes       1
    16     4 yes      NA
    17     4 yes      NA
    18     4 no       NA
    19     4 no       NA", header = TRUE)
    df
    #>    id cate result
    #> 1   1  yes      1
    #> 2   1  yes     NA
    #> 3   1   no     NA
    #> 4   2   no     NA
    #> 5   2  yes      1
    #> 6   2  yes     NA
    #> 7   2   no     NA
    #> 8   3   no     NA
    #> 9   3  yes     NA
    #> 10  3   no     NA
    #> 11  3  yes      1
    #> 12  3  yes     NA
    #> 13  3   no     NA
    #> 14  3  yes     NA
    #> 15  4  yes      1
    #> 16  4  yes     NA
    #> 17  4  yes     NA
    #> 18  4   no     NA
    #> 19  4   no     NA
    

    Created on 2022-07-29 by the reprex package (v2.0.1)
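
The question also mentions tibble output, which adds a dimension line (`# A tibble: ...`) and a type line (`<dbl> <chr> ...`) around the header. What follows is a minimal sketch rather than an established recipe. It assumes the pasted text has exactly that layout, that no column names or values contain spaces, and that nothing was truncated when the tibble was printed. The names `txt`, `lines`, and `keep` are illustrative. It drops the two extra lines and then relies on the same read.table row-name behaviour as above:

```r
# Pasted tibble output (shortened to three rows for illustration)
txt <- "# A tibble: 3 x 3
     id cate  result
  <dbl> <chr>  <dbl>
1     1 yes        1
2     1 yes       NA
3     1 no        NA"

# Split the pasted text into individual lines
lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]

# Drop the "# A tibble" line and the "<dbl> <chr> ..." type line
keep <- !grepl("^[[:space:]]*(#|<)", lines)

# The header now has one fewer field than the data rows,
# so read.table treats the leading row numbers as row names
df <- read.table(text = lines[keep], header = TRUE)
df
```

Column types are guessed again by read.table, so they may differ from the types shown in the tibble's type line (for example, integer instead of double).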