
Example of using dput()


As a new user here, my questions are not being fully answered because they are not reproducible. I read the thread on producing reproducible code, but to no avail. Specifically, I am lost on how to use the dput() function.

Could someone provide a step-by-step example of how to use dput(), for example with the iris data frame? It would be very helpful.


Solution

  • Using the iris dataset, which is conveniently included with R, we can see how dput() works:

    data(iris)
    head(iris)
    
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    5          5.0         3.6          1.4         0.2  setosa
    6          5.4         3.9          1.7         0.4  setosa
    

    Calling dput(iris) would now print the whole dataset. In most situations, providing a whole dataset is unnecessary for a Stack Overflow question; a few rows of the relevant variables suffice as a working data example.

    Two things come in handy: the head() function returns only the first six rows of a data frame or matrix by default, and indexing in R (via brackets) lets you select only specific columns.
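
    As a small sketch of those two tools on their own (the variable names first_three, two_columns and small_sample are made up for illustration), head() also accepts an n argument, and columns can be picked by name instead of by position:

    ```r
    # head() takes an n argument, so fewer (or more) than six rows can be requested.
    first_three <- head(iris, n = 3)

    # Columns can be selected by name rather than by position.
    two_columns <- iris[, c("Sepal.Length", "Petal.Length")]

    # Both can be combined: three rows of two named columns.
    small_sample <- head(iris[, c("Sepal.Length", "Petal.Length")], n = 3)
    dim(small_sample)
    ```

    Selecting by name can make a posted example easier to read than column numbers, since the reader sees which variables are involved.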

    Therefore, we can restrict the output of dput() using a combination of these two:

    dput(head(iris[, c(1, 3)]))
    
    structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4), 
        Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7)), .Names = c("Sepal.Length", 
    "Petal.Length"), row.names = c(NA, 6L), class = "data.frame")
    

    This gives us the code to reproduce the first (up to) six rows of columns 1 and 3 of the iris dataset. Assigning that code to a variable recreates the data frame:

    df <- structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4), 
        Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7)), .Names = c("Sepal.Length", 
    "Petal.Length"), row.names = c(NA, 6L), class = "data.frame")
    
    > df
      Sepal.Length Petal.Length
    1          5.1          1.4
    2          4.9          1.4
    3          4.7          1.3
    4          4.6          1.5
    5          5.0          1.4
    6          5.4          1.7
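
    Before posting, it may be worth checking that the dput() output really rebuilds the same object. The following is a minimal sketch of one way to do that, not the only one; the names sample_df, dput_text and rebuilt_df are hypothetical:

    ```r
    # Take the same small sample as above: first six rows, columns 1 and 3.
    sample_df <- head(iris[, c(1, 3)])

    # capture.output() collects the text that dput() would print to the console.
    dput_text <- capture.output(dput(sample_df))

    # Parse and evaluate that text, as a reader pasting it into R would.
    rebuilt_df <- eval(parse(text = dput_text))

    # all.equal() returns TRUE when the rebuilt object matches the original.
    all.equal(sample_df, rebuilt_df)
    ```

    If the comparison does not return TRUE, the posted code would give readers something different from the data you are actually working with.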
    

    If the first rows do not suffice, we can skip using head() and rely on indexing only:

    dput(iris[1:20, c(1, 3)])
    
    structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4, 4.6, 
    5, 4.4, 4.9, 5.4, 4.8, 4.8, 4.3, 5.8, 5.7, 5.4, 5.1, 5.7, 5.1
    ), Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 
    1.5, 1.5, 1.6, 1.4, 1.1, 1.2, 1.5, 1.3, 1.4, 1.7, 1.5)), .Names = c("Sepal.Length", 
    "Petal.Length"), row.names = c(NA, 20L), class = "data.frame")
    

    This gives us the code for the first twenty rows, which we can again assign to a variable:

    df <- structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4, 4.6, 
    5, 4.4, 4.9, 5.4, 4.8, 4.8, 4.3, 5.8, 5.7, 5.4, 5.1, 5.7, 5.1
    ), Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 
    1.5, 1.5, 1.6, 1.4, 1.1, 1.2, 1.5, 1.3, 1.4, 1.7, 1.5)), .Names = c("Sepal.Length", 
    "Petal.Length"), row.names = c(NA, 20L), class = "data.frame")
    
    > df
       Sepal.Length Petal.Length
    1           5.1          1.4
    2           4.9          1.4
    3           4.7          1.3
    4           4.6          1.5
    5           5.0          1.4
    6           5.4          1.7
    7           4.6          1.4
    8           5.0          1.5
    9           4.4          1.4
    10          4.9          1.5
    11          5.4          1.5
    12          4.8          1.6
    13          4.8          1.4
    14          4.3          1.1
    15          5.8          1.2
    16          5.7          1.5
    17          5.4          1.3
    18          5.1          1.4
    19          5.7          1.7
    20          5.1          1.5
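
    One caveat when a factor column such as Species is part of the sample: dput() writes out every factor level, including levels that no longer occur in the selected rows. A possible way to keep the output short, sketched here with the hypothetical names with_species and small_df, is to call droplevels() first:

    ```r
    # The first six rows only contain "setosa", but the factor keeps all three levels.
    with_species <- head(iris[, c(1, 5)])
    nlevels(with_species$Species)

    # droplevels() removes factor levels that do not occur in the data.
    small_df <- droplevels(with_species)
    nlevels(small_df$Species)

    # The dput() output now lists only the levels actually present.
    dput(small_df)
    ```

    Only do this if the unused levels are irrelevant to the question; if the problem depends on the full set of levels, keep them in the example.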