r, dataframe, reshape, dcast

Reshape data by hour vs day, with value of a third column


I'm trying to reshape some hourly climatic data, but I can't get it right. Here is the data: one day variable (365 levels, +/- 1 depending on the year), one hour variable (24 levels), and one numeric temperature variable (about 8760 observations).

head(df)
####         .day .hour temperature
#### 2 2013-01-01     1          19
#### 3 2013-01-01     2          19
#### 4 2013-01-01     3          18
#### 5 2013-01-01     4          18

My expected output is a data.frame like this, but instead of the ones (the lengths) I need the temperature values inside:

        .day 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
1 2013-01-01 1 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
2 2013-01-02 1 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
3 2013-01-03 1 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
4 2013-01-04 1 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

This output was generated with dcast(.day~.hour). I also tried tidyr, with no success. How can I do this? And what if some lines are missing somewhere (a missing day, etc.)? Thanks.


Solution

  • To reshape data from long to wide format, we can use tidyr, which has the relevant function spread. The help file has sufficient examples: http://cran.r-project.org/web/packages/tidyr/tidyr.pdf#page.14

    library(tidyr)
    # One row per .day, one column per .hour, temperature in the cells;
    # day/hour combinations absent from df are filled with NA
    spread(df, .hour, temperature, fill = NA)
    

    A comprehensive tour of other ways to do the same reshaping is given here: https://stackoverflow.com/a/9617424/2724299
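For the missing-lines part of the question, here is a minimal sketch using current tidyr, where spread has been superseded by pivot_wider. The toy df below is made up for illustration (it is not the asker's data), and it assumes each day/hour pair occurs at most once. complete first adds the absent day and hour rows with NA temperatures, and pivot_wider then spreads the hours into columns.

```r
library(tidyr)

# Toy stand-in for the asker's df (values invented for illustration).
# The whole of 2013-01-02 and hour 2 of 2013-01-03 are deliberately missing.
df <- data.frame(
  .day = as.Date(c("2013-01-01", "2013-01-01", "2013-01-01",
                   "2013-01-03", "2013-01-03")),
  .hour = c(1, 2, 3, 1, 3),
  temperature = c(19, 19, 18, 17, 16)
)

# Add a row for every day in the observed range and every observed hour;
# combinations that were absent get temperature = NA.
full <- complete(df, .day = seq(min(.day), max(.day), by = "day"), .hour)

# Long to wide: one row per day, one column per hour, temperature in the cells.
wide <- pivot_wider(full, names_from = .hour, values_from = temperature)
print(wide)
```

With reshape2, a call such as dcast(df, .day ~ .hour, value.var = "temperature") should likewise put the temperatures in the cells, provided each day/hour pair occurs only once. A day that is entirely absent from df would still need to be added beforehand, for example with complete as above.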