I want to use R's reshape function to convert weather data, recorded at stations in three years (1996, 2006, and 2016), to wide format.
I combined the three data sets manually into long format. Some variables, such as longitude and latitude, are not supposed to vary from one year to another, but for some stations they do.
As expected, reshape then warns: some constant variables (Nom,Lat,Long) are really varying.
In that case, I want these variables in the resulting wide table to hold the values observed in 2016.
Note: not every weather station is present in all three years.
Here is an example:
a <- c(rep(2, 4), rep(4, 3))
b <- c(rep(2, 3), 5, rep(4, 3), 4, 5, 3)
c <- c(1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
name <- c('a', 'b', 'c', 'd', 'e', 'f', 'g', 'a', 'b',
          'c', 'd', 'r', 't', 'y', 'u', 'q', 'z')
year <- c(rep(2002, 7), rep(2005, 10))
df <- cbind(id = c, pos = c(a,b), year, name)
df <- as.data.frame(df)
df would be as follows:
> df
   id pos year name
1   1   2 2002    a
2   2   2 2002    b
3   3   2 2002    c
4   4   2 2002    d
5   5   4 2002    e
6   6   4 2002    f
7   7   4 2002    g
8   1   2 2005    a
9   2   2 2005    b
10  3   2 2005    c
11  4   5 2005    d
12  5   4 2005    r
13  6   4 2005    t
14  7   4 2005    y
15  8   4 2005    u
16  9   5 2005    q
17 10   3 2005    z
Now, converting to wide format with reshape:
dfw <- reshape(df, direction = "wide", timevar = "year",
               idvar = "id", v.names = "pos")
Warning message:
In reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying, :
  some constant variables (name) are really varying
> dfw
   id name pos.2002 pos.2005
1   1    a        2        2
2   2    b        2        2
3   3    c        2        2
4   4    d        2        5
5   5    e        4        4
6   6    f        4        4
7   7    g        4        4
15  8    u     <NA>        4
16  9    q     <NA>        5
17 10    z     <NA>        3
I got what I need: pos varying over time.
My problem is this: in 2002, the observations with id 5, 6, and 7 had the names e, f, and g respectively, but in 2005 they had the names r, t, and y. The wide table dfw shows the names given in 2002.
For observations like these, I want the reshape result to show the names defined in 2005.
Is there an argument of reshape that I should change, or is there another package that can do this?
Note that the data were initially two separate tables, one for each year, and they were combined manually, so perhaps a modification can be made before combining?
When combining the two data frames for the separate years, place the table for 2005 first. For the non-varying columns, reshape keeps the first value it encounters for each id, so the names defined for 2005 are the ones that end up in the wide table.
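If the tables have already been combined, sorting the long data so that the latest year comes first should have the same effect as stacking the 2005 table on top. The following is a minimal sketch on the example data, not a definitive solution: the data frame is rebuilt with data.frame() and numeric columns for simplicity, df_latest_first is a name made up for illustration, and the warning is suppressed on the assumption that the varying names are expected.

```r
# Example data rebuilt from the question (numeric columns, for simplicity)
df <- data.frame(
  id   = c(1:7, 1:10),
  pos  = c(2, 2, 2, 2, 4, 4, 4,  2, 2, 2, 5, 4, 4, 4, 4, 5, 3),
  year = c(rep(2002, 7), rep(2005, 10)),
  name = c("a", "b", "c", "d", "e", "f", "g",
           "a", "b", "c", "d", "r", "t", "y", "u", "q", "z"),
  stringsAsFactors = FALSE
)

# Put the most recent year first, so that the first row seen for each id
# carries the name to keep (df_latest_first is a hypothetical helper name)
df_latest_first <- df[order(df$year, decreasing = TRUE), ]

# reshape still warns that `name` varies; that is expected here,
# so the warning is silenced deliberately
dfw <- suppressWarnings(
  reshape(df_latest_first, direction = "wide", timevar = "year",
          idvar = "id", v.names = "pos")
)

# The pos columns follow the row order (2005 first); put them back
# in chronological order
dfw <- dfw[, c("id", "name", "pos.2002", "pos.2005")]
dfw
```

With the real data, the same sort on the year column puts 2016 first. A station that is missing in 2016 then keeps the values from the latest year in which it does appear, because that is the first row reshape sees for its id.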