Following is a sample of the data I have
datahave
# A tibble: 6 x 6
YEAR SCHOOL_NAME CONTENT_AREA BELOW_BASIC_PCT BASIC_PCT ADVANCED_PCT
<dbl> <chr> <chr> <chr> <chr> <chr>
1 2015 5TH AND 6TH GRADE CTR. Eng. Language Arts 38.1 28.3 10.1
2 2015 5TH AND 6TH GRADE CTR. Mathematics 39 30.3 14.6
3 2015 5TH AND 6TH GRADE CTR. Science 25.4 41.7 12.3
4 2015 6TH GRADE CENTER Eng. Language Arts 7.6 27.8 21.8
5 2015 6TH GRADE CENTER Mathematics 19.100000000000001 37.700000000000003 17.5
6 2015 7th and 8th Grade Center Eng. Language Arts 52.1 27.4 1.7
Following is a reproducible example similar to this
school<-c("A","A",'A','B','B','B')
content_area<-c('english','math','science','english','math','science')
below_basic<-c(20,30,40,10,15,20)
advanced<-c(2,5,3,1,2.5,1.5)
df<-data.frame(school,content_area,below_basic,advanced)
df
and ran the following code on the above
library(reshape2)
dcast(melt(df), school ~ content_area + variable)
This gives me the desired output because it is using Using school, content_area as id variables
However when I run the same code on the original dataset
dcast(melt(datahave), SCHOOL_NAME ~ CONTENT_AREA + variable)
it is actually using Using SCHOOL_NAME, CONTENT_AREA, BELOW_BASIC_PCT, BASIC_PCT, ADVANCED_PCT as id variables
How do I specify which columns can be used as the ID variable? so I get an output similar to the reproducible example.
We can specify the id.var
in melt
, otherwise, it can automatically pick the variables based on the type.
library(reshape2)
dcast(melt(datahave, id.var = c("YEAR", "SCHOOL_NAME", "CONTENT_AREA")),
SCHOOL_NAME ~ CONTENT_AREA + variable)
# SCHOOL_NAME Eng. Language Arts_BELOW_BASIC_PCT Eng. Language Arts_BASIC_PCT
#1 5TH AND 6TH GRADE CTR. 38.1 28.3
#2 6TH GRADE CENTER 7.6 27.8
#3 7th and 8th Grade Center 52.1 27.4
# Eng. Language Arts_ADVANCED_PCT Mathematics_BELOW_BASIC_PCT Mathematics_BASIC_PCT Mathematics_ADVANCED_PCT
#1 10.1 39.0 30.3 14.6
#2 21.8 19.1 37.7 17.5
#3 1.7 NA NA NA
# Science_BELOW_BASIC_PCT Science_BASIC_PCT Science_ADVANCED_PCT
#1 25.4 41.7 12.3
#2 NA NA NA
#3 NA NA NA
The melt/dcast
wrapper is recast
which can be used as well
recast(datahave, id.var = c("YEAR", "SCHOOL_NAME", "CONTENT_AREA"),
SCHOOL_NAME ~ CONTENT_AREA + variable)
datahave <- structure(list(YEAR = c(2015L, 2015L, 2015L, 2015L, 2015L, 2015L
), SCHOOL_NAME = c("5TH AND 6TH GRADE CTR.", "5TH AND 6TH GRADE CTR.",
"5TH AND 6TH GRADE CTR.", "6TH GRADE CENTER", "6TH GRADE CENTER",
"7th and 8th Grade Center"), CONTENT_AREA = c("Eng. Language Arts",
"Mathematics", "Science", "Eng. Language Arts", "Mathematics",
"Eng. Language Arts"), BELOW_BASIC_PCT = c(38.1, 39, 25.4, 7.6,
19.1, 52.1), BASIC_PCT = c(28.3, 30.3, 41.7, 27.8, 37.7, 27.4
), ADVANCED_PCT = c(10.1, 14.6, 12.3, 21.8, 17.5, 1.7)),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))