Below is a sample dataset and a few lines of code that are troubling me. I cannot figure out how to turn these derived variables (Year and Session) into numeric values, so that I can then get proper summaries and use the "subset" function.
##Generate sample dataset
df=data.frame(StudyAreaVisitNote=c("2006 Session 1","2006 Session 2", "2008 Session 4", "2012 Session 3"))
##Create new column denoting year and session on their own
as.factor(df$StudyAreaVisitNote)
df$Year <- substr(x = df$StudyAreaVisitNote, start = 1, stop = 4)
df$Session <- substr(x = df$StudyAreaVisitNote, start = 13, stop = 14)
##Summary of Data
summary(df) ## Year and Session are Class and Mode "Character", summary provides little info
##Turn Year and Session into Numeric
as.numeric(df$Year)
as.numeric(df$Session)
##Try Summary of Data Again
summary(df) ## Again, Year and Session are Class and Mode "Character", summary provides little info
The lines
as.factor(df$StudyAreaVisitNote)
as.numeric(df$Year)
as.numeric(df$Session)
do not permanently change the values in df. They return transformed vectors that are printed to the console and then, because you do not save them anywhere, are discarded as soon as the line finishes running. In general, objects in R are not updated by reference; you must always re-assign the returned result to wherever you would like to store it. So try
df$Year <- as.numeric(df$Year)
df$Session <- as.numeric(df$Session)
instead.
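
Putting it together, here is a minimal sketch of the whole workflow with the re-assignment in place. It reuses the sample data from the question; stringsAsFactors = FALSE and the name "recent" are my own additions for illustration, not something from the original code.

```r
# Rebuild the sample data from the question.
# stringsAsFactors = FALSE is an assumption added here so the column is
# plain character regardless of R version.
df <- data.frame(
  StudyAreaVisitNote = c("2006 Session 1", "2006 Session 2",
                         "2008 Session 4", "2012 Session 3"),
  stringsAsFactors = FALSE
)

# Extract the pieces as character strings, as in the question
df$Year    <- substr(df$StudyAreaVisitNote, start = 1,  stop = 4)
df$Session <- substr(df$StudyAreaVisitNote, start = 13, stop = 14)

# Assign the converted vectors back into the data frame
df$Year    <- as.numeric(df$Year)
df$Session <- as.numeric(df$Session)

str(df)      # inspect the column types after conversion
summary(df)  # Year and Session are now summarised as numbers

# Numeric comparisons now work in subset(); "recent" is a hypothetical name
recent <- subset(df, Year >= 2008)
```

The same rule applies to the as.factor() line: if you want StudyAreaVisitNote stored as a factor, the result has to be assigned back to df$StudyAreaVisitNote as well.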