r, subset, data-wrangling

Creating a data frame in R from a subset of another data frame's columns


I have a data frame with 854 observations and 47 variables (India_Summary). I want to create another data frame that contains only three of those 47 columns: 'MEMSEXCOV1', 'PostSecAvailable', and 'TertiaryYears'.

I thought I could simply use this (assuming I am just naming the new df 'India_Summary2'):

India_Summary2 <- India_Summary[['MEMSEXCOV1', 'PostSecAvailable', 'TertiaryYears']]

The error I receive is:

Error in `[[.default`(col, i, exact = exact) : subscript out of bounds.

I tried using an equal sign instead:

India_Summary2 = India_Summary[['MEMSEXCOV1', 'PostSecAvailable', 'TertiaryYears']]

and I receive the below error:

Error in `[[.default`(col, i, exact = exact) : subscript out of bounds
In addition: Warning messages:
1: In doTryCatch(return(expr), name, parentenv, handler) :
  display list redraw incomplete
2: In doTryCatch(return(expr), name, parentenv, handler) :
  invalid graphics state
3: In doTryCatch(return(expr), name, parentenv, handler) :
  invalid graphics state

Solution

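  • Your code looks like Python (pandas-style indexing). In R, double brackets [[ ]] extract a single element rather than a set of columns, which is why you get "subscript out of bounds"; switching between <- and = makes no difference, since both are plain assignment here. I'd recommend using the dplyr package. You'd have something like this: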

    library(dplyr)
    
    India_Summary2 <- India_Summary %>% 
       select(MEMSEXCOV1, PostSecAvailable, TertiaryYears)
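
If you'd rather not load a package, base R can do the same thing with single brackets and a character vector of column names. Here is a minimal sketch; the toy data frame below is only a stand-in for your India_Summary (its values and the extra OtherColumn are made up for illustration), and `keep` is just a name I picked:

```r
# Toy stand-in for India_Summary; the values and OtherColumn are invented.
India_Summary <- data.frame(
  MEMSEXCOV1       = c(1, 2, 1),
  PostSecAvailable = c(TRUE, FALSE, TRUE),
  TertiaryYears    = c(3, 4, 2),
  OtherColumn      = c("a", "b", "c")
)

# Single brackets with a character vector keep several columns at once
# and return a data frame.
keep <- c("MEMSEXCOV1", "PostSecAvailable", "TertiaryYears")
India_Summary2 <- India_Summary[keep]

# Double brackets are for pulling out ONE column as a plain vector.
one_column <- India_Summary[["TertiaryYears"]]

str(India_Summary2)
```

With your real data you would skip the data.frame() call and only keep the last few lines. The result should match what select() gives you, so which one you use is mostly a matter of taste.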