r, select, group-by, dplyr

Select several columns with group_by in dplyr


I want to group by several columns without typing each one. Please help me.

At the moment I write it like this:

dplyr::group_by(iris, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)

I would like to write it like this, but it throws an error:

dplyr::group_by(iris, Sepal.Length:Petal.Width)

select() can select a range of columns using a colon (:):

dplyr::select(iris, Sepal.Length:Petal.Width)

But group_by() cannot take a range of columns using a colon (:):

dplyr::group_by(iris, Sepal.Length:Petal.Width)

select() can use a colon (:) to select columns, so why can't group_by()?


Solution

  • You can accomplish something similar using the standard-evaluation *_ version, group_by_(), though it may take a bit more thought to get the right values. Here, you want the first four columns, so this should work:

    iris %>% group_by_(.dots = names(.)[1:4])
    

    This prints:

    Source: local data frame [150 x 5]
    Groups: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width [149]
    
       Sepal.Length Sepal.Width Petal.Length Petal.Width Species
              <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
    1           5.1         3.5          1.4         0.2  setosa
    2           4.9         3.0          1.4         0.2  setosa
    

    It would probably work even better to save the column names first, which gives you more control, e.g.:

    colsToSave <- names(iris)[1:4]
    
    iris %>% group_by_(.dots = colsToSave)
    

    This gives the same result, but lets you set your own ranges. You could even use select() to generate the columns you want and then just save the names, though that is likely overkill.

    colsToSave <- iris %>% select(Sepal.Length:Petal.Width) %>% names
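
On current dplyr releases the underscore verbs such as group_by_() are deprecated. A minimal sketch of the same grouping, assuming dplyr 1.0.0 or later (where across() is available), might look like the following. The names grouped and groupedFromNames are only illustrative.

```r
library(dplyr)

# Assumes dplyr >= 1.0.0, where across() accepts tidy-select syntax.
# The colon range works here just as it does inside select().
grouped <- iris %>% group_by(across(Sepal.Length:Petal.Width))

# The same grouping from a saved character vector of column names.
colsToSave <- names(iris)[1:4]
groupedFromNames <- iris %>% group_by(across(all_of(colsToSave)))

# Both objects should be grouped by the four measurement columns.
group_vars(grouped)
group_vars(groupedFromNames)
```

On dplyr versions from 0.7 up to 1.0, group_by_at(vars(Sepal.Length:Petal.Width)) should play a similar role, although that verb has since been superseded by across().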