Search code examples
rdataframedplyrdata-wrangling

Using R, how can I make columns become rows and correlate with values of another column?


Basically I need a new column called "census" containing columns dbh1, dbh3, and dbh4 as sequential rows, correlating with their tree number and species of interest.

The data looks like this (with 10k rows):

Tree Spec dbh1 dbh3 dbh4
1 PICO 6 8.8 9
2 ABLA 21 24.1 25.4
3 PICO 12 14.3 15.2
4 PIEN 24 25.5 25.8

I need it to look like this:

Tree Spec Census
1 PICO 6
1 PICO 8.8
1 PICO 9
2 ABLA 21
2 ABLA 24.1
2 ABLA 25.4
3 PICO 12
3 PICO 14.3
3 PICO 15.2
4 PIEN 24
4 PIEN 25.5
4 PIEN 25.8

I've tried looking around the tidyverse and I'm sure I'm missing what could be a very simple answer.


Solution

  • You can use pivot_longer() from tidyr to pivot your dataframe, and then remove the new column with the information of which original variable the values from Census came from.

    Data

    df <-
      structure(list(Tree = 1:4, Spec = c("PICO", "ABLA", "PICO", "PIEN"
      ), dbh1 = c(6L, 21L, 12L, 24L), dbh3 = c(8.8, 24.1, 14.3, 25.5
      ), dbh4 = c(9, 25.4, 15.2, 25.8)), class = "data.frame", row.names = c(NA,-4L))
    

    Code

    library(dplyr)
    library(tidyr)
    
    df %>% 
      pivot_longer(cols = starts_with("dbh"),values_to = "Census") %>% 
      select(-name)  
    

    Output

    # A tibble: 12 x 3
        Tree Spec  Census
       <int> <chr>  <dbl>
     1     1 PICO     6  
     2     1 PICO     8.8
     3     1 PICO     9  
     4     2 ABLA    21  
     5     2 ABLA    24.1
     6     2 ABLA    25.4
     7     3 PICO    12  
     8     3 PICO    14.3
     9     3 PICO    15.2
    10     4 PIEN    24  
    11     4 PIEN    25.5
    12     4 PIEN    25.8