I have a problem after changing my dataset from long to wide (using spread
, from the tidyr library on the Result_Type
column). I have the following example df:
Group<-c("A","A","A","B","B","B","C","C","C","D", "D")
Result_Type<-c("Final.Result", "Verification","Test", "Verification","Final.Result","Fast",
"Verification","Fast", "Final.Result", "Test", "Final.Result")
Result<-c(7,1,8,7,"NA",9,10,12,17,50,11)
df<-data.frame(Group, Result_Type, Result)
df
Group Result_Type Result
1 A Final.Result 7
2 A Verification 1
3 A Test 8
4 B Verification 7
5 B Final.Result NA
6 B Fast 9
7 C Verification 10
8 C Fast 12
9 C Final.Result 17
10 D Test 50
11 D Final.Result 11
In the column Result_type
there are many possible result types and in some datasets I have Result_Type 's that will not occur in other datasets. However, one level: Final.Result
does occur in every dataset.
Also: This is example data but the actual data has many different columns, and as these differ across the datasets I use, I used spread
(from the tidyr library) so I don't have to give any specific column names other than my target columns.
library("tidyr")
df_spread<-spread(df, key = Result_Type, value = Result)
Group Fast Final.Result Test Verification
1 A <NA> 7 8 1
2 B 9 NA <NA> 7
3 C 12 17 <NA> 10
4 D <NA> 11 50 <NA>
What I would like is that once I convert the dataset from long to wide, Final.Result
is the first column, how the rest of the columns is arranged doesn't matter, so I would like it to be like this (without calling any names of the other columns that are spread, or using order index numbers):
Group Final.Result Fast Test Verification
1 A 7 <NA> 8 1
2 B NA 9 <NA> 7
3 C 17 12 <NA> 10
4 D 11 <NA> 50 <NA>
I saw some answers that indicated you can reverse the order of the spreaded columns, or turn off the ordering of spread, but that doesn't make sure that Final.Result
is always the first column of the spread
levels.
I hope I am making myself clear, it's a little complicated to explain. If someone needs extra info I will be happy to explain more!
spread
creates columns in the order of the key
column's factor levels. Within the tidyverse, forcats::fct_relevel
is a convenience function for rearranging factor levels. The default is that the level(s) you specify will be moved to the front.
library(dplyr)
library(tidyr)
...
levels(df$Result_Type)
#> [1] "Fast" "Final.Result" "Test" "Verification"
Calling fct_relevel
will put "Final.Result"
as the first level, keeping the rest of the levels in their previous order.
reordered <- df %>%
mutate(Result_Type = forcats::fct_relevel(Result_Type, "Final.Result"))
levels(reordered$Result_Type)
#> [1] "Final.Result" "Fast" "Test" "Verification"
Adding that into your pipeline puts Final.Result
as the first column after spreading.
df %>%
mutate(Result_Type = forcats::fct_relevel(Result_Type, "Final.Result")) %>%
spread(key = Result_Type, value = Result)
#> Group Final.Result Fast Test Verification
#> 1 A 7 <NA> 8 1
#> 2 B NA 9 <NA> 7
#> 3 C 17 12 <NA> 10
#> 4 D 11 <NA> 50 <NA>
Created on 2018-12-14 by the reprex package (v0.2.1)