I am looking to transform a long dataformat to a wide one, but the values of the "signatories" column are not exactly the same as the new column names should be.
I have a dataframe with bill numbers and the names of their signatories (senators). If a senator has signed a bill, their is a row in the dataframe with the bill number and the name of the senator in questions. All the senators whose names are not listed next to the bill number have not signed the bill.
I want to construct a matrix with rows as bill numbers and columns as senator names, filled with 1s and 0s only. 1 indicates that a given senator has signed a bill, 0 indicates they have not.
ex <- data.frame(billno = c(715851, 715851, 715851,715852, 715852, 715852, 715852, 715852, 715852), signatories = c("Ben", "Lisa", "Roger", "Louise", "Macy", "John", "Jake", "James", "Ben"))
Senatornames <- c("Ben", "Lisa", "Roger", "Louise", "Macy", "John", "Jake", "James", "Julian", "Ayn")
#current output is this
# billno signatories
#1 715851 Ben
#2 715851 Lisa
#3 715851 Roger
#4 715852 Louise
#5 715852 Macy
#6 715852 John
#7 715852 Jake
#8 715852 James
#9 715852 Ben
#I want something like this. How do I arrive at this result?
# billno Ben Lisa Roger Louise Macy John Jake James Ayn
#1 715851 1 1 1 0 0 0 0 0 0
#2 715852 1 0 0 1 1 1 1 1 0
Thanks so much!!
Next code can bring you close to what you want. I have used the vector you provided in order to create a full_join()
so that all values can appear. Here the code:
library(tidyverse)
#Dataframe
df <- data.frame(signatories=Senatornames,stringsAsFactors = F)
#Code
ex1 <- ex %>% mutate(Value=1) %>%
full_join(df) %>%
fill(billno) %>%
pivot_wider(names_from = signatories,values_from=Value) %>%
replace(is.na(.),0)
Output:
# A tibble: 2 x 11
billno Ben Lisa Roger Louise Macy John Jake James Julian Ayn
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 715851 1 1 1 0 0 0 0 0 0 0
2 715852 1 0 0 1 1 1 1 1 0 0