I am trying to get my dataframe in a long format. Currently, it's a dataframe with 1 obs. and >250 variables which looks something like this:
user_0_id | user_0_name | user_0_sex | user_1_id | user_1_name | user_1_sex | user_2_id | user_2_name | user_2_sex |
---|---|---|---|---|---|---|---|---|
16 | miller | female | 52 | smith | male | 33 | frank | female |
Meanwhile, I just wish to create a dataframe which looks like this:
id | name | sex |
---|---|---|
16 | miller | female |
52 | smith | male |
33 | frank | female |
I've tried renaming the column names in the df to match by applying the following code
names(df) <- sub('^users_', '', names(df))
names(df) <- sub('^[0-9.]+', '', names(df))
names(df) <- sub('^_', '', names(df))
Afterwards, I've been trying to switch it to the long format using melt, which only ended up either multiplying the obs. or completely mumbling up the dataframe.
df <- melt(setDT(df), id.vars = c("id","name"), variable.name = "data")
df <- melt(df, id.vars=c("id", "name", "sex"))
This is the first time I've asked a question on stackoverflow, so any advice as to how to make future questions more comprehensible are very welcome! Thanks to anyone that took their time to read this.
library(tidyr)
pivot_longer(
df,
cols = everything(),
names_to = c(".value"),
names_pattern = "user_\\d+_(.*)"
)
# A tibble: 3 × 3
id name sex
<int> <chr> <chr>
1 16 miller female
2 52 smith male
3 33 frank female
Where df
is:
df <- read.table(text =
'user_0_id user_0_name user_0_sex user_1_id user_1_name user_1_sex user_2_id user_2_name user_2_sex
16 miller female 52 smith male 33 frank female
', header = TRUE)