I have two data dataframes
Category = c("New", "Fixed" ,"Regular")
`Jan 2014` = c(-418518172, 24425087,0.3016606 )
`Feb 2014`=c(-168947846,17919651,-0.7212493)
`Mar 2014`=c(-168846,17544,-0.775493)
df1 <- data.frame(Category, `Jan 2014`, `Feb 2014`,`Mar 2014`, check.names = FALSE)
Category = c("New","Fixed" ,"Regular")
`Jan 2014` = c(-8172,-8524,0.5060900)
`Feb 2014`=c(-1686,19929,-0.85059894)
`Mar 2014`=c(-2424246,1232324,-0.7217593)
df2 <- data.frame(Category, `Jan 2014`, `Feb 2014`, `Mar 2014`,check.names = FALSE)
calculate_df <- cbind(df1,(df1[-1]/df2[-1])*100)
After doing some calculation i am trying to reshape it so that all the dates becomeDate
column and then each type of Category
New
Fixed
and Regular
comes under separate column with their respective values.
reshape_df <- calculate_df %>%
pivot_longer(
cols = !Category,
names_to = "Date",
values_to = "Values"
)
calculate_df
Current Output
Category Date Values
<chr> <chr> <dbl>
1 New Jan 2014 -4.19e+8
2 New Feb 2014 -1.69e+8
3 New Mar 2014 -1.69e+5
4 New Jan 2014 5.12e+6
5 New Feb 2014 1.00e+7
6 New Mar 2014 6.96e+0
7 Fixed Jan 2014 2.44e+7
8 Fixed Feb 2014 1.79e+7
9 Fixed Mar 2014 1.75e+4
10 Fixed Jan 2014 -2.87e+5
11 Fixed Feb 2014 8.99e+4
12 Fixed Mar 2014 1.42e+0
13 Regular Jan 2014 3.02e-1
14 Regular Feb 2014 -7.21e-1
15 Regular Mar 2014 -7.75e-1
16 Regular Jan 2014 5.96e+1
17 Regular Feb 2014 8.48e+1
18 Regular Mar 2014 1.07e+2
How can i achieve the expected outcome?
Expected Output
Date Category Values Category Values Category Values
1 Jan 2014 New -4.19e+8 Fixed 2.44e+7 Regular 3.02e-1
2 Feb 2014 New -1.69e+8 Fixed 1.79e+7 Regular -7.21e-1
3 Mar 2014 New -1.69e+5 Fixed 1.75e+4 Regular -7.75e-1
4 Jan 2014 New 5.12e+6 Fixed -2.87e+5 Regular 5.96e+1
5 Feb 2014 New 1.00e+7 Fixed 8.99e+4 Regular 8.48e+1
6 Mar 2014 New 6.96e+0 Fixed 1.42e+0 Regular 1.07e+2
In base R, you may start from here (and, probably, you will find a less verbose solution soon). Notice that, it appears unclear to me why wide is better than long. It is not good practice to have repeated column names. However, an exception might be publishing.
# to preserve the order of levels
fb_split_df = \(x, f, ...) {
f = factor(f, levels = unique(f))
split(x, f, ...)
}
xyzzy = fb_split_df(reshape_df, reshape_df$Category)
xyzzy[[1]] = xyzzy[[1]][c("Date", "Category", "Values")]
xyzzy[-1] = lapply(xyzzy[-1], \(x) transform(x, Date = NULL))
do.call(cbind, xyzzy) |> setNames(c("Date", rep(c("Category", "Values"), 3)))
gives
Date Category Values Category Values Category Values
1 Jan 2014 New -4.185182e+08 Fixed 2.442509e+07 Regular 0.3016606
2 Feb 2014 New -1.689478e+08 Fixed 1.791965e+07 Regular -0.7212493
3 Mar 2014 New -1.688460e+05 Fixed 1.754400e+04 Regular -0.7754930
4 Jan 2014 New 5.121368e+06 Fixed -2.865449e+05 Regular 59.6061175
5 Feb 2014 New 1.002063e+07 Fixed 8.991746e+04 Regular 84.7931106
6 Mar 2014 New 6.964887e+00 Fixed 1.423652e+00 Regular 107.4448227
Starting with calculate_df
:
custom_reshape = \(df) {
tdf = t(df)
cats = tdf[1, ]
n = length(cats)
tdf = tdf[-1, ]
lapply(seq_len(ncol(tdf)), \(i) cbind(rep(cats[[i]], n), tdf[, i])) |>
do.call(cbind, args = _) |>
data.frame(row.names(tdf), ... = _, row.names = NULL)
}
custom_reshape(calculate_df) |>
setNames(c("Date", rep(c("Category", "Values"), 3)))