I have a data frame like this:
df <- data.frame(Country = rep(c("US","CA"),each=3),
Variable = c("Inflation","Unemployment","Interest rate"),
Month = rnorm(6), Quarter = rnorm(6)+2, Year=rnorm(6)+3)
and I would like to transform it into something like this:
I have tried the tables package, but did not succeed. Does anyone have any clue about how to achieve this?
EDIT: The output may be either in LaTeX or Text form.
You can't merge data.frame
cells in R like you would in excel, but you can combine the Country
and period columns. Here's a solution with dplyr
+ tidyr
:
library(dplyr)
library(tidyr)
df %>%
gather(var, value, Month:Year) %>%
unite("var", Country, var) %>%
spread(var, value)
Result:
Variable CA_Month CA_Quarter CA_Year US_Month US_Quarter US_Year
1 Inflation 0.2760235 1.758310 4.233976 -0.4321298 3.6232025 5.149919
2 Interest rate -0.5208693 1.227022 3.412022 1.2283928 3.6858872 3.495870
3 Unemployment -1.0489755 1.531800 3.634362 1.6898725 0.9299318 1.665646
To actually create a table (for reporting) that has the Country
merged, here is how to do it with a one-liner using the tables
package:
library(tables)
tabular(Variable ~ Heading()*Country*Heading()*identity*(Month + Quarter + Year), data=df)
Result:
CA US
Variable Month Quarter Year Month Quarter Year
Inflation 0.5269 2.152 3.854 -0.9456 3.764 1.432
Interest rate 1.3974 1.820 3.340 0.4520 1.734 3.962
Unemployment -0.2303 3.377 3.419 -0.6652 2.486 2.739
tabular
uses its unique expression to generate the table formats:
~
separates the expression for rows from columns. I am displaying the rows of Variable
.
*
means that you are nesting one column into another. In this case, I am nesting the Month:Year
columns in Country
.
identity
specifies that you are displaying the actual values in each cell.
Heading
replaces the heading of the next item with a string. In this case, I am replacing "Country"
and "identity"
with blank.
To output as latex, you can wrap the entire expression with the latex
function:
latex(tabular(Variable ~ Heading()*Country*Heading()*identity*(Month + Quarter + Year), data=df))
Result:
\begin{tabular}{lcccccc}
\hline
& \multicolumn{6}{c}{Country} \\
& \multicolumn{3}{c}{CA} & \multicolumn{3}{c}{US} \\
Variable & Month & Quarter & Year & Month & Quarter & \multicolumn{1}{c}{Year} \\
\hline
Inflation & $\phantom{-}0.5269$ & $2.152$ & $3.854$ & $-0.9456$ & $3.764$ & $1.432$ \\
Interest rate & $\phantom{-}1.3974$ & $1.820$ & $3.340$ & $\phantom{-}0.4520$ & $1.734$ & $3.962$ \\
Unemployment & $-0.2303$ & $3.377$ & $3.419$ & $-0.6652$ & $2.486$ & $2.739$ \\
\hline
\end{tabular}