Suppose I have the following data:
A <- c(4,4,4,4,4)
B <- c(1,2,3,4,4)
C <- c(1,2,4,4,4)
D <- c(3,2,4,1,4)
E <- c(4,4,4,4,5)
data <- data.frame(A,B,C,D,E)
data<- t(data)
colnames(data) = c("num1","freq1","freq2","freq3","totfreq")
> data
num1 freq1 freq2 freq3 totfreq
A 4 4 4 4 4
B 1 2 3 4 4
C 1 2 4 4 4
D 3 2 4 1 4
E 4 4 4 4 5
I am trying to plot a grouped bar chart. The x-axis should be my variables A:E, and the y-axis should be the values of freq1, freq2, and freq3 for each letter. I also need to keep the ability to plot variables A:E by the values in totfreq, again with A:E on the x-axis.
I know I need to convert to long form, but I'm having trouble because of how my data is set up. Somehow I need A, B, C, D, E to stack into one column, another column that stacks freq1, freq2, freq3, totfreq, and then a last column with the values. Any advice on how to accomplish this?
I'd prefer to plot in plotly, but ggplot would work too.
First off, you have a matrix but probably want a data frame. Making it a tibble will drop the row names, which is where your letters are stored, so
as.data.frame(data) %>% rownames_to_column("id")
will get you a data frame with a column id of letters.
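To make that first point concrete, here is a minimal sketch on a cut-down, two-variable version of the data (the names df, m, and with_id are just for illustration). It checks that t() returns a matrix and that the letters survive as an id column, assuming the tibble package is installed.

```r
library(tibble)

# Cut-down version of the question's data (only A and B), for illustration
df <- data.frame(A = c(4, 4, 4, 4, 4), B = c(1, 2, 3, 4, 4))

# t() coerces a data frame to a matrix; the letters become row names
m <- t(df)
is.matrix(m)   # TRUE
rownames(m)    # "A" "B"

# Convert back to a data frame and move the row names into a real column
with_id <- rownames_to_column(as.data.frame(m), "id")
with_id$id     # "A" "B"
```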
You want to put this data into a long format by gathering all the freq columns. I'm then adding a column that gives the type of observation. This isn't necessary, but since you say you want to filter easily for one of two types (either the groups freq1, etc., or totfreq), this is a handy setup that I often use.
library(tidyverse)
A <- c(4,4,4,4,4)
B <- c(1,2,3,4,4)
C <- c(1,2,4,4,4)
D <- c(3,2,4,1,4)
E <- c(4,4,4,4,5)
data <- data.frame(A,B,C,D,E)
data <- t(data)
colnames(data) <- c("num1", "freq1", "freq2", "freq3", "totfreq")
data_long <- as.data.frame(data) %>%
  rownames_to_column("id") %>%
  gather(key = var, value = value, freq1:totfreq) %>%
  mutate(type = ifelse(var == "totfreq", "total", "by_group"))
head(data_long)
#> id num1 var value type
#> 1 A 4 freq1 4 by_group
#> 2 B 1 freq1 2 by_group
#> 3 C 1 freq1 2 by_group
#> 4 D 3 freq1 2 by_group
#> 5 E 4 freq1 4 by_group
#> 6 A 4 freq2 4 by_group
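As an aside, newer versions of tidyr (1.0.0 and later) mark gather() as superseded in favour of pivot_longer(). The sketch below is one way the same reshaping might look with it, assuming a recent tidyr is installed. The name data_long2 is only there to keep it separate from the object above. The rows come out ordered by id rather than by var, so head() will not print the same first six rows, and the result is a tibble rather than a plain data frame.

```r
library(tidyr)
library(dplyr)
library(tibble)

# Same data as in the question, built in one step
data <- t(data.frame(
  A = c(4, 4, 4, 4, 4),
  B = c(1, 2, 3, 4, 4),
  C = c(1, 2, 4, 4, 4),
  D = c(3, 2, 4, 1, 4),
  E = c(4, 4, 4, 4, 5)
))
colnames(data) <- c("num1", "freq1", "freq2", "freq3", "totfreq")

data_long2 <- as.data.frame(data) %>%
  rownames_to_column("id") %>%
  # Stack freq1..totfreq into a name column (var) and a value column (value)
  pivot_longer(cols = freq1:totfreq, names_to = "var", values_to = "value") %>%
  # Flag whether each row is a per-group frequency or the total
  mutate(type = if_else(var == "totfreq", "total", "by_group"))
```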
With the type column, it's really easy to filter by type for plotting. You can either pipe a filtered data frame into something like ggplot, or use type as a column for faceting or mapping onto an aesthetic.
# for grouped bar chart
data_long %>% filter(type == "by_group")
#> id num1 var value type
#> 1 A 4 freq1 4 by_group
#> 2 B 1 freq1 2 by_group
#> 3 C 1 freq1 2 by_group
#> 4 D 3 freq1 2 by_group
#> 5 E 4 freq1 4 by_group
#> 6 A 4 freq2 4 by_group
#> 7 B 1 freq2 3 by_group
#> 8 C 1 freq2 4 by_group
#> 9 D 3 freq2 4 by_group
#> 10 E 4 freq2 4 by_group
#> 11 A 4 freq3 4 by_group
#> 12 B 1 freq3 4 by_group
#> 13 C 1 freq3 4 by_group
#> 14 D 3 freq3 1 by_group
#> 15 E 4 freq3 4 by_group
# for total freqs
data_long %>% filter(type == "total")
#> id num1 var value type
#> 1 A 4 totfreq 4 total
#> 2 B 1 totfreq 4 total
#> 3 C 1 totfreq 4 total
#> 4 D 3 totfreq 4 total
#> 5 E 4 totfreq 5 total
Created on 2018-05-17 by the reprex package (v0.2.0).
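For the plotting itself, something along these lines should work. This is a sketch rather than the only way to do it: the object names (by_group, totals, p_grouped, p_total, p_interactive) are my own, and the plotly step is guarded because it assumes the plotly package is installed. The grouped chart maps var onto fill and dodges the bars; the totals chart is a plain bar chart of totfreq by letter.

```r
library(tidyverse)

# Rebuild data_long as in the answer above so this block runs on its own
data <- t(data.frame(
  A = c(4, 4, 4, 4, 4),
  B = c(1, 2, 3, 4, 4),
  C = c(1, 2, 4, 4, 4),
  D = c(3, 2, 4, 1, 4),
  E = c(4, 4, 4, 4, 5)
))
colnames(data) <- c("num1", "freq1", "freq2", "freq3", "totfreq")

data_long <- as.data.frame(data) %>%
  rownames_to_column("id") %>%
  gather(key = var, value = value, freq1:totfreq) %>%
  mutate(type = ifelse(var == "totfreq", "total", "by_group"))

# Split by the type column
by_group <- filter(data_long, type == "by_group")
totals <- filter(data_long, type == "total")

# Grouped bars: one cluster per letter, one bar per freq column
p_grouped <- ggplot(by_group, aes(x = id, y = value, fill = var)) +
  geom_col(position = "dodge")

# Single bars: totfreq for each letter
p_total <- ggplot(totals, aes(x = id, y = value)) +
  geom_col()

# Optional: convert the ggplot to an interactive plotly chart
if (requireNamespace("plotly", quietly = TRUE)) {
  p_interactive <- plotly::ggplotly(p_grouped)
}
```

Print p_grouped or p_total to draw them. If you would rather build the chart natively in plotly, plot_ly() with a bar trace and layout(barmode = "group") is the usual route, but I haven't shown it here.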