r data-visualization panel-data

RStudio - How to group panel data by year and sum it


I have a panel dataset for 20 years and 10 companies.

For every company I have data on sales in Europe and the US.

I would like to plot the overall sales in Europe and the US for every year.

Basically, I need to sum the figures across all companies for each year, separately for each variable.

How should I do that?

Thanks everybody! I solved the problem using group_by.

    USsales <- data %>%
      group_by(Year) %>%
      summarize(tot_USsales = sum(USsales, na.rm = TRUE))

    Europesales <- data %>%
      group_by(Year) %>%
      summarize(tot_Eursales = sum(Eursales, na.rm = TRUE))

    netsales <- merge(Europesales, USsales, by = "Year")

Then I just plot it with ggplot.
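
For anyone who wants the plotting step spelled out, here is a minimal sketch. The toy data frame and its Company column are made up for illustration; only Year, USsales and Eursales come from my real data. It also computes both totals in a single summarize() call, which is an alternative to the merge above, and reshapes with tidyr::pivot_longer() so that ggplot can draw one line per region.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy panel: 3 companies x 4 years (the real data has 10 x 20).
data <- expand.grid(Company = c("A", "B", "C"), Year = 2001:2004)
data$USsales  <- seq_len(nrow(data)) * 10
data$Eursales <- seq_len(nrow(data)) * 5

# Both yearly totals in one summarize() call, so no merge() is needed.
netsales <- data %>%
  group_by(Year) %>%
  summarize(tot_USsales  = sum(USsales,  na.rm = TRUE),
            tot_Eursales = sum(Eursales, na.rm = TRUE))

# Reshape to long format so the region can be mapped to colour.
netsales_long <- netsales %>%
  pivot_longer(c(tot_USsales, tot_Eursales),
               names_to = "Region", values_to = "Sales")

# One line per region, total sales on the y axis.
p <- ggplot(netsales_long, aes(x = Year, y = Sales, colour = Region)) +
  geom_line() +
  labs(y = "Total sales")
print(p)
```

With the real data, drop the toy construction and start from the group_by step.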

Thank you guys


Solution

  • If you use R, you could do something like this:

    library(dplyr)  # also provides the %>% pipe
    
    # Total European sales per year, summed across all companies
    OverallSalesEurope <- Dataset %>% 
      filter(Region == "Europe") %>% 
      group_by(Year) %>% 
      summarize(OverallSales = sum(Sales, na.rm = TRUE)) 
    
    # Total US sales per year, summed across all companies
    OverallSalesUS <- Dataset %>% 
      filter(Region == "US") %>% 
      group_by(Year) %>% 
      summarize(OverallSales = sum(Sales, na.rm = TRUE)) 
    

    Your variable names will differ (this assumes a long-format data set with Region, Sales, and Year columns), but the code illustrates the principle.
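
If the data really is in long format with a Region column, the two filtered pipelines could probably be collapsed into one by grouping on Region and Year together. This is only a sketch on a made-up toy data set: the Dataset values below are hypothetical, and the `.groups = "drop"` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical long-format panel: one row per company, year and region.
Dataset <- expand.grid(Company = c("A", "B"),
                       Year    = 2001:2002,
                       Region  = c("Europe", "US"))
Dataset$Sales <- c(1, 2, 3, 4, 10, 20, 30, 40)

# One row per region and year, summed across all companies.
OverallSales <- Dataset %>%
  group_by(Region, Year) %>%
  summarize(OverallSales = sum(Sales, na.rm = TRUE), .groups = "drop")

print(OverallSales)
```

The result is already in the shape ggplot expects, with Region available to map to colour.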