
How can I add the populations of males and females together to remove gender as a variable in a demographics table in RStudio?


This is my first time posting a question, so I may not have included the right information to start; apologies in advance. I am new to R and would prefer to use dplyr or the tidyverse, because those are the packages we've used so far. I did search for a similar question, but most gender/sex-related questions are about separating the data or performing operations on each group separately.

I have a table of population counts with the variables (factors) Age Range, Year and Sex, and Population as the dependent variable. I want to create a plot to show whether the population is aging, that is, how the relative proportion of the different age groups changes over time. Gender is not relevant, so I want to add together the population counts for males and females for each year and age range.

I don't know how to provide a copy of the raw data .csv file, so if you have any suggestions, please let me know.

This is a sample of the data (output table): [image: output table]

And here is the code so far:

library(dplyr)

file_name <- "AusPopDemographics.csv"
AusDemo_df <- read.table(file_name, sep = ",", header = TRUE)

(grp_AusDemo_df <- AusDemo_df %>% group_by(Year, Age))

I am guessing it may be something like pivot_wider() to bring male and female up as column headings, then transmute() to sum them and create a new population column.

Thanks for your help.


Solution

  • With dplyr, you could group by Year and Age and then sum Population within each group, which collapses the Sex column, something like this:

    library(dplyr)
    grp_AusDemo_df <- AusDemo_df %>% 
      group_by(Year, Age) %>%
      summarise(Population = sum(Population, na.rm = TRUE), .groups = "drop")
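
Since the stated goal is to see how the share of each age group changes over time, the summed table can be taken one step further. The following is only a minimal sketch: the toy data frame is made up to stand in for AusPopDemographics.csv (its values and age labels are assumptions, not the real data), and the names pop_by_age, pop_share and Proportion are hypothetical.

```r
library(dplyr)

# Hypothetical toy data standing in for AusPopDemographics.csv
AusDemo_df <- data.frame(
  Year = rep(c(2000, 2010), each = 4),
  Age = rep(rep(c("0-14", "65+"), each = 2), times = 2),
  Sex = rep(c("Male", "Female"), times = 4),
  Population = c(100, 90, 40, 60, 95, 85, 70, 90)
)

# Collapse Sex by summing Population within each Year/Age combination
pop_by_age <- AusDemo_df %>%
  group_by(Year, Age) %>%
  summarise(Population = sum(Population, na.rm = TRUE), .groups = "drop")

# Share of each age group within its own year
pop_share <- pop_by_age %>%
  group_by(Year) %>%
  mutate(Proportion = Population / sum(Population)) %>%
  ungroup()

print(pop_share)
```

The Proportion column sums to 1 within each year, so it can be plotted against Year with one line or bar segment per age group.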