python pandas data-analysis

Roll up categorical values into second variable Python


I have data from a source with pre-coded categorical variables. Unfortunately, these are not the categories I need for my analysis, so I need to roll them up into a second column:

age_group  lifestage
18-24      young adult
25-34      adult
35-44      adult
45-54      adult
.          .
.          .
.          .

I am currently looping through lists to do this:

ya_list = ['18-24']
adult_list = ['25-34', '35-44', '45-54']

for age in age_group:
    if age in ya_list:
        lifestage = 'young adult' 
    elif age in adult_list:
        lifestage = 'adult'

This works OK for this example with only a few groups, but when a variable has 10 or more categories to recode, it becomes much more unwieldy. I can't help but think there has to be a better way to do this, but I haven't been able to find one.


Solution

  • You want a dictionary:

    stages = {'18-24': 'young adult',
              '25-34': 'adult', ...}
    
    for age in age_group:
        lifestage = stages[age]
    

    This is the canonical replacement for a long chain of elifs in Python.
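
Since the question is tagged pandas, here is a minimal sketch of the same dictionary idea applied to a DataFrame column. It assumes the data lives in a DataFrame called df with an age_group column; those names, the sample rows, and the groups dict are illustrative assumptions, not taken from the original data. The lookup itself uses Series.map, which accepts a dict.

```python
import pandas as pd

# Hypothetical grouping: each target label lists the source codes that roll up into it.
# Writing it this way keeps long recodes readable when there are many categories.
groups = {
    "young adult": ["18-24"],
    "adult": ["25-34", "35-44", "45-54"],
}

# Invert the grouping into a flat lookup: source code -> target label.
stages = {age: label for label, ages in groups.items() for age in ages}

# Sample data for illustration only; column names follow the question.
df = pd.DataFrame({"age_group": ["18-24", "35-44", "25-34", "45-54"]})

# Series.map looks each value up in the dict; values absent from the dict become NaN.
df["lifestage"] = df["age_group"].map(stages)
```

Any age_group value that is missing from stages comes out as NaN, so unmapped codes can be spotted with df["lifestage"].isna().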