Categorize birth-death data in epochs

I have data regarding the years of birth and death of several people. I want to compute efficiently how many people are in each of a group of pre-defined epochs.

For example. If I have this list of data:

Paul 1920-1950
Sara 1930-1950
Mark 1960-2020
Lennard 1960-1970

and I define the epochs 1900-1980 and 1980-2023, I would want to compute the number of people alive in each period (not necessarily the whole range of the years). In this case, the result would be 4 people (Paul, Sara, Mark and Lennard) for the first epoch and 1 person (Mark) for the second epoch.

Is there any efficient routine out there? I would like to know, as the only way I can think of now is to create a huge loop with a lot of ifs to start categorizing.

I really appreciate any help you can provide.

Solution

So, you need to check that they have been born before the end of the period AND they have died after the start of the period.

One way could be:

# add columns for birth year and death year
df[['birth', 'death']] = df['birt/death'].str.split('-', expand=True)


# convert to numeric (https://stackoverflow.com/a/43266945/15032126)
cols = ['birth', 'death']
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce', axis=1)

index	name	birt/death	birth	death
0	Paul	1920-1950	1920	1950
1	Sara	1920-1950	1920	1950
2	Mark	1960-2020	1960	2020
3	Lennard	1960-1970	1960	1970

def counts_per_epoch(df, start, end):
    return len(df[(df['birth'] <= end) & (df['death'] >= start)])

print(counts_per_epoch(df, 1900, 1980))
print(counts_per_epoch(df, 1980, 2023))

# Prints
4
1