I have the following dataframe:
id       sqft  years_left  date        price
0        1400  65          01-01-2021  xxx
1        1200  49          01-01-1950  xxx
...
950,000  1600  10          09-05-1990  xxx
I want to run a correlation between years_left and price.
But I want to bin the data by the years_left column into 100 bins, between 1 and 999 years, like so:
df1 = df[(df['years_left'] > 0) & (df['years_left'] < 10)]
df2 = df[(df['years_left'] > 11) & (df['years_left'] < 20)]
...
Right now I would have to create each dataframe manually, or use a for loop to build a list of dataframes.
Is there a shortcut to do what I want, i.e. bin the data into however many bins I choose and then find the correlation within each bin?
You can create a dict of dataframes like this:
import pandas as pd

# 100 bins of width 10 covering years_left values from 1 to 1000
df_dict = {f'df{index}': group for index, (_, group) in enumerate(
    df.groupby(pd.cut(df['years_left'], bins=range(0, 1001, 10))))}
Then you can access the individual dataframes as
df_dict['df1'], df_dict['df2'], and so on.
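
From there, the per-bin correlation you asked about can be computed by looping over the dict. This is a minimal sketch, assuming df_dict was built as above and that price is numeric:

# Correlation between years_left and price inside each bin;
# bins with fewer than two rows are skipped, since corr needs at least two points
correlations = {name: sub['years_left'].corr(sub['price'])
                for name, sub in df_dict.items() if len(sub) > 1}
print(correlations)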
NOTE: Instead of a dict comprehension, you can also use a list comprehension.
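
If the end goal is just the correlation per bin, you can also skip the intermediate dataframes entirely and let groupby compute it directly. This is only a sketch under the same assumptions (column names years_left and price, numeric price values):

import pandas as pd

bins = range(0, 1001, 10)  # 100 bins of width 10
# apply returns a Series indexed by the bin interval
per_bin_corr = (df.groupby(pd.cut(df['years_left'], bins=bins))
                  .apply(lambda g: g['years_left'].corr(g['price'])))
print(per_bin_corr)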