in the screenshot below you'll find a dataframe that contains string values in each cell. What i would like to do is to create a new dataframe out of this one that contains 3 columns: 'Very interested' 'Somewhat interested', and 'Not interested'. I don't know how to transform the original df into this new one, i tried just counting the values that meets a condition like 'Very interested' and putting them into a new df but the numbers don't seem right.
i would appreciate any help here. Thank you.
EDIT: here is also the code to reproduce a dataframe similar to the one in the screenshot:
df = pd.DataFrame({1: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 2: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 3: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 4: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 5: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 6: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested']},
index=['Big Data','Data Analysis','Data Journalism', 'Data Visualization', 'Deep Learning', 'Machine Learning'])
As per the desired output, it should be something like this:
I think need reshape by melt
and then get counts by GroupBy.size
with Series.unstack
:
df = (df.rename_axis('val')
.reset_index()
.melt('val', var_name='a', value_name='b')
.groupby(['val','b'])
.size()
.unstack(fill_value=0))
Another solution withstack
, counts by SeriesGroupBy.value_counts
with Series.unstack
:
df = (df.stack()
.groupby(level=0)
.value_counts()
.unstack(fill_value=0))