python pandas dataframe pandas-groupby analysis

Filtering data using the Python pandas library

I have created a column in my DataFrame that holds True or False values. Now I want to analyze the data using those values (that is, I only care about the True rows). I'm trying to write code that does this: if the value in that column is True for a row, get the data from another column of that row. More precisely, I have been using .groupby().count() on the DataFrame and would like to keep using it if possible, but I only want to count the rows where the value is True. I'd appreciate any help! :)

Edit: The comments were helpful, but they didn't answer my question (sorry for the lack of an example earlier). [Image: data example]

For example, let's assume this is my table. I'd like to count a person only if Single == True. How would I change the .groupby().count() call to do so?

Solution

As the comments say, you should add some simple sample data and state what you expect the output to look like. Since the question doesn't include any data, I made some up.

Here are a couple of ways to look at how many people own cats in these cities. You can see how easy it is to make up data for a question. The groupby applied here groups by city and counts the True and False values separately.

import pandas as pd

# Make up data
colA = [1, 2, 3, 4]
colB = ['yes', 'no', 'yes', 'yes']
colC = ['Paris', 'London', 'London', 'Atlanta']
df = pd.DataFrame(list(zip(colA, colB, colC)),
                  columns=['person_id', 'has_cat', 'city'])
# Map 'yes'/'no' to real booleans so the column has bool dtype
df['myboolean'] = df['has_cat'].map({'yes': True, 'no': False}).astype('bool')
print(df)  # in a Jupyter notebook, display(df) also works

df.groupby('city')['myboolean'].value_counts()

Another way to do it is below. If the column has boolean dtype, True/False is treated as 1/0, so you can use sum. The groupby applied here groups by city and counts only the True values.

df.groupby('city')['myboolean'].sum().astype(int)
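
To see what the sum is doing, here is a small sketch that puts it next to a plain count on the same made-up data. The names true_counts and row_counts are only for illustration, and the frame is rebuilt so the snippet runs on its own.

```python
import pandas as pd

# Same made-up data as above, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    'person_id': [1, 2, 3, 4],
    'has_cat': ['yes', 'no', 'yes', 'yes'],
    'city': ['Paris', 'London', 'London', 'Atlanta'],
})
df['myboolean'] = df['has_cat'].eq('yes')  # True where has_cat is 'yes'

# sum() adds up True as 1 and False as 0, so it counts only the True rows.
true_counts = df.groupby('city')['myboolean'].sum()

# count() counts every non-null row in the group, True or False.
row_counts = df.groupby('city')['myboolean'].count()

print(true_counts)
print(row_counts)
```

For London the two results differ: count() sees both rows, while sum() only picks up the one that is True.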

And if you want to extract the rows that are True into a new dataframe:

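# A boolean column can be used directly as a mask; no need for == True
mysubset = df.loc[df['myboolean']]
print(mysubset)  # in a Jupyter notebook, display(mysubset) also works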
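
If you want to keep using .groupby().count() as in the question, one option is to filter first and then group. This is a minimal sketch on the same made-up data; counting person_id is my assumption about which column you want counted.

```python
import pandas as pd

# Same made-up data as above, with the boolean column written out directly.
df = pd.DataFrame({
    'person_id': [1, 2, 3, 4],
    'city': ['Paris', 'London', 'London', 'Atlanta'],
    'myboolean': [True, False, True, True],
})

# Keep only the True rows, then group and count as usual.
counts = df[df['myboolean']].groupby('city')['person_id'].count()
print(counts)
```

One difference from the sum approach: a city with no True rows at all drops out of this result, whereas sum would list it with 0.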