
Program Pandas Cross Tab in a Loop


I'm trying to display a series of crosstabs in pandas, in a Jupyter notebook, like so:

def crosstab_all(dataset,attributelist):
    for k in attributelist:
        print('k',k)
        pd.crosstab(dataset[k],dataset["successfulmatch"], normalize=True, margins=True, margins_name="Total")

attributelist=["has_closing_date","has_address",'has_price','has_listing_date','has_contract_dates','has_tsp','has_susan','has_sell_side','has_buy_side','has_both_sides','has_beth','has_agent','has_admin','has_closing_tsp','has_key_stages']

crosstab_all(dataset,attributelist)

I find that if I just do:

k="has_closing_date"
pd.crosstab(dataset[k],dataset["successfulmatch"], normalize=True, margins=True, margins_name="Total")

... it works. The issue seems to be running successive crosstab calls: for example, two crosstab calls in immediate succession will fail. I suspect the problem is not crosstab as such, but rather some extra step I need to take to spawn multiple Jupyter output windows.

Anyway, I'd appreciate any suggestions on how to make this work.


Solution

  • OK, I found something that works. It doesn't generate separate windows, and you lose some formatting, but I learned that crosstab returns a DataFrame, so you can just print it, like so:

    import pandas as pd

    def crosstab_all(dataset, attributelist):
        for k in attributelist:
            # crosstab returns a DataFrame; capture it so it can be printed
            xdf = pd.crosstab(dataset[k], dataset["successfulmatch"], normalize=True, margins=True, margins_name="Total")
            print('xdf', xdf)
            print('')  # for spacing
    
    attributelist=["has_closing_date","has_address",'has_price','has_listing_date','has_contract_dates','has_tsp','has_susan','has_sell_side','has_buy_side','has_both_sides','has_beth','has_agent','has_admin','has_closing_tsp','has_key_stages']
    
    crosstab_all(dataset,attributelist) # dataset is a dataframe
    

    This prints an unformatted crosstab on each pass of the loop.
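
    For what it's worth, the original loop showed nothing because a notebook only auto-renders the value of the last expression in a cell; a result computed inside a loop or function is simply discarded unless you output it yourself. If you want to keep the notebook's table formatting, one option is to pass each crosstab to `display` from `IPython.display` instead of `print`. Below is a minimal sketch, assuming a Jupyter/IPython environment. The `target` parameter, the returned dict, the fallback to `print`, and the toy data are my own additions for illustration and are not part of the original post.

```python
import pandas as pd

try:
    # In a Jupyter notebook, display() renders a DataFrame as an HTML table.
    from IPython.display import display
except ImportError:
    # Assumption: outside IPython, fall back to plain-text printing.
    display = print


def crosstab_all(dataset, attributelist, target="successfulmatch"):
    """Show one crosstab per attribute and return them keyed by attribute name."""
    tables = {}
    for k in attributelist:
        xdf = pd.crosstab(
            dataset[k],
            dataset[target],
            normalize=True,
            margins=True,
            margins_name="Total",
        )
        display(xdf)  # explicit output, so it works inside a loop
        tables[k] = xdf
    return tables


# Toy data, made up purely for illustration.
dataset = pd.DataFrame(
    {
        "has_price": [1, 0, 1, 1],
        "has_address": [0, 0, 1, 1],
        "successfulmatch": [1, 0, 1, 0],
    }
)

tables = crosstab_all(dataset, ["has_price", "has_address"])
```

    Returning the tables as well as displaying them is just a convenience: you can pull out a single one later with something like `tables["has_price"]` without re-running the loop.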