Tags: python, pandas, crosstab

How to write a crosstab query in Python?


I have a function in Python that uses crosstab on a dataframe:

def event_mouhafaza(self, df):
    df_event_mohafazat_crosstab = pd.crosstab(df['event_mohafaza'], df['event_type'])
    print(df_event_mohafazat_crosstab)

The above function works as it should and returns the expected result.

When I try to replace the column names in the crosstab call with variables, the program crashes:

def event_mouhafaza(self,df,items):
   
     for item in items:
         item1 = items[0]
         item2 = items[1]
        
     df = df.set_index(item2)    
     df_event_mohafazat_crosstab = pd.crosstab(df,item1,item2)
     print(df_event_mohafazat_crosstab)

It displays this error:

df_event_mohafazat_crosstab = pd.crosstab(df,item1,item2)
  File "F:\AIenv\lib\site-packages\pandas\core\reshape\pivot.py", line 577, in crosstab
    raise ValueError("values cannot be used without an aggfunc.")
ValueError: values cannot be used without an aggfunc.

Where is the error in the second function, and how do I fix it?


Solution

  • You're using the crosstab function incorrectly in the second example. pd.crosstab does not take a dataframe as its first argument: its first three positional parameters are index, columns, and values, so your call passes item2 as values. When you specify values, pandas also expects an aggfunc argument, which is why the ValueError is raised. See the documentation for more on that. Written with keyword arguments to highlight the issue, your current call is equivalent to this:

    # This will error out.
    pd.crosstab(index=df, columns=item1, values=item2)
    

    If item1 and item2 are the names of columns within your dataframe, you'll need to do this:

    pd.crosstab(index=df[item1], columns=df[item2])
    

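    Next, you don't want to set item2 as the index if you plan to use it in the crosstabulation: set_index moves that column out of the dataframe's columns, so df[item2] would then raise a KeyError. Your for loop also isn't doing anything, because it assigns the same two values on every iteration. You can assign item1 and item2 without it: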

    def event_mouhafaza(self, df, items):
        item1 = items[0]
        item2 = items[1]

        df_event_mohafazat_crosstab = pd.crosstab(index=df[item1], columns=df[item2])
        print(df_event_mohafazat_crosstab)
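
To try the fix end to end, here is a minimal standalone sketch. The sample rows, the casualties column, and the event_crosstab helper name are all made up for illustration; only the event_mohafaza and event_type column names come from the question. It also shows the other way to avoid the ValueError: if you really do want values, pass an aggfunc along with it.

```python
import pandas as pd

# Hypothetical sample data; only the first two column names come from the question.
df = pd.DataFrame({
    "event_mohafaza": ["Beirut", "Beirut", "Bekaa", "Bekaa", "Bekaa"],
    "event_type": ["fire", "flood", "fire", "fire", "flood"],
    "casualties": [1, 2, 3, 4, 5],  # hypothetical numeric column
})


def event_crosstab(df, items):
    """Count how often each pair of values occurs in the two named columns."""
    row_name, col_name = items  # unpack the two column names
    # Pass Series (df[name]), not the dataframe or the bare column names.
    return pd.crosstab(index=df[row_name], columns=df[col_name])


counts = event_crosstab(df, ["event_mohafaza", "event_type"])
print(counts)

# values is only accepted together with aggfunc: here the cells hold
# the sum of casualties instead of a row count.
totals = pd.crosstab(
    index=df["event_mohafaza"],
    columns=df["event_type"],
    values=df["casualties"],
    aggfunc="sum",
)
print(totals)
```

Returning the table instead of printing it inside the function is a design choice in this sketch, not something the question requires; it just makes the result easier to reuse or check.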