Search code examples
pandasdataframenonetype

Pandas None values vs Empty dataframe


I have two queries with different results for no data in pandas dataframe.

First query:

with self.connection as conn:
            query = f"""SELECT current_client_name, client_phone_number, profile, effective_date, effective_time, 
                                          client_details.telegram_id, telegram_reference
                                          FROM `clients` JOIN `client_details` 
                                         ON clients.telegram_id=client_details.telegram_id 
                                         WHERE `status` = \'{status}\' order by effective_date"""
            df_result = pd.read_sql_query(query, conn)
            return df_result

In the result when no records found i see the empty dataframe:

Empty DataFrame
Columns: [current_client_name, client_phone_number, profile, effective_date, effective_time, telegram_id, telegram_reference]
Index: []

and can catch empty dataframe with next condition:

if df.empty is True:

Second query

with self.connection as conn:
            query = f"SELECT MAX(effective_date), effective_time FROM `client_details` WHERE `telegram_id` = {telegram_id} " \
                    f"and (`status` = \'WAIT\' or `status`= \'APPROVED\')"
            df_result = pd.read_sql_query(query, conn)
            return df_result

This query for some reasons return me not empty dataframe but dataframe with None in columns

df MAX(effective_date) effective_time
0                None           None

this result can not be handled with next code because for some reasons was returned is not empty dataframe!

if df.empty is True:

Does anybody know why its happening : why return for some cases empty dataframe and why return dataframe with None type in columns? How to avoid or handle this?

Thanks


Solution

  • In the first case, your query doesn't return any results, thus there are no entries in your df.

    In the second case, your query returns a result, but those fields are empty in your SQL database, thus your have a non-empty DataFrame, containing None values.

    Depending if you only want rows containing all non-null values (e.g. both columns in your example have to be non-null), or if some null values are ok (e.g. one of the columns is Null, while another is non-Null), you could drop rows containing these None values:

    To drop rows containing any None values:

    df = df.dropna(how = 'any')
    

    To drop rows only containing None values:

    df = df.dropna(how = 'all')
    

    Then the check if df.empty should come as True.