Python isnull() - using Lambda on a row


I am trying to add a column to a dataframe - ad_clicks.

ad_clicks already has a column, ad_click_timestamp, which holds a time value if the customer clicked on the website. If they did not, the value in the cell is nan.

If I do this -

ad_clicks['is_click'] = ~ad_clicks.ad_click_timestamp.isnull()

it works.

If I do this -

ad_clicks['is_click'] = ad_clicks.apply(lambda row : ~row.ad_click_timestamp.isnull(), axis = 1)

it throws me an error as follows -

AttributeError: ("'str' object has no attribute 'isnull'", u'occurred at index 0')

As I understand it, when apply is called on the whole dataframe with axis = 1, the input to the lambda function is the row, and naming the column inside the lambda selects that column's value for each row and performs the operation on it.

So why does the same expression work when applied to the column without the lambda, but not when applied to the cell inside the lambda?

What am I missing here? Thanks in advance for any explanation (or syntax correction that I may be missing).


Solution

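  • Your code doesn't work because inside an apply call with axis = 1, row.ad_click_timestamp is a single cell value (here a str, or a float nan for a missing entry), not a Series, so it has no isnull() method. np.isnan() would not help either, since it raises a TypeError on str values. Use the top-level pd.notnull() function instead (assuming pandas is imported as pd):

    ad_clicks['is_click'] = ad_clicks.apply(lambda row : pd.notnull(row.ad_click_timestamp), axis = 1)
    

    Or rely on the fact that nan is not equal to itself. Note that there is no ~ here: on a plain Python bool, ~ is a bitwise NOT (~True is -2), not a logical one.

    ad_clicks['is_click'] = ad_clicks.apply(lambda row : row.ad_click_timestamp == row.ad_click_timestamp, axis = 1)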
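
To see the difference between the column-level and the row-level check side by side, here is a minimal sketch. The sample data and the user_id column are made up for illustration; only the ad_click_timestamp column name comes from the question, and pandas is assumed to be imported as pd.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two clicks and one missing timestamp.
ad_clicks = pd.DataFrame({
    "user_id": ["a1", "b2", "c3"],
    "ad_click_timestamp": ["7:18", np.nan, "15:04"],
})

# Column-level: ad_clicks.ad_click_timestamp is a Series, so .isnull() exists
# and ~ inverts the boolean Series element-wise.
vectorized = ~ad_clicks.ad_click_timestamp.isnull()

# Row-level: each row.ad_click_timestamp is a scalar (a str or a float NaN),
# which has no .isnull() method, so use the top-level pd.notnull() function.
row_wise = ad_clicks.apply(
    lambda row: pd.notnull(row.ad_click_timestamp), axis=1
)

# Both approaches give the same boolean result; store it as the new column.
ad_clicks["is_click"] = row_wise
```

For a single column like this one, the vectorized form (or the equivalent ad_clicks.ad_click_timestamp.notnull()) is usually simpler and faster than going through apply row by row.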