I am trying to add a column to a dataframe - ad_clicks.
ad_clicks already has a column, ad_click_timestamp, which holds a time value if the customer clicked on the website. If they did not, the value in the cell is NaN.
If I do this -
ad_clicks['is_click'] = ~ad_clicks.ad_click_timestamp.isnull()
it works.
If I do this -
ad_clicks['is_click'] = ad_clicks.apply(lambda row : ~row.ad_click_timestamp.isnull(), axis = 1)
it throws me an error as follows -
AttributeError: ("'str' object has no attribute 'isnull'", u'occurred at index 0')
As I understand it, when apply is called with axis = 1 and no column selected, the input to the lambda function is the row, and row.ad_click_timestamp then picks out that column's value for each row so the operation can be performed on it.
So why does the same expression work without the lambda, when applied to the column, but not with the lambda, when applied to the cell?
What am I missing here? Thanks in advance for any explanation (or syntax correction that I may be missing).
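Your code doesn't work because inside an apply call with axis = 1, row.ad_click_timestamp is a single value (a str holding the timestamp, or np.nan), not a Series, so it has no isnull method. You have to use a check that works on a scalar instead: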
ad_clicks['is_click'] = ad_clicks.apply(lambda row: pd.notnull(row.ad_click_timestamp), axis=1)
Or, relying on the fact that NaN is never equal to itself:
ad_clicks['is_click'] = ad_clicks.apply(lambda row: row.ad_click_timestamp == row.ad_click_timestamp, axis=1)
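To make the difference concrete, here is a minimal sketch on a small made-up frame. The user_id column and the timestamp strings are assumptions for illustration only, not your real data. It shows that the value seen inside the lambda is a plain str (or NaN), and that a scalar-safe check such as pd.notnull gives the same result as the vectorized expression from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
ad_clicks = pd.DataFrame({
    "user_id": ["a", "b", "c"],
    "ad_click_timestamp": ["7:18", np.nan, "15:04"],
})

# With axis=1 the lambda receives one row as a Series, so attribute access
# on it yields a single cell value rather than a column.
first_cell = ad_clicks.iloc[0].ad_click_timestamp
cell_type = type(first_cell).__name__  # a plain Python str, no .isnull method

# Vectorized version from the question: operates on the whole column.
vectorized = ~ad_clicks.ad_click_timestamp.isnull()

# Row-wise version: pd.notnull accepts scalars of any type (str or NaN).
row_wise = ad_clicks.apply(
    lambda row: pd.notnull(row.ad_click_timestamp), axis=1
)

# Both approaches should flag the same rows as clicks.
same = vectorized.tolist() == row_wise.tolist()
```

Two side notes. np.isnan only accepts numeric input, so it raises a TypeError on a str cell. And ~ is a bitwise operator: it inverts a boolean Series element-wise, but on a plain Python bool it returns an integer (~True is -2), so pd.notnull or not pd.isnull(...) is the safer choice for scalars. If the row-wise form is not needed for other reasons, the vectorized version is also the faster one.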