Pandas ignore missing dates to find percentiles

I have a dataframe. I am trying to find percentiles of datetimes. I am using the function:

Dataframe:

student, attempts, time
student 1,14, 9/3/2019  12:32:32 AM
student 2,2, 9/3/2019  9:37:14 PM
student 3, 5
student 4, 16, 9/5/2019  8:58:14 PM

studentInfo2 = [14, 4, Timestamp('2019-09-04 00:26:36')]
data['time'] = pd.to_datetime(data['time_0001'], errors='coerce')
perc1_first = stats.percentileofscore(data['time'].notnull(), student2Info[2], 'rank')

where student2Info[2] holds the datetime for a particular student. When I try and do this I get the error:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Any ideas on how I can get the percentile to calculate correctly even when there are missing times in the columns?

Solution

You need to transform the Timestamps into units that percentileofscore can understand. Also, pd.DataFrame.notnull() returns a boolean list that you may use to filter your DataFrame, it does not return the filtered list, so I've updated that for you. Here is a working example:

import pandas as pd
import scipy.stats as stats

data = pd.DataFrame.from_dict({
    "student": [1, 2, 3, 4],
    "attempts": [14, 2, 5, 16],
    "time_0001": [
        "9/3/2019  12:32:32 AM",
        "9/3/2019  9:37:14 PM",
        "",
        "9/5/2019  8:58:14 PM"
    ]
})

student2Info = [14, 4, pd.Timestamp('2019-09-04 00:26:36')]
data['time'] = pd.to_datetime(data['time_0001'], errors='coerce')
perc1_first = stats.percentileofscore(data[data['time'].notnull()].time.transform(pd.Timestamp.toordinal), student2Info[2].toordinal(), 'rank')
print(perc1_first)  #-> 66.66666666666667