python, outliers, dbscan

Use the DBSCAN algorithm on data


I'm trying to apply the DBSCAN algorithm to a small dataframe so that I can predict outliers afterwards. All the columns hold numeric values, but I keep getting the same error even though there are no null values.

This is my code to call the algorithm:

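    from sklearn.cluster import DBSCAN
    from PyNomaly import loop

    # Cluster the rows of dfc with DBSCAN
    db = DBSCAN(eps=0.09, min_samples=10).fit(dfc)
    # Outlier probabilities without cluster information
    m = loop.LocalOutlierProbability(dfc).fit()
    scores_noclust = m.local_outlier_probabilities
    # Outlier probabilities using the DBSCAN labels as clusters
    m_clust = loop.LocalOutlierProbability(dfc, cluster_labels=list(db.labels_)).fit()
    scores_clust = m_clust.local_outlier_probabilities
    print(list(scores_clust))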

I get this error:

ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

I don't understand why, since I have no null values.


Solution

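  • Based on your comments, it seems that one of your columns has dtype object and needs to be cast to integers. NumPy's isnan cannot operate on object arrays, which is why the error appears even though no values are null.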

    dfc['Idade'] = pd.to_numeric(dfc['Idade']).astype(int)
    

    Calling the cast on its own is not enough: it returns a new Series and does not modify the original column in place, so you have to assign the result back explicitly, as in the line above.
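
If it is not obvious which column is the problem, a quick check of the dtypes usually shows it. The following is only a minimal sketch under assumptions: the sample values and the second column `Renda` are made up for illustration, and only `Idade` comes from the question. It reproduces the isnan failure on an object column, shows that an unassigned cast changes nothing, and then converts every non-numeric column by assigning the result back.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'Idade' holds numbers stored as strings (dtype object).
dfc = pd.DataFrame({
    "Idade": pd.Series(["23", "35", "41"], dtype=object),
    "Renda": [1200.0, 3400.5, 2100.0],
})

# 1. List the columns whose dtype is not numeric.
non_numeric = dfc.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)

# 2. np.isnan raises TypeError on an object array, as in the question.
try:
    np.isnan(dfc["Idade"].to_numpy())
    isnan_failed = False
except TypeError:
    isnan_failed = True

# 3. A cast without assignment returns a new Series; dfc is unchanged.
pd.to_numeric(dfc["Idade"])
still_object = dfc["Idade"].dtype == object

# 4. Assign the converted Series back to fix the column for real.
for col in non_numeric:
    dfc[col] = pd.to_numeric(dfc[col])

# Every column is numeric now, so the NaN check runs without error.
all_numeric = dfc.select_dtypes(exclude="number").empty
has_nan = bool(np.isnan(dfc.to_numpy(dtype=float)).any())
```

If a column contains values that cannot be parsed as numbers, `pd.to_numeric` raises by default; passing `errors="coerce"` turns those values into NaN instead, which you would then need to handle before clustering.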