pandas, machine-learning, scikit-learn, cluster-analysis, k-means

ValueError: invalid literal for int() with base 10: '2020-12-22 00:00:00' date time to time stamp conversion


I am clustering data that has a date column and a value column. When I run k-means clustering, the date column cannot be fitted by the model and I get this error:

k_means = KMeans(n_clusters=2)
k_means.fit(df) 

Error:

ValueError  Traceback (most recent call last)

    <ipython-input-43-caf6280a5928> in <cell line: 2>()
      1 k_means = KMeans(n_clusters=2)
   --> 2 k_means.fit(df)

    -> 2070 return np.asarray(self._values, dtype=dtype)

  

    ValueError: could not convert string to float: '2020-12-22 00:00:00'

So I tried converting the date to a timestamp so that it could be fitted by the model, but the conversion from date to timestamp fails with the following error:


        df["stamp"] = df["Alert_Time"].values.astype(np.int64) // 10 ** 9


    ValueError      Traceback (most recent call last)
   

    <ipython-input-50-8c3cf615eeda> in <cell line: 1>()
    -> 1 df["stamp"]=df["Alert_Time"].values.astype(np.int64) // 10 ** 9

    ValueError: invalid literal for int() with base 10: '2020-12-22'

Solution

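  • Alert_Time holds strings, not datetimes, so astype(np.int64) tries to read the text itself as an integer and fails. Parse the column with pd.to_datetime first, then convert it to seconds since the epoch: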

    df['stamp'] = pd.to_datetime(df['Alert_Time']).sub(pd.Timestamp(0)).dt.total_seconds()
    print(df)
    
    # Output
                Alert_Time         stamp
    0  2020-12-22 00:00:00  1.608595e+09
    

    Minimal Reproducible Example:

    import pandas as pd
    df = pd.DataFrame({'Alert_Time': ['2020-12-22 00:00:00']})
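
To connect the conversion back to the original clustering problem, here is a minimal end-to-end sketch. The sample rows and the `Value` column name are assumptions made up for illustration (the question does not show the real data), and `n_init=10` and `random_state=0` are arbitrary choices. The idea is to fit KMeans only on numeric columns, leaving the original string column out.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical sample data: only "Alert_Time" comes from the question,
# the "Value" column and all rows are assumptions for illustration.
df = pd.DataFrame({
    "Alert_Time": ["2020-12-22 00:00:00", "2020-12-23 00:00:00",
                   "2021-06-01 00:00:00", "2021-06-02 00:00:00"],
    "Value": [1.0, 1.2, 9.8, 10.1],
})

# Parse the strings into datetimes, then take the offset from the
# Unix epoch (pd.Timestamp(0) is 1970-01-01) expressed in seconds.
parsed = pd.to_datetime(df["Alert_Time"])
df["stamp"] = parsed.sub(pd.Timestamp(0)).dt.total_seconds()

# Fit on numeric columns only; passing the whole frame would still
# include the string column and raise the original ValueError.
features = df[["stamp", "Value"]]
k_means = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = k_means.fit_predict(features)

df["cluster"] = labels
print(df)
```

Note that the timestamp is on the order of 1e9 seconds, so it will dominate the distance computation next to a small-valued column. Scaling the features first (for example with sklearn.preprocessing.StandardScaler) may be worth considering, depending on what the clusters are meant to capture.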