python pca covariance eigenvalue

How to calculate the covariance matrix of a data frame


I have read a data frame of sensor data using the pandas read_fwf function. I need to find the covariance matrix of the resulting 928991 x 8 matrix. Eventually, I want to find the eigenvectors and eigenvalues of this covariance matrix, as in principal component analysis.


Solution

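  • Read the file with pandas, drop the columns that are not sensor readings, call DataFrame.cov() on what remains, and pass the resulting matrix to NumPy's eigendecomposition: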

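    import pandas as pd
    from numpy.linalg import eigh
    
    # The file is whitespace-delimited, so split on runs of whitespace.
    df_sensor_data = pd.read_csv('HT_Sensor_dataset.dat', sep=r'\s+')
    
    # Drop the non-sensor columns so that only the 8 columns to analyze remain.
    df_sensor_data = df_sensor_data.drop(columns=['id', 'time', 'Temp.', 'Humidity'])
    df_sensor_data = df_sensor_data.astype('float64')
    
    # 8 x 8 sample covariance matrix; DataFrame.cov() excludes missing values.
    covariance_matrix = df_sensor_data.cov()
    print(covariance_matrix)
    
    # A covariance matrix is symmetric, so eigh applies: it returns real
    # eigenvalues in ascending order, with the eigenvectors as columns.
    values, vectors = eigh(covariance_matrix.to_numpy())
    print(values)
    print(vectors)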
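
If the goal is PCA, the eigenpairs usually need to be ordered by decreasing eigenvalue before they are used. Below is a minimal sketch of that step on synthetic data, so it runs without the sensor file. The column names R1..R8, the random data, and the variable names are assumptions made for illustration, not part of the original dataset or answer.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the sensor table (hypothetical column names R1..R8).
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 8))
data[:, 1] = 2.0 * data[:, 0] + 0.1 * data[:, 1]  # make two columns correlated
df = pd.DataFrame(data, columns=[f"R{i}" for i in range(1, 9)])

# Sample covariance matrix (8 x 8); pandas normalizes by N - 1.
cov = df.cov()

# eigh is intended for symmetric matrices and returns eigenvalues ascending.
eigenvalues, eigenvectors = np.linalg.eigh(cov.to_numpy())

# Reorder so that the component with the largest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Fraction of the total variance carried by each principal component.
explained_ratio = eigenvalues / eigenvalues.sum()

# Project the mean-centered data onto the principal axes.
scores = (df - df.mean()).to_numpy() @ eigenvectors
```

Here each column of `eigenvectors` is one principal axis, and `scores` holds the data expressed in those axes. Keeping only the first few columns of `scores` gives the reduced representation.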