Tags: python, machine-learning, scikit-learn, pca

Why am I getting an error when running KernelPCA from sklearn's decomposition module?


I was trying out Kernel PCA with the sklearn library on the heart disease dataset from Kaggle (https://www.kaggle.com/ronitf/heart-disease-uci). I created a list "p" containing all the kernel types and passed each entry to the kernel parameter of KernelPCA().

When I execute the code below, I get the error message attached after it.

The plots come out completely fine, but I still get the error.

Why does this happen? Could anyone please help?

from sklearn import decomposition
from sklearn.preprocessing import StandardScaler
from scipy import sparse
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv('heart.csv')
target = df['target']
df.head()
Scaler = StandardScaler()
# X represents Standardized data of df
X = Scaler.fit_transform(df)
X.shape
n=2
p = ['linear','poly','rbf','sigmoid','cosine','precomputed']
for i in p:
    trans = decomposition.KernelPCA(n_components=n,kernel=i)
    Xli = trans.fit_transform(X)
    y = pd.DataFrame(Xli,columns=('PC1','PC2'))
    y['Target'] = target

This was the error raised when the above code was executed (originally attached as a screenshot).


Solution

  • It fails on the last kernel in your list, 'precomputed'. The other five kernels work fine:

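    import numpy as np
    import pandas as pd
    from sklearn import decomposition

    n = 2  # number of components, as in the question
    np.random.seed(111)
    X = np.random.uniform(0,1,(10,4))
    target = np.random.normal(0,1,10)
    
    # every kernel except 'precomputed' accepts the raw data matrix
    p = ['linear','poly','rbf','sigmoid','cosine']
    for i in p:
        trans = decomposition.KernelPCA(n_components=n,kernel=i)
        Xli = trans.fit_transform(X)
        y = pd.DataFrame(Xli,columns=('PC1','PC2'))
        y['Target'] = target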
    
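The failure with the sixth kernel can be reproduced in isolation. The following is a minimal sketch, assuming random data with the same shape as the standardized heart dataset (303 rows, 14 columns is an assumption here; `X_demo` and `raised` are illustrative names, and the exact error message may differ between scikit-learn versions):

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Stand-in for the standardized data (assumed shape: 303 samples, 14 features).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(303, 14))

raised = False
try:
    # With kernel="precomputed" the input is treated as an already-computed
    # kernel matrix, so a non-square (n_samples, n_features) array is rejected.
    KernelPCA(n_components=2, kernel="precomputed").fit_transform(X_demo)
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

The other kernels never hit this check because they build the kernel matrix from the raw features themselves.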

    If you specify kernel='precomputed', you need to pass the Gram matrix (a square n_samples x n_samples matrix of pairwise kernel values) instead of the raw data. For example, if we precompute the Gram matrix with a linear kernel:

    def linear_kernel(X, Y):
        return X.dot(Y.T)
    
    gram = linear_kernel(X, X)
    trans = decomposition.KernelPCA(n_components=n,kernel="precomputed")
    trans.fit_transform(gram)
    
    array([[ 0.34115243,  0.08282281],
           [ 0.34927523, -0.51709   ],
           [-0.48173365, -0.05455087],
           [-0.34252946, -0.21207875],
           [ 0.66528647, -0.12052876],
           [ 0.04018184,  0.71760041],
           [-0.35535148, -0.2107046 ],
           [ 0.04163704,  0.16239367],
           [-0.48902704,  0.01668406],
           [ 0.23110862,  0.13545204]])
    

    Compare with the built-in linear kernel applied to the raw data, which gives the same result:

    trans = decomposition.KernelPCA(n_components=n,kernel="linear")
    trans.fit_transform(X)
    
    array([[ 0.34115243,  0.08282281],
           [ 0.34927523, -0.51709   ],
           [-0.48173365, -0.05455087],
           [-0.34252946, -0.21207875],
           [ 0.66528647, -0.12052876],
           [ 0.04018184,  0.71760041],
           [-0.35535148, -0.2107046 ],
           [ 0.04163704,  0.16239367],
           [-0.48902704,  0.01668406],
           [ 0.23110862,  0.13545204]])
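
Putting the pieces together, the original loop can keep all six kernel names if the 'precomputed' case is handed a Gram matrix. This is only a sketch under assumptions: it uses the same synthetic data as above rather than heart.csv, it picks a linear Gram matrix for the precomputed case (any valid kernel matrix would do), and `X_demo`, `results` and `same` are illustrative names.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.metrics.pairwise import pairwise_kernels

# Same synthetic data as in the answer above (not the heart dataset).
np.random.seed(111)
X_demo = np.random.uniform(0, 1, (10, 4))

kernels = ["linear", "poly", "rbf", "sigmoid", "cosine", "precomputed"]
results = {}
for name in kernels:
    if name == "precomputed":
        # Assumption: a linear Gram matrix is the one we want to precompute.
        data = pairwise_kernels(X_demo, metric="linear")  # shape (10, 10)
    else:
        data = X_demo  # raw features, shape (10, 4)
    results[name] = KernelPCA(n_components=2, kernel=name).fit_transform(data)

# The precomputed linear Gram matrix should reproduce the built-in linear
# kernel; absolute values are compared to ignore any sign flip of components.
same = np.allclose(np.abs(results["precomputed"]), np.abs(results["linear"]))
print(same)
```

Note that with a precomputed kernel, transforming new samples later requires the kernel values between those samples and the training samples, not the raw features.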