I have a set of 2D vectors stored as an n*2 matrix.
I wish to get the 1st principal component, i.e. the vector that indicates the direction with the largest variance.
I have found rather detailed documentation on this from Rice University.
Based on this, I have imported the data and done the following:
import numpy as np
from matplotlib.mlab import PCA  # the PCA class used below
# aListOfLists holds the data points as a regular list-of-lists matrix (one [x, y] pair per row).
dataMatrix = np.array(aListOfLists)  # convert the list of lists into an n*2 numpy array
myPCA = PCA(dataMatrix)  # make a new PCA object from the numpy array
Then how may I get the 3D vector that is the 1st Principal Component?
PCA gives only 2D vectors from 2D data.
Look at the picture in the Wikipedia article on PCA: starting with a point cloud (dataMatrix) like that, and using matplotlib.mlab.PCA, myPCA.Wt[0] is the first PC, the long one in the picture.
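If matplotlib.mlab.PCA is not available (it was removed from newer matplotlib releases), the same direction can be obtained with plain NumPy. The following is only a minimal sketch, not what matplotlib does internally: the function name first_principal_component and the sample points are made up for illustration, and the data is only mean-centered here, not rescaled, so the numbers may differ from myPCA.Wt[0] when the two columns have very different spreads.

```python
import numpy as np

def first_principal_component(data):
    """Return a unit vector along the direction of largest variance.

    Hypothetical helper for illustration; data is an n*2 (or n*d) array-like.
    """
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)  # subtract the per-column mean
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

# Made-up sample points lying roughly along the line y = x.
points = [[0.0, 0.1], [1.0, 0.9], [2.0, 2.1], [3.0, 2.9]]
pc1 = first_principal_component(points)
print(pc1)  # a 2D unit vector; its overall sign is arbitrary
```

The result has as many components as the data has columns, so 2D input always gives a 2D vector. The sign of the vector is not determined: pc1 and -pc1 describe the same axis.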