How to get the 1st Principal Component by PCA using Python?


I have a set of 2D vectors stored as an n*2 matrix.

I wish to get the 1st principal component, i.e. the vector that indicates the direction with the largest variance.

I have found fairly detailed documentation on this from Rice University.

Based on this, I have imported the data and done the following:

import numpy as np
from matplotlib.mlab import PCA   # the PCA class the answer below refers to

# aListOfLists holds the data points as a regular list of lists (n rows, 2 columns).
dataMatrix = np.array(aListOfLists)   # convert the list of lists into a numpy array
myPCA = PCA(dataMatrix)   # make a new PCA object from the numpy array

Then how may I get the 3D vector that is the 1st Principal Component?


Solution

  • PCA on 2D data yields only 2D vectors, so the first principal component here is a 2D vector, not a 3D one.

    Look at the picture in the Wikipedia article on PCA:
    starting with a point cloud (dataMatrix) like that one, and using matplotlib.mlab.PCA,
    myPCA.Wt[0] is the first PC, the long one in the picture.
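
Since matplotlib.mlab.PCA is not available in recent matplotlib releases, here is a minimal sketch of the same idea using only numpy. It is not the mlab implementation: it centers the data and takes the first right singular vector from an SVD, and it does not rescale each column to unit variance. The function name first_principal_component and the sample points are made up for illustration.

```python
import numpy as np

def first_principal_component(data):
    """Return a unit vector along the direction of largest variance."""
    X = np.asarray(data, dtype=float)
    # Center each column so the variance is measured about the mean.
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing
    # singular value, so vt[0] is the first principal component.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

# Hypothetical sample data: an n*2 point cloud lying roughly along y = x.
points = np.array([[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]])
pc1 = first_principal_component(points)
print(pc1)  # a 2D unit vector; its overall sign is arbitrary
```

The result has as many entries as the data has columns, so an n*2 input always gives a 2D vector. The sign of the vector is not determined by PCA, so pc1 and -pc1 describe the same direction.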