Tags: matrix, clojure, pca, incanter, principal-components

Using Incanter's principal-components function for PCA


I have been trying to use Incanter's principal-components function to do PCA, but I seem to be using it incorrectly. I found some sample data in an online PCA tutorial and wanted to practice on it:

(def data [[0.69 0.49] [-1.31 -1.21] [0.39 0.99] [0.09 0.29] [1.29 1.09]
           [0.49 0.79] [0.19 -0.31] [-0.81 -0.81]
           [-0.31 -0.31] [-0.71 -1.01]])

On my first attempt at PCA I tried passing the vectors to Incanter's matrix function, but ended up passing it too many arguments. I then tried the nested vector structure defined above, but I would like to avoid this route.

How would I turn data into an Incanter matrix that will be accepted as input to Incanter's principal-components function? For simplicity, let's call the new matrix fooMatrix.

Once this matrix, fooMatrix, has been constructed, the following code should work to extract the first two principal components:

     (def pca (principal-components fooMatrix))
     (def components (:rotation pca))
     (def pc1 (sel components :cols 0))
     (def pc2 (sel components :cols 1)) 

and then the data can be projected on the principal components by

     (def principal1 (mmult fooMatrix pc1)) 
     (def principal2 (mmult fooMatrix pc2))
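For readers unsure what the two mmult calls compute: multiplying an n x 2 matrix by a 2 x 1 column yields one dot product per row, i.e. one score per observation. Below is a minimal plain-Clojure sketch of that arithmetic, with no Incanter required. The names rows, direction, dot and scores are hypothetical, and direction is an assumed unit vector chosen for illustration, not a component computed from the data.

```clojure
;; Hypothetical stand-ins: three sample rows and an assumed unit-length direction.
(def rows [[0.69 0.49] [-1.31 -1.21] [0.39 0.99]])
(def direction [(/ 1 (Math/sqrt 2)) (/ 1 (Math/sqrt 2))])

(defn dot
  "Sum of the element-wise products of two equal-length sequences."
  [u v]
  (reduce + (map * u v)))

;; One score per row: the signed length of that row's projection onto `direction`.
(def scores (mapv #(dot % direction) rows))

(println scores)
```

With a real component column in place of direction, each entry of scores would correspond to one row of the matrix product.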

Solution

  • Check out the Incanter API. I believe you just want (incanter.core/matrix data), which takes a nested vector like yours directly. Below are the ways to call Incanter's matrix function; A2 may be the one you're interested in.

    (def A (matrix [[1 2 3] [4 5 6] [7 8 9]])) ; produces a 3x3 matrix
    (def A2 (matrix [1 2 3 4 5 6 7 8 9] 3)) ; produces the same 3x3 matrix
    (def B (matrix [1 2 3 4 5 6 7 8 9])) ; produces a 9x1 column vector
    

    Example using your data:

    user=> (use '[incanter core stats charts datasets])
    nil
    user=> (def data [0.69 0.49 -1.31 -1.21 0.39 0.99 0.09 0.29 1.29
                      1.09 0.49 0.79 0.19 -0.31 -0.81 -0.81
                      -0.31 -0.31 -0.71 -1.01])
    user=> (def fooMatrix (matrix data 2))
    user=> (principal-components fooMatrix)
    {:std-dev (1.3877785387777999 0.27215937850413047), :rotation  A 2x2 matrix
     -------------
    -7.07e-01 -7.07e-01 
    -7.07e-01  7.07e-01 
    }
    

    Voilà. Nested vector structure gone.
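If the data already exists in the nested form from the question, it can be flattened programmatically instead of being retyped. The following is a minimal plain-Clojure sketch of that conversion. The names nested-data, flat-data and regrouped are hypothetical, and the Incanter calls appear only in comments, on the assumption that matrix fills rows in order as the A2 example suggests.

```clojure
;; Nested row vectors, as in the question.
(def nested-data [[0.69 0.49] [-1.31 -1.21] [0.39 0.99] [0.09 0.29] [1.29 1.09]
                  [0.49 0.79] [0.19 -0.31] [-0.81 -0.81]
                  [-0.31 -0.31] [-0.71 -1.01]])

;; Concatenate the rows into one flat vector, row by row.
(def flat-data (vec (apply concat nested-data)))

;; Regroup into rows of two to confirm nothing was reordered.
(def regrouped (mapv vec (partition 2 flat-data)))

(println (count flat-data))          ; 20
(println (= regrouped nested-data))  ; true

;; With Incanter loaded, either form should then build the same 10x2 matrix:
;;   (matrix nested-data)
;;   (matrix flat-data 2)
```

Which form to use is mostly a matter of taste; the nested form keeps each observation visibly grouped, while the flat form needs the column count to be supplied separately.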