I'm using the sklearn PCA technique. I need to solve:
pca1 = beta1*c1 + beta2*c2 + beta3*c3 + beta4*c4 + beta5*c5
I read in the documentation that the components are sorted by explained_variance_. How do I find the beta values?
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

d = {'c1': [3, 7, 1, 4], 'c2': [8, 2, 9, 5], 'c3': [0, 7, 9, 2], 'c4': [3, 5, 9, 1], 'c5': [4, 6, 8, 3]}
data = pd.DataFrame(data=d)
print("data:\n", data, "\n")
# Standardize each column to zero mean and unit variance before PCA
x = StandardScaler().fit_transform(data)
pca = PCA(n_components=1)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(data=principalComponents, columns=['principal component 1'])
print("\ncomponents: \n", pca.components_, "\n")
print("\nexplained_variance_\n", pca.explained_variance_, "\n")
Result:
data:
+--+----+----+----+-----+----+
| | c1 | c2 | c3 | c4 | c5 |
|0 | 3 | 8 | 0 | 3 | 4 |
|1 | 7 | 2 | 7 | 5 | 6 |
|2 | 1 | 9 | 9 | 9 | 8 |
|3 | 4 | 5 | 2 | 1 | 3 |
+--+----+----+----+-----+----+
components:
[[-0.32703417 0.29320425 0.45731291 0.55565347 0.53776765]]
explained_variance_:
[ 3.10207373]
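The beta values are the entries of pca.components_! Each row of pca.components_ is one principal component, and its entries follow the column order of data (c1 to c5). Note that these weights apply to the standardized values in x, not to the raw columns.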
beta1 = -0.32703417
beta2 = 0.29320425
beta3 = 0.45731291
beta4 = 0.55565347
beta5 = 0.53776765
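
One way to check this is to rebuild the first principal component by hand as a weighted sum of the standardized columns and compare it with what fit_transform returns. The following is a minimal sketch that assumes the same data and scaling as in the question; the names scores, betas, manual and matches are only illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

d = {'c1': [3, 7, 1, 4], 'c2': [8, 2, 9, 5], 'c3': [0, 7, 9, 2],
     'c4': [3, 5, 9, 1], 'c5': [4, 6, 8, 3]}
data = pd.DataFrame(data=d)

# Same preprocessing as in the question: standardize, then fit a 1-component PCA
x = StandardScaler().fit_transform(data)
pca = PCA(n_components=1)
scores = pca.fit_transform(x)

# First (and only) row of components_ holds one weight per input column
betas = pca.components_[0]

# PCA centers its input with pca.mean_ before projecting; x is already
# centered by StandardScaler, so the subtraction is effectively a no-op here
manual = (x - pca.mean_) @ betas

# Pair each weight with its column name, then compare with sklearn's scores
print(dict(zip(data.columns, betas)))
matches = np.allclose(manual, scores[:, 0])
print("manual projection matches fit_transform:", matches)
```

The sign of a principal component is arbitrary, so all the betas may come out multiplied by -1 on a different setup; the comparison still holds because the scores flip sign together with them.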