I would like to plot points with 100 parameters each, with values between 0 and 99, on a two-dimensional plot. This should be straightforward with the usual methods of dimensionality reduction (PCA/tSNE/UMAP etc.), but I need to be able to add subsequent points to the plot without it needing to recalculate and therefore change the positions of the existing points.
I am picturing an algorithm that takes a data point with its 100 values and converts it to X,Y coordinates that can then be plotted. Points proximal in the 2D projection should be proximal in the original 100D space. Does such an algorithm exist? If not, are there any alternative approaches?
Thanks
I am not sure I understood the question correctly, but with an initial set X, we can fit a PCA to compute the principal components. Then, we can use these principal components to transform new samples.
from sklearn.decomposition import PCA
import numpy as np
import matplotlib.pyplot as plt
n_samples, n_feats = 50, 100
# The upper bound of randint is exclusive, so 100 gives values in 0-99
X = np.random.randint(0, 100, size=(n_samples, n_feats))
pca = PCA(n_components=2).fit(X)
X_reduced = pca.transform(X)
# Plot the 2D projection, not the first two raw features
plt.scatter(X_reduced[:, 0], X_reduced[:, 1])
This plots the initial set in the two-dimensional PCA space.
Then, when a new sample comes in
new_sample = np.random.randint(0, 100, size=(1, 100))
new_sample_reduced = pca.transform(new_sample)
plt.scatter(new_sample_reduced[:, 0], new_sample_reduced[:, 1], color="red")
We can plot it on the same axes. The existing points do not move, because the PCA is not refitted.
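To make the "no recalculation" point concrete, here is a minimal sketch (not part of the original answer) that checks two things: that `pca.transform` on new points matches a fixed linear map built from the fitted `pca.mean_` and `pca.components_`, and that the coordinates of the initial points are unchanged afterwards. The variable names, the seed, and the random stand-in data are assumptions for illustration only, and the manual formula assumes the default `whiten=False`.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is reproducible

# Stand-in data (assumption): 50 initial points, 100 integer features in 0..99
X_initial = rng.integers(0, 100, size=(50, 100))
pca = PCA(n_components=2).fit(X_initial)
coords_before = pca.transform(X_initial)

# New points arrive later and are projected with the already-fitted model
X_new = rng.integers(0, 100, size=(5, 100))
coords_new = pca.transform(X_new)

# With whiten=False, transform is a fixed linear map:
# centre with the stored mean, then project onto the stored components
manual_new = (X_new - pca.mean_) @ pca.components_.T
projection_matches = np.allclose(coords_new, manual_new)

# The initial points keep the same coordinates, since nothing was refitted
initial_unchanged = np.allclose(coords_before, pca.transform(X_initial))

print(projection_matches, initial_unchanged)
```

In other words, once the PCA is fitted, the stored mean and components are all that is needed to place any future point. Note that PCA keeps only the two directions of greatest variance in the initial set, so points that are close in the 2D plot are not guaranteed to be close in the original 100D space.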