I am using the ESM-1b model to train it with some protein sequences. I already have the vectors and now I wanted to plot them using TSNE. However, when I try to pass the vectors to the TSNE model I get:
'list' object has no attribute 'shape'
How should I plot the Pytorch vectors (they are Pytorch tensors, actually)?
The code I have so far:
sequence_representations = []
for i, (_, seq) in enumerate(new_list):
    sequence_representations.append(token_representations[i, 1 : len(seq) + 1].mean(0))
This is an example of the Pytorch tensors I have (sequence_representations):
[tensor([-0.0054, 0.1090, -0.0046, ..., 0.0465, 0.0426, -0.0675]),
tensor([-0.0025, 0.0228, -0.0521, ..., -0.0611, 0.1010, -0.0103]),
tensor([ 0.1168, -0.0189, -0.0121, ..., -0.0388, 0.0586, -0.0285]),......
TSNE:
X_embedded = TSNE(n_components=2, learning_rate='auto', init='random').fit_transform(sequence_representations) #Where I get the error
Assuming you are using scikit-learn's TSNE (sklearn.manifold.TSNE), sequence_representations needs to be an ndarray of shape (n_samples, n_features). Right now you have a list of PyTorch tensors.
To convert sequence_representations to a NumPy ndarray, you'll need:
seq_np = torch.stack(sequence_representations) # from list of 1d tensors to a 2d tensor
seq_np = seq_np.numpy() # convert to numpy