Tags: python, parallel-processing, multiprocessing, word2vec, dimensionality-reduction

Parallel version of t-SNE


Is there any Python library with a parallel version of the t-SNE algorithm? Or does a multicore/parallel t-SNE algorithm exist at all?

I'm trying to reduce the dimensionality (300d -> 2d) of all the word2vec vectors in my vocabulary using t-SNE.

Problem: the vocabulary contains about 130,000 words, and running t-SNE on all of them takes too long.


Solution

  • Yes, there is a parallel (multicore) version of the Barnes-Hut implementation of t-SNE: https://github.com/DmitryUlyanov/Multicore-TSNE (a usage sketch is given below the links).

    There is also a newer implementation, FIt-SNE, that uses a fast Fourier transform to significantly speed up the convolution step. It can also use the Annoy library for the nearest-neighbour search; the default tree-based method is still available, and both take advantage of parallel processing (see the second sketch below).

    Original code is available here: https://github.com/KlugerLab/FIt-SNE

    and an R package version here: https://github.com/JulianSpagnuolo/FIt-SNE
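For the Multicore-TSNE route, here is a minimal sketch. It assumes the package from the linked repo is installed (e.g. `pip install MulticoreTSNE`) and that you have already stacked your word2vec vectors into a `(130000, 300)` NumPy array; the constructor is modelled on scikit-learn's `TSNE`, so check the repo README for the exact set of supported parameters.

```python
import numpy as np
from MulticoreTSNE import MulticoreTSNE as TSNE

# Placeholder for your real (130000, 300) word2vec matrix.
vectors = np.random.rand(130000, 300)

# n_jobs controls how many CPU cores the Barnes-Hut gradient step uses.
tsne = TSNE(n_jobs=8, n_components=2, perplexity=30, random_state=0)
embedding_2d = tsne.fit_transform(vectors)  # shape: (130000, 2)
```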
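For FIt-SNE, a similar sketch is below. It assumes you have cloned KlugerLab/FIt-SNE, compiled the `fast_tsne` binary as described in its README, and made its Python wrapper (`fast_tsne.py`) importable; the parameter names shown are taken from that wrapper but should be verified against the repo's documentation.

```python
import numpy as np
from fast_tsne import fast_tsne

# Placeholder for your real (130000, 300) word2vec matrix.
vectors = np.random.rand(130000, 300)

# nthreads enables multi-core execution; the FFT-accelerated interpolation
# and the nearest-neighbour search both run in parallel.
embedding_2d = fast_tsne(vectors, perplexity=30, nthreads=8, seed=42)
```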