python, lda

Can I apply LDA (Latent Dirichlet Allocation) to a corpus in a different language?


I am trying to analyze a text corpus obtained from a Turkish virtual community website to examine user-generated content during protests. Specifically, I plan to apply LDA to identify topics. I haven't used LDA before, and I don't know whether it is applicable to text in a language other than English.

Thanks


Solution

  • Yes, I see no reason why not. LDA itself only works on word counts per document, so it does not depend on the language. It might be hard to find out-of-the-box solutions for some preprocessing steps, but apparently it has been done before: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=6830499
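
To illustrate the kind of preprocessing step that may need manual work, here is a minimal sketch using only the Python standard library. It is not taken from the linked paper. The helper names (turkish_lower, tokenize, bag_of_words), the tiny stopword list, and the two sample sentences are all made up for illustration. The sketch handles one well-known Turkish issue: the default str.lower() turns "I" into "i" rather than the dotless "ı", so it needs a small workaround before tokenizing.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, far from a complete Turkish list).
STOPWORDS = {"ve", "bir", "bu", "için", "ile", "de", "da"}

def turkish_lower(text):
    """Lowercase using Turkish casing rules: I -> ı and İ -> i."""
    # Map the two special capitals first, then let str.lower() handle the rest.
    return text.replace("I", "ı").replace("İ", "i").lower()

def tokenize(text):
    """Split text into lowercase word tokens, dropping stopwords and 1-letter tokens."""
    # [^\W\d_]+ matches runs of Unicode letters (no digits, no underscores).
    words = re.findall(r"[^\W\d_]+", turkish_lower(text))
    return [w for w in words if len(w) > 1 and w not in STOPWORDS]

def bag_of_words(docs):
    """Return a sorted vocabulary and one term-count Counter per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    counts = [Counter(doc) for doc in tokenized]
    return vocab, counts

# Made-up sample documents, only to exercise the functions.
docs = [
    "IŞIK ve İstanbul için protesto",
    "Protesto bu meydanda büyüdü",
]
vocab, counts = bag_of_words(docs)
```

The per-document counts (or the token lists from tokenize) are the language-independent input that an LDA implementation expects, so from this point on the workflow should be the same as for an English corpus. A real pipeline would probably also need a fuller stopword list and some form of stemming or lemmatization, since Turkish is heavily suffixed.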