nlp gensim preprocessor lda latent-semantic-analysis

How do I retain numbers while preprocessing data using gensim in Python?

I have used gensim.utils.simple_preprocess(str(sentence)) to create a dictionary of words that I want to use for topic modelling. However, this also filters out important numbers (house resolution numbers, bill numbers, etc.) that I really need. How do I overcome this? Possibly by replacing digits with their word form, but how do I go about that?

Solution

You don't have to use simple_preprocess(). It doesn't do much, it isn't very configurable or sophisticated, and the other Gensim algorithms typically just need lists of tokens.

So, choose your own tokenization - which in some cases, depending on your source data, could be as simple as a .split() on whitespace.
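As a rough illustration of rolling your own tokenizer, here is a minimal sketch that lowercases the text and keeps digit runs as tokens. The function name tokenize_keep_numbers, the \w+ pattern, and the min_len/max_len defaults are assumptions chosen for this example, not anything Gensim requires; adjust them to your data.

```python
import re

# Runs of letters, digits, and underscores. Digits are kept as tokens.
TOKEN_PATTERN = re.compile(r"\w+", re.UNICODE)

def tokenize_keep_numbers(text, min_len=1, max_len=15):
    """Lowercase the text and return its word tokens, numeric tokens included.

    Hypothetical helper for illustration; the length limits are arbitrary.
    """
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [t for t in tokens if min_len <= len(t) <= max_len]

sentences = [
    "House Resolution 1234 was referred to committee.",
    "Bill No. 56 amends section 7 of the act.",
]

# One list of tokens per document.
texts = [tokenize_keep_numbers(s) for s in sentences]
print(texts)
```

The resulting lists of tokens should be usable wherever you previously passed the output of simple_preprocess(), for example when building a gensim.corpora.Dictionary.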

If you want to look at what simple_preprocess() does, as a model, you can view its Python source at:

https://github.com/RaRe-Technologies/gensim/blob/351456b4f7d597e5a4522e71acedf785b2128ca1/gensim/utils.py#L288
