neural-network nlp

Word to vector: where should I start?


I'm trying to implement a neural network model on labeled data that I have. The data contains several columns, with both categorical and numeric features.
A few columns in this data contain short descriptions written by users, which I also want to analyze, but I don't know how to start. The data looks something like this:

problem ID   status   description                        labels
1            closed   short description of the problem   CRM
2            open     short description of the problem   ERP 
3            closed   short description of the problem   CRM

Using status (which I will convert into dummy variables) and description (this is where I need you guys), I want to train the model to predict the labels.

Any idea how I should start? How can I convert the description columns into useful data?

Thanks!


Solution

  • This is basically a classification problem over mixed features. Encode the categorical variables into a trainable form (for example, the dummy variables you mention). For the text, clean it first; if it contains many numbers, convert them to their word form; then turn it into vectors using tf-idf or another vectorization approach. Normalize your numerical features as well, then train a simple SVM classifier on the combined features. If that does not give good accuracy, move to a CNN- or LSTM-based neural network; you can also try a CNN with embeddings for better results.
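
To make the first part of that advice concrete, here is a minimal sketch of the tf-idf + SVM baseline, assuming pandas and scikit-learn are available. The column names follow the table in the question; the description texts, the `new_rows` example, and the choice of `LinearSVC` as the "simple SVM" are illustrative assumptions, not something from the original post.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import LinearSVC

# Toy data shaped like the table in the question.
# The description texts are invented for illustration only.
df = pd.DataFrame({
    "status": ["closed", "open", "closed", "open",
               "closed", "open", "closed", "open"],
    "description": [
        "customer record missing in contact list",
        "cannot update customer contact details",
        "invoice posting fails in ledger module",
        "purchase order stuck in inventory ledger",
        "lead not converted to customer account",
        "customer email bounced from contact form",
        "inventory count wrong after invoice run",
        "ledger export fails for purchase invoice",
    ],
    "labels": ["CRM", "CRM", "ERP", "ERP", "CRM", "CRM", "ERP", "ERP"],
})

# One-hot encode the categorical column and tf-idf vectorize the text column.
# The text column is passed as a plain string (not a list) so the vectorizer
# receives a 1-D sequence of documents.
features = ColumnTransformer([
    ("status", OneHotEncoder(handle_unknown="ignore"), ["status"]),
    ("text", TfidfVectorizer(lowercase=True, stop_words="english"), "description"),
])

# A linear SVM on top of the combined sparse feature matrix.
model = Pipeline([
    ("features", features),
    ("svm", LinearSVC()),
])
model.fit(df[["status", "description"]], df["labels"])

# Predict the label for a new, unseen row (hypothetical example).
new_rows = pd.DataFrame({
    "status": ["open"],
    "description": ["customer contact not saved"],
})
predicted = model.predict(new_rows)
print(predicted)
```

If you also have numeric columns, you could add another entry to the `ColumnTransformer` (for example a `StandardScaler` over those columns) so they are normalized in the same pipeline. With real data, hold out a test set before judging whether the SVM is good enough or a neural network is worth trying.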