python scikit-learn nlp text-classification

How to have multioutput in text classification?

I'm doing dialect text classification. The problem is that some tweets can be classified as both dialect A and dialect B. How can I do that? I want the accuracy to be calculated automatically rather than by hand. When I don't allow a tweet to be labeled as both A and B, many texts come out as misclassified.

In the training data, though, no tweet is labeled as both dialect A and B; each tweet has a single label.

Solution

Use one-hot encoding for the target.

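import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Your target will look similar to
target = ['A', 'A', 'B']

# OneHotEncoder expects a 2D array, so reshape the labels into a single column
encoded = OneHotEncoder().fit_transform(np.array(target).reshape(-1, 1)).toarray()

# After one-hot encoding, `encoded` is
# [[1., 0.],
#  [1., 0.],
#  [0., 1.]]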

After training on this target, your model will predict a probability for each class. You can then set a threshold and assign every class whose probability reaches it, so a single text can be classified as both classes.

# Sample output (predicted probabilities for classes A and B)
[[1., 0.],
 [0.5, 0.5],
 [0.1, 0.9]]

# With a threshold of 0.5, for example
predictions = ['A', 'A and B', 'B']

Example
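
The thresholding idea above can be sketched end to end as follows. This is only a minimal illustration under several assumptions: the toy texts, the `labels_from_proba` helper, the 0.4 threshold, and the choice of `TfidfVectorizer` with `LogisticRegression` are all hypothetical and not taken from the question. The model is trained on single-label tweets, predictions become sets of labels, and accuracy is computed automatically as an exact match between the predicted set and the gold set.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer


def labels_from_proba(proba, classes, threshold=0.5):
    """Turn class probabilities into a set of labels per sample.

    Every class whose probability is >= threshold is kept; if none
    qualifies, fall back to the single most probable class.
    """
    label_sets = []
    for row in np.asarray(proba):
        chosen = {str(c) for c, p in zip(classes, row) if p >= threshold}
        if not chosen:
            chosen = {str(classes[int(np.argmax(row))])}
        label_sets.append(chosen)
    return label_sets


# Hypothetical toy data: every training tweet has exactly one label.
train_texts = [
    "colour favourite lorry",
    "favourite colour neighbour",
    "color favorite truck",
    "favorite color neighbor",
]
train_labels = ["A", "A", "B", "B"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Hypothetical test set: gold labels are sets, so a tweet may be both A and B.
test_texts = ["favourite lorry", "colour truck", "favorite neighbor"]
gold = [{"A"}, {"A", "B"}, {"B"}]

classes = model.steps[-1][1].classes_  # class order used by predict_proba
predicted = labels_from_proba(model.predict_proba(test_texts), classes, threshold=0.4)

# Binarize both sides so accuracy_score computes exact-match (subset) accuracy.
mlb = MultiLabelBinarizer(classes=["A", "B"])
accuracy = accuracy_score(mlb.fit_transform(gold), mlb.transform(predicted))
print(predicted, accuracy)
```

The threshold is a tuning knob: lowering it yields more "A and B" predictions, so it is worth choosing on a held-out set that contains tweets labeled with both dialects. If exact matching is too strict, a more lenient score could count a prediction as correct whenever it overlaps the gold set.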
