I am currently using MS LUIS for Chatbot.
Our country usually talks and chats using 2 languages, English and Chinese.
However, in LUIS I can only define one culture.
As a result, when my culture is set to English, and when I imports Chinese text, the confidence level is very low (e.g. English - 0.88, Chinese - 0.1). The other way round is the same.
The situation is the same even after I did tokenize the Chinese text using library like JieBa or THULAC.
Therefore when I did testing, it is usually very easy to fall into an unrelated intent.
I would like to make LUIS recognize both English AND Chinese easily. Are there any way to solve this problem?
Thank you very much for your help.
I would like to make LUIS recognize both English AND Chinese easily. Are there any way to solve this problem?
Yes, the way is to separate your LUIS apps/projects, 1 for each language, and use a language detection before calling LUIS.
That's the official approach from LUIS docs (see here):
If you need a multi-language LUIS client application such as a chat bot, you have a few options. If LUIS supports all the languages, you develop a LUIS app for each language. Each LUIS app has a unique app ID, and endpoint log. If you need to provide language understanding for a language LUIS does not support, you can use Microsoft Translator API to translate the utterance into a supported language, submit the utterance to the LUIS endpoint, and receive the resulting scores.
For the language detection you can use Text Analytics API
from Microsoft Cognitive Services for example to get the text language, and then with this result query the right LUIS app.
How to use it?