Rasa: wrong confidence score for unrelated messages

I’m building a bot with Rasa to answer users’ questions, and I’ve run into an issue.

Rasa reports high confidence for messages that are completely unrelated to my intents’ examples.

I have medical-related intents, but a message like “I like coffee” gets even higher confidence than related messages. Random character strings like “laj jfias jjlas fe” also get high confidence.

Could anyone give me a hint on how to fix this? Where should I look for the bug?

This is my config:

language: "en"

pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_spacy"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_classifier_sklearn"

Solution

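The issue seems to be forced classification: the classifier has to assign every message to one of your intents, so even unrelated text ends up under some intent, often with high confidence. One way to solve it: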

Add some examples that are unrelated to your domain and put them under a dedicated intent, e.g. your_fallback_intent.
Define a story for your_fallback_intent.
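
As a rough sketch of the first step, the snippet below builds training data in the legacy Rasa NLU JSON layout (rasa_nlu_data / common_examples) with a few out-of-scope messages filed under your_fallback_intent. The example sentences and the report_symptom intent name are made up for illustration; swap in your own data.

```python
import json

# Hypothetical out-of-scope messages; replace with real unrelated user input.
out_of_scope = [
    "I like coffee",
    "what's the weather today",
    "laj jfias jjlas fe",
]

# One made-up in-domain example so the output is self-contained.
in_domain = [("I have had a headache since yesterday", "report_symptom")]


def to_example(text, intent):
    """Build one training example in the legacy Rasa NLU JSON layout."""
    return {"text": text, "intent": intent, "entities": []}


examples = [to_example(text, intent) for text, intent in in_domain]
examples += [to_example(text, "your_fallback_intent") for text in out_of_scope]

training_data = {"rasa_nlu_data": {"common_examples": examples}}
print(json.dumps(training_data, indent=2))
```

You would merge these examples into your existing training file and retrain. The more varied the unrelated examples, the better the classifier can tell them apart from your medical intents.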

This will make the NLU classify unrelated messages as your_fallback_intent.
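
To illustrate why forced classification produces high confidence on unrelated text, here is a toy sketch. It is not how Rasa or scikit-learn computes confidences internally; it uses a plain softmax and made-up scores and intent names, and only shows that probabilities normalized over a closed set of intents must sum to 1, so the "least bad" intent can still look confident until a fallback intent is there to absorb the mass.

```python
import math


def normalize(scores):
    """Softmax: turn raw per-intent scores into probabilities that sum to 1."""
    highest = max(scores.values())
    exps = {name: math.exp(value - highest) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def top_intent(scores):
    """Return the (intent, confidence) pair with the highest confidence."""
    confidences = normalize(scores)
    best = max(confidences, key=confidences.get)
    return best, confidences[best]


# Hypothetical raw scores for an unrelated message such as "I like coffee".
# Every medical intent matches poorly, but one is slightly less bad.
without_fallback = {
    "report_symptom": -4.0,
    "book_appointment": -6.5,
    "ask_medication": -7.0,
}

# Same message once a fallback intent has been trained (score is assumed).
with_fallback = dict(without_fallback, your_fallback_intent=-1.0)

for scores in (without_fallback, with_fallback):
    intent, confidence = top_intent(scores)
    print(intent, round(confidence, 2))
```

In the first case a medical intent wins with a confidence above 0.8 even though all raw scores are poor; in the second, your_fallback_intent takes the top spot.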

Please add details in a comment if you still face the issue.