chatbot, rasa-nlu

Identify new words as intents in Rasa NLU


I have been using Rasa NLU to classify intents and extract entities for my chatbot. With extensive training everything works as expected, but entity extraction seems to depend on the exact position and length of the words seen in training. That is fine when the set of entity values is limited. When the bot has to recognize a value it was never trained on (for example, a new name of a different length), extraction fails. Is there a way to make Rasa identify entities by the relative position of the word or, better yet, to supply a list of domain-specific words for the entity to match against (like a phrase list in LUIS)?

For example, this query is parsed correctly:

{"q":"i want to buy a Casio SX56"}

{
    "project": "default",
    "entities": [
        {
            "extractor": "ner_crf",
            "confidence": 0.7043648832678735,
            "end": 26,
            "value": "Casio SX56",
            "entity": "watch",
            "start": 16
        }
    ],
    "intent": {
        "confidence": 0.8835646513829762,
        "name": "buy_watch"
    },
    "text": "i want to buy a Casio SX56",
    "model": "model_20180522-165141",
    "intent_ranking": [
        {
            "confidence": 0.8835646513829762,
            "name": "buy_watch"
        },
        {
            "confidence": 0.07072182459497935,
            "name": "greet"
        }
    ]
}

But if I replace Casio SX56 with Citizen M1, the intent is still classified correctly and no entity is returned at all:

{"q":"i want to buy a Citizen M1"}

{
    "project": "default",
    "intent": {
        "confidence": 0.8710909096729019,
        "name": "buy_watch"
    },
    "text": "i want to buy a Citizen M1",
    "model": "model_20180522-165141",
    "intent_ranking": [
        {
            "confidence": 0.8710909096729019,
            "name": "buy_watch"
        },
        {
            "confidence": 0.07355588750895545,
            "name": "greet"
        }
    ]
}

Thank you!


Solution

  • The feature I was looking for is a phrase matcher, which would let me add a list of possible entity values to the training data. When a new name comes up, we can simply add it to the phrase list and the model should then recognize it in any utterance. At the time of writing this is still in development and should be merged into master soon: https://github.com/RasaHQ/rasa_nlu/pull/822
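Until that lands, the phrase-list idea can be approximated outside the model. The following is only a rough sketch of plain dictionary matching, not how the pull request or ner_crf works; the function name `find_phrase_entities`, the `WATCH_PHRASES` list and the `"phrase_list_sketch"` extractor label are made up for illustration. It returns entities in the same shape as the parse responses above, so the result could be used as a fallback when the `entities` key is missing.

```python
import re


def find_phrase_entities(text, phrases, entity):
    """Return each occurrence of a listed phrase in `text` as an entity dict.

    Hypothetical helper: a plain case-insensitive phrase lookup, independent
    of any Rasa NLU component.
    """
    if not phrases:
        return []
    # Try longer phrases first so "Casio SX56" wins over a shorter "Casio".
    ordered = sorted(phrases, key=len, reverse=True)
    # \b assumes every phrase starts and ends with a word character.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [
        {
            "start": m.start(),
            "end": m.end(),
            "value": m.group(0),
            "entity": entity,
            "extractor": "phrase_list_sketch",  # illustrative label only
        }
        for m in pattern.finditer(text)
    ]


# Illustrative phrase list; new product names are simply appended here.
WATCH_PHRASES = ["Casio SX56", "Citizen M1"]

entities = find_phrase_entities("i want to buy a Citizen M1", WATCH_PHRASES, "watch")
print(entities)
```

The trade-off is that exact matching only finds names already on the list and ignores context, whereas a phrase list used as a training feature would let the model weigh it together with the surrounding words.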