Tags: speech-recognition, ivr, vxml, voicexml

Building VXML/GRXML dialog to recognize based on caller saying "That one" rather than the item?


My app has a bunch of dialogs where a caller is asked to pick and choose from a list:

"Which would you like? Account Information, Account Changes, Request Documents, Speak to a Representative."

In pseudocode, here's how it would appear:

<prompt>
 Which would you like?  Account Information, Account Changes, Request Documents, Speak to a Representative.
</prompt>
<grammar>
 "Account Information": goto Account Info logic
 "Account Changes": goto Account Change logic
 "Request Documents": goto Documents logic
 "Representative": goto Call Transfer logic
</grammar>

Now, this grammar does not account for situations where a caller says "That one!" right after hearing one of the options. That would be considered out of grammar, and an error case. I can get around this by breaking the dialog up into four prompts, and having redundant grammars in each:

 <prompt>
    Which would you like?
</prompt>
<prompt>
    Account Information
</prompt>
<grammar>
    "That one": goto Account Info logic
    "Account Information": goto Account Info logic
    "Account Changes": goto Account Change logic
    "Request Documents": goto Documents logic
    "Representative": goto Call Transfer logic
</grammar>
<prompt>
    Account Changes
</prompt>
<grammar>
    "That one": goto Account Change logic
    "Account Information": goto Account Info logic
    "Account Changes": goto Account Change logic
    "Request Documents": goto Documents logic
    "Representative": goto Call Transfer logic
</grammar> 
<prompt>
    Request Documents
</prompt>
<grammar>
    "That one": goto Documents logic
    "Account Information": goto Account Info logic
    "Account Changes": goto Account Change logic
    "Request Documents": goto Documents logic
    "Representative": goto Call Transfer logic
</grammar>
<prompt>
    Speak to a Representative.
</prompt>
<grammar>
    "That one": goto Call Transfer logic
    "Account Information": goto Account Info logic
    "Account Changes": goto Account Change logic
    "Request Documents": goto Documents logic
    "Representative": goto Call Transfer logic
 </grammar> 

But is this the "right" way of doing this? Is there a way to do this with a single dialog?

Thanks,
IVR Avenger


Solution

  • Splitting the menu into one prompt and grammar per item, as you describe, is the best approach on most platforms. If you're using a VoiceXML 2.1 platform that supports <mark>, you can use it instead to determine which item was playing when the caller spoke.

    If platform portability is a goal, I would recommend the multi-field solution.

    On the usability side, I would make direct identification of a list choice ("that one") a final fallback. It is tedious to use, and timing errors tend to occur. To minimize the latter, make sure there is a sufficient gap between choices so that a slow caller still selects the correct entry. A platform delay of just a quarter of a second in transitioning between prompts can affect the experience.
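
    As a rough illustration of the <mark> approach, here is a minimal single-dialog sketch. It assumes a VoiceXML 2.1 platform that fills in the `choice$.markname` shadow variable and accepts inline SRGS grammars with literal semantic tags; support for both varies by vendor, so treat this as a starting point rather than a portable implementation. The mark names, the `that_one` tag value, the form ids, and the 800 ms pauses are all made up for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <form id="main_menu">
    <field name="choice">
      <prompt bargein="true">
        Which would you like?
        <!-- A mark precedes each item, so the last mark reached names the
             item that was playing (or had just played) when the caller spoke.
             The pauses give slow callers time to react (assumed value). -->
        <mark name="info"/> Account Information. <break time="800ms"/>
        <mark name="changes"/> Account Changes. <break time="800ms"/>
        <mark name="documents"/> Request Documents. <break time="800ms"/>
        <mark name="transfer"/> Speak to a Representative.
      </prompt>

      <!-- One grammar for the whole menu. Each tag value matches a mark
           name, except the hypothetical "that_one" value. -->
      <grammar type="application/srgs+xml" version="1.0" root="menu"
               mode="voice" tag-format="semantics/1.0-literals">
        <rule id="menu">
          <one-of>
            <item>that one <tag>that_one</tag></item>
            <item>account information <tag>info</tag></item>
            <item>account changes <tag>changes</tag></item>
            <item>request documents <tag>documents</tag></item>
            <item>representative <tag>transfer</tag></item>
          </one-of>
        </rule>
      </grammar>

      <filled>
        <!-- "That one": swap in the name of the last mark reached. -->
        <if cond="choice == 'that_one'">
          <assign name="choice" expr="choice$.markname"/>
        </if>
        <if cond="choice == undefined">
          <!-- The caller spoke before any item was reached: ask again. -->
          <clear namelist="choice"/>
        <else/>
          <goto expr="'#' + choice"/>
        </if>
      </filled>
    </field>
  </form>

  <!-- Placeholder destinations; the real application logic goes here. -->
  <form id="info"><block>Account information. <exit/></block></form>
  <form id="changes"><block>Account changes. <exit/></block></form>
  <form id="documents"><block>Request documents. <exit/></block></form>
  <form id="transfer"><block>Transferring your call. <exit/></block></form>
</vxml>
```

    VoiceXML 2.1 also defines `choice$.marktime`, the number of milliseconds between the last mark and the barge-in. If timing errors turn out to be a problem, one option is to treat a very small `marktime` as a late selection of the previous item, but the right threshold would have to be found by testing on the target platform.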