Tools for describing and modeling VoiceXML applications


What tools do you use to describe, draw, and model call flows in VoiceXML development? Which editor is most useful for drawing such diagrams and models (as far as I can tell, there is no tool specifically for voice application design)? And which blocks do you use to draw prompts and grammars? How is that done in professional voice app development?


Solution

  • Note: you didn't say who the audience for the material is, so I'm assuming you are on the development side of the house and documenting for the business/customer.

    To draw call flow diagrams, Visio is the most common tool used in the enterprise space. In small applications, you might actually have the prompts in blocks, but usually the diagrams are just used to provide a high-level view of the application. For example, a main menu prompt would list the choices, but not all the variations or retries. The connectivity between blocks is usually limited to major paths, not all error paths or all global exits. Decisions usually reflect only major business logic, like customer type identification.

    Prompting and detailed call flow are typically done in Word. The Word document forms the core of the specification. Some groups separate out the call flow verbiage and interface document from host/data-related activities; some do not.

    As for grammars, those are typically not documented in the traditional sense, at least not for natural language (NL) grammars. Simple DTMF grammars can be inferred from the menu choices in the documents above. Natural language grammars typically start from a common set of words/phrases implied by the prompts, plus a large amount of standard filler ("I want", "Please", "May I", ...). With an NL application, the first pass is typically just to gather a large collection of recorded utterances, which are then transcribed. The recordings and the transcriptions are then used in the tuning process to extend the choices in the grammar to match what callers are actually saying and to eliminate unused paths, which increases accuracy. The tuning process has numerous other steps, but this is the main one that affects the grammar contents.
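
    To make the tuning step concrete, here is a minimal sketch of that comparison, not a real tuning tool. The filler list, the grammar phrases, and the names normalize and tuning_report are all made up for illustration; a real project would use its recognizer vendor's tooling and far more data. It strips filler from each transcription, counts which grammar phrases callers actually used, and collects what they said that the grammar does not cover.

```python
import re
from collections import Counter

# Hypothetical filler and grammar phrases, for illustration only.
FILLER = ("i want to", "i want", "i'd like to", "please", "may i")
GRAMMAR = {
    "check balance": "balance",
    "account balance": "balance",
    "pay bill": "payment",
    "make a payment": "payment",
    "speak to an agent": "agent",
}

# Longer fillers come first so "i want to" is matched before "i want".
FILLER_RE = re.compile(
    r"\b(?:"
    + "|".join(re.escape(f) for f in sorted(FILLER, key=len, reverse=True))
    + r")\b"
)


def normalize(utterance):
    """Lowercase a transcription, drop punctuation, and strip filler phrases."""
    text = "".join(ch for ch in utterance.lower() if ch.isalnum() or ch in " '")
    text = FILLER_RE.sub(" ", text)
    return " ".join(text.split())


def tuning_report(transcriptions):
    """Return (hits, misses, unused) for a batch of transcribed utterances."""
    hits = Counter()    # grammar phrase -> number of times callers said it
    misses = Counter()  # out-of-grammar phrase -> number of times heard
    for utterance in transcriptions:
        phrase = normalize(utterance)
        if phrase in GRAMMAR:
            hits[phrase] += 1
        elif phrase:
            misses[phrase] += 1
    # Grammar phrases nobody said are candidates for removal.
    unused = sorted(set(GRAMMAR) - set(hits))
    return hits, misses, unused


sample = [
    "I want to check balance",
    "Account balance, please",
    "May I pay my bill",
    "pay my bill",
    "Check balance",
]
hits, misses, unused = tuning_report(sample)
print("heard and covered:", dict(hits))
print("heard but not covered:", dict(misses))
print("in grammar but never heard:", unused)
```

    Frequent entries in the "not covered" list are candidates to add to the grammar, and the "never heard" list is where you would look for paths to prune.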

    I've seen a variety of attempts to link documentation to code. None seem to be that successful. The automatically generated documentation, or source documentation, is usually too confusing for business users. For example, they don't want to see the 50 exit points from a state. They want to see the 5 that the customer is told about and then maybe, somewhere else, a reference to global choices or some state-specific escape phrases and their destinations (cancel, back, restart).
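
    If you do generate documentation from code, one way to get closer to what business users want is to tag each exit and filter on the tag. The sketch below assumes a hypothetical Exit record with a kind label ("prompted", "global", or "internal"); that labeling is my own convention for the example, not something VoiceXML or any particular tool provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exit:
    trigger: str      # what the caller says or presses
    destination: str  # state the call moves to
    kind: str         # "prompted", "global", or "internal" (hypothetical labels)


def business_view(exits):
    """Split a state's exits into prompted choices, global commands, and a hidden count."""
    prompted = [e for e in exits if e.kind == "prompted"]
    global_exits = [e for e in exits if e.kind == "global"]
    hidden = len(exits) - len(prompted) - len(global_exits)
    return prompted, global_exits, hidden


# Hypothetical main menu state, for illustration only.
main_menu_exits = [
    Exit("balance", "BalanceInquiry", "prompted"),
    Exit("payment", "MakePayment", "prompted"),
    Exit("agent", "TransferToAgent", "prompted"),
    Exit("cancel", "MainMenu", "global"),
    Exit("back", "PreviousState", "global"),
    Exit("restart", "Welcome", "global"),
    Exit("no input, third retry", "TransferToAgent", "internal"),
    Exit("no match, third retry", "TransferToAgent", "internal"),
    Exit("host timeout", "ErrorHandler", "internal"),
]

prompted, global_exits, hidden = business_view(main_menu_exits)
print("Choices offered to the caller:")
for e in prompted:
    print(f"  {e.trigger} -> {e.destination}")
print("Global commands (documented once, elsewhere):")
for e in global_exits:
    print(f"  {e.trigger} -> {e.destination}")
print(f"{hidden} other exits omitted from the business view")
```

    The hard part in practice is keeping the labels accurate, which is probably why these attempts tend not to hold up.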