machine-learning nlp artificial-intelligence linguistics

Create a user assistant using NLP

I am following a course titled Natural Language Processing on Coursera, and while the course is informative, I wonder if the contents given cater to what am I looking for.

Basically I want to implement a textual version of Cortana, or Siri for now as a project, i.e. where the user can enter commands for the computer in natural language and they will be processed and translated into appropriate OS commands. My question is

What is generally sequence of steps for the above applications, after processing the speech? Do they tag the text and then parse it, or do they have any other approach?

Under which application of NLP does it fall? Can someone cite me some good resources for same? My only doubt is that what I follow now, shall that serve any important part towards my goal or not?

Solution

What you want to create can be thought of as a carefully constrained chat-bot, except you are not attempting to hold a general conversation with the user, but to process specific natural language input and map it to specific commands or actions.

In essence, you need a tool that can pattern match various user input, with the extraction or at least recognition of various important topic or subject elements, and then decide what to do with that data.

Rather than get into an abstract discussion of natural language processing, I'm going to make a recommendation instead. Use ChatScript. It is a free open source tool for creating chat-bots that recently took first place in the Loebner chat-bot competition, as it has done so several times in the past:

http://chatscript.sourceforge.net/

The tool is written in C++, but you don't need to touch the source code to create NLP apps; just use the scripting language provided by the tool. Although initially written for chat-bots, it has expanded into an extremely programmer friendly tool for doing any kind of NLP app.

Most importantly, you are not boxed in by the philosophy of the tool or limited by the framework provided by the tool. It has all the power of most scripting languages so you won't find yourself going most of the distance towards your completing your app, only to find some crushing limitation during the last mile that defeats your app or at least cripples it severely.

It also includes a large number of ontologies that can jump-start significantly your development efforts, and it's built-in pre-processor does parts-of-speech parsing, input conformance, and many other tasks crucial to writing script that can easily be generalized to handle large variations in user input. It also has a full interface to the WordNet synset database. There are many other important features in ChatScript that make NLP development much easier, too many to list here. It can run on Linux or Windows as a server that can be accessed using a TCP-IP socket connection.

Here's a tiny and overly simplistic example of some ChatScript script code:

# Define the list of available devices in the user's household.
concept: ~available_devices( green_kitchen_lamp stove radio )

#! Turn on the green kitchen lamp.
#! Turn off that damn radio!
u: ( turn _[ on off ] *~2 _~available_devices )
    # Save off the desired action found in the user's input.  ON or OFF.
    $action = _0

    # Save off the name of the device the user wants to turn on or off.
    $target_device = _1

    # Launch the utility that turns devices on and off.
    ^system( devicemanager $action $target_device )

Above is a typical ChatScript rule. Your app will have many such rules. This rule is looking for commands from the user to turn various devices in the house on and off. The # character indicates a line is a comment. Here's a breakdown of the rule's head:

It consists of the prefix u:. This tells ChatScript a rule that the rule accepts user input in statement or question format.
It consists of the match pattern, which is the content between the parentheses. This match pattern looks for the word turn anywhere in the sentence. Next it looks for the desired user action. The square brackets tell ChatScript to match the word on or the word off. The underscore preceding the square brackets tell ChatScript to capture the matched text, the same way parentheses do in a regular expression. The ~2 token is a range restricted wildcard. It tells ChatScript to allow up to 2 intervening words between the word turn and the concept set named ~available_devices. ~available_devices is a concept set. It is defined above the rule and contains the set of known devices the user can turn on and off. The underscore preceding the concept set name tells ChatScript to capture the name of the device the user specified in their input.

If the rule pattern matches the current user input, it "fires" and then the rule's body executes. The contents of this rule's body is fairly obvious, and the comments above each line should help you understand what the rule does if fired. It saves off the desired action and the desired target device captured from the user's input to variables. (ChatScript variable names are preceded by a single or double dollar-sign.) Then it shells to the operating system to execute a program named devicemanager that will actually turn on or off the desired device.

I wanted to point out one of ChatScript's many features that make it a robust and industrial strength NLP tool. If you look above the rule you will see two sentences prefixed by a string consisting of the characters #!. These are not comments but are validation sentences instead. You can run ChatScript in verify mode. In verify mode it will find all the validation sentences in your scripts. It will then apply each validation sentence to the rule immediately following it/them. If the rule pattern does not match the validation sentence, an error message will be written to a log file. This makes each validation sentence a tiny, easy to implement unit test. So later when you make changes to your script, you can run ChatScript in verify mode and see if you broke anything.