
Template language for natural language text mutations


I'm stuck on a seemingly trivial task that appears to lead to a wider problem.

I need to generate slight variations of the same short text. Some word forms depend on the speaker's gender, and some words can be replaced with synonyms.

Pseudocode:

I {random:decided|made up my mind} to {random:try|test|give a try to}
this {new|fresh} {cool|awesome} {service|web service|online tool}.
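
For illustration, here is a minimal Python sketch of how a template like the one above could be expanded. It assumes that `{random:a|b}` and `{a|b}` mean the same thing (pick one alternative at random); the `expand` function and the `CHOICE` pattern are names invented for this sketch, not part of any existing library.

```python
import random
import re

# Matches one innermost {...} group, with an optional "random:" prefix.
CHOICE = re.compile(r"\{(?:random:)?([^{}]*)\}")

def expand(template, rng=random):
    """Replace every {a|b|c} group with one randomly chosen alternative."""
    def pick(match):
        return rng.choice(match.group(1).split("|"))
    # Repeat until no groups remain, so nested groups are resolved inside-out.
    while CHOICE.search(template):
        template = CHOICE.sub(pick, template)
    return template

template = (
    "I {random:decided|made up my mind} to {random:try|test|give a try to} "
    "this {new|fresh} {cool|awesome} {service|web service|online tool}."
)
print(expand(template))
```

Each call prints one variation of the sentence. This covers synonyms only; it has no notion of variables or of dependencies between choices.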

I'm looking for an "industry standard" templating language to describe such texts and their possible variations. Thinking further ahead, I might want global variables (such as the speaker's gender) and cross-references to choices made earlier in the sentence.
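
To make the two extra requirements concrete, here is a rough Python sketch of global variables and back-references. The syntax (`{name=a|b}`, `{@name}`, `{$var:key=value|...}`) and the `render` function are hypothetical, invented purely for this example; they are not taken from any standard.

```python
import random
import re

# Hypothetical syntax, invented for this sketch:
#   {a|b}                  random choice
#   {name=a|b}             random choice remembered under "name"
#   {@name}                reuse the value remembered under "name"
#   {$var:key1=x|key2=y}   pick the option whose key equals the global variable
TOKEN = re.compile(r"\{([^{}]*)\}")

def render(template, variables, rng=random):
    picked = {}  # choices remembered so far, filled left to right

    def resolve(match):
        body = match.group(1)
        if body.startswith("@"):            # back-reference to an earlier choice
            return picked[body[1:]]
        if body.startswith("$"):            # lookup driven by a global variable
            name, _, options = body[1:].partition(":")
            table = dict(opt.split("=", 1) for opt in options.split("|"))
            return table[variables[name]]
        label, sep, options = body.partition("=")
        if sep and "|" not in label:        # named choice
            picked[label] = rng.choice(options.split("|"))
            return picked[label]
        return rng.choice(body.split("|"))  # plain choice

    return TOKEN.sub(resolve, template)

template = (
    "{$gender:male=He|female=She} tried the {item=service|tool} "
    "and liked the {@item}."
)
print(render(template, {"gender": "female"}))
```

Because `re.sub` processes matches from left to right, a `{@name}` reference only works after the named choice it refers to, which matches the "picked earlier in the sentence" requirement.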

This looks close to regular-expression syntax. Ideally, it would be easier for non-programmers to read and write.

Perhaps this is a well-known problem with an established solution, such as a language designed specifically for the task?


Solution

  • I was unable to find anything like this, so I set out to create it. The result is called Nalgene - the natural language generation language. The syntax is fairly simple, yet powerful enough to support recursive phrases, synonyms, captured values, and dependencies.

    %
        $person.name went to the $place to %action
    
    %action
        ~buy a new $item
        ~sell @posessive($person.gender) $item
    
    ~buy
        buy
        purchase
    
    $place
        store
        market
    
    ...
    

    It outputs generated sentences alongside a tree representation (the primary purpose is to generate training data for ML systems).

    > jill went to the store to return her toothbrush
    
    ( %
        ( $person.name
            jill )
        ( $place
            store )
        ( %action
            ( $item
                toothbrush ) ) )
    

    If you are still looking a year later, stop by and open an issue, and let me know what you seek in a dream NLG language.
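
As an illustration of generating a sentence together with its tree, here is a small Python sketch. It is not Nalgene's implementation: the `GRAMMAR` dictionary and the `generate` function are hypothetical, the grammar is a simplified stand-in for the example above (the possessive function is replaced by the literal word "the"), and the choice to keep only `%` and `$` nodes in the tree is an assumption based on the sample output.

```python
import random

# Hypothetical toy grammar, loosely modelled on the example above.
GRAMMAR = {
    "%": [["$name", "went", "to", "the", "$place", "to", "%action"]],
    "%action": [["~buy", "a", "new", "$item"], ["~sell", "the", "$item"]],
    "~buy": [["buy"], ["purchase"]],
    "~sell": [["sell"]],
    "$name": [["jill"], ["jack"]],
    "$place": [["store"], ["market"]],
    "$item": [["toothbrush"], ["lamp"]],
}

def generate(symbol, rng=random):
    """Expand `symbol`; return (words, tree), where tree is a nested list."""
    if symbol not in GRAMMAR:              # literal word, no tree node
        return [symbol], None
    words, children = [], []
    for token in rng.choice(GRAMMAR[symbol]):
        sub_words, sub_tree = generate(token, rng)
        words += sub_words
        if sub_tree is not None and not token.startswith("~"):
            children.append(sub_tree)      # synonyms are left out of the tree
    if symbol.startswith("$"):             # value node: keep the chosen text
        return words, [symbol, " ".join(words)]
    return words, [symbol] + children      # phrase node: keep its children

words, tree = generate("%")
print(" ".join(words))
print(tree)
```

The sentence and the tree come from the same random expansion, so the tree always labels the sentence it was printed with, which is the property that matters when the pairs are used as training data.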