
How many languages does UIMA Ruta support?


I am new to text analysis, UIMA, and UIMA Ruta, and I am working on a new Java-based application for intelligent document processing. I am currently going through the reading material on UIMA and Ruta. One question I still don't have a clear answer to: how many different languages does UIMA Ruta support? I would also be grateful for any help, links, or docs on what I should read to build intelligent document processing software that can analyze documents in multiple languages. Thanks -Rahul


Solution

  • Ruta itself is a rule-based scripting language. It is agnostic of natural language and does not, per se, support any particular set of natural languages. You can write Ruta scripts for text in any language, such as English, Spanish, or Chinese.

    For example, take a look at the Learning by Example section in the official Ruta reference. It presents a simple script that marks animals in English texts. You could do the same for any other language by adapting the regular expressions in the example code.

    Therefore, which languages your system supports depends entirely on your Ruta scripts, not on Ruta itself.
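
    To make the point about adapting the regular expressions concrete, here is a minimal sketch in the spirit of that example, written from memory rather than copied from the reference. The word lists and the choice of Spanish are my own assumptions for illustration; only DECLARE, the W token type, the REGEXP condition, and the MARK action are standard Ruta.

    ```
    // Type for the annotations the rules create.
    DECLARE Animal;

    // English: annotate every word token whose text matches the pattern.
    W{REGEXP("dog|cat") -> MARK(Animal)};

    // Spanish: the same rule with only the pattern changed
    // (illustrative word list, not taken from the Ruta reference).
    W{REGEXP("perro|gato") -> MARK(Animal)};
    ```

    The rule structure is identical in both cases; only the pattern carries language-specific knowledge. For scripts without whitespace-separated words, such as Chinese, the default word tokens may not fit, so you would likely need to match on different basic annotations or supply your own tokenization.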