
Filter by language in KQL/Kusto/Data Explorer


Is it possible to detect the language of a text string using Kusto functions? I have a dataset that I would like to filter by language, for example English or Spanish.

Example:

datatable (text:string) ['Bonjour', 'Salam', 'Hallo', 'Neih hou']

Desired Outcome:

Text        Language
'Bonjour'   'French'
'Salam'     'Persian'
'Hallo'     'Danish'
'Neih hou'  'Cantonese'

Solution

  • Language detection is not available out of the box in Azure Data Explorer (ADX). If you can run the python plugin on your cluster, you can use a custom script built on a library such as https://pypi.org/project/fasttext-langdetect/ or https://pypi.org/project/langdetect/
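
As a rough illustration of what such a script could look like, here is a minimal sketch built on langdetect. It assumes the langdetect package is actually available in the plugin's Python sandbox, which may require extra setup on your cluster. The helper names detect_language and add_language_column are made up for this example. Note that langdetect returns language codes such as 'fr' rather than names such as 'French', and that detection on very short strings like the ones in the question is often unreliable.

```python
def detect_language(text, detector=None):
    """Return a language code for `text`, or "unknown" if detection fails."""
    if detector is None:
        # Assumption: the third-party langdetect package is installed
        # in the Python environment the plugin runs in.
        from langdetect import detect as detector
    # Skip nulls and blank strings instead of passing them to the detector.
    if not isinstance(text, str) or not text.strip():
        return "unknown"
    try:
        return detector(text)
    except Exception:
        # langdetect raises an exception when the text has no usable
        # features (for example, digits or punctuation only).
        return "unknown"


def add_language_column(df, detector=None):
    """Return a copy of a pandas DataFrame with a `language` column added.

    Assumes the input has a string column named `text`, as in the question.
    """
    return df.assign(
        language=df["text"].map(lambda t: detect_language(t, detector))
    )


# Inside the ADX python plugin, the input rows arrive as a pandas DataFrame
# named `df`, and the script must assign its output to `result`:
#     result = add_language_column(df)
```

On the Kusto side, a script like this would be passed as the script argument of evaluate python(...), with an output schema such as typeof(*, language:string), after which the rows can be filtered with a normal where clause on the language column. The optional detector parameter only exists so the logic can be tried locally with a stand-in function before deploying it.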