algorithm machine-learning artificial-intelligence search-engine multilingual

Multi-language search matching


Suppose we have a name written in a non-Latin script, such as Arabic, Hebrew, Chinese, or Japanese.

How could a search engine match the original name with the English spelling of the same name, and vice versa?

For example, the Japanese name 拓海 and its English spelling Takumi.

What algorithm or technique is used to do this?


Solution

  • Good day.

    You need to do the following:

    List the alphabet of every language you want to support, so that corresponding symbols can be mapped to each other:

    All languages:

    • English [26 letters] a b c d e f g ...
    • Russian [33 letters] а б в г д е ....
    • Chinese [x letters] ....
    • Ukrainian [x letters] а б в г д ..... і
    • Japanese [x letters] ...
    • .................

    In the end you will have rules that map the spelling of each symbol between any pair of languages. Some languages, for instance Hindi and Chinese, will not have ready-made rules. For those you should create your own rules, based on the transcription of those languages.

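    Algorithm: tag each symbol of the word with its language, then replace every non-English symbol with its transcription.

    [w][e][п] = wep

     e  e  r

    Here e = English, r = Russian, and transcription[п] = p.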
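
The steps above can be sketched in Python. This is only a minimal illustration, not a complete solution: the table `CYRILLIC_TO_LATIN` and the helpers `script_of`, `to_latin`, and `same_name` are hypothetical names made up for this example, the table covers only a handful of letters, and the script of each symbol is guessed from its Unicode character name. Scripts without a letter-by-letter mapping (such as Chinese or Japanese kanji) would need a separate dictionary of readings, which is not shown here.

```python
import unicodedata

# Hypothetical, deliberately tiny Cyrillic-to-Latin table. A real table
# would cover the whole alphabet and follow one romanization scheme.
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "п": "p",
}


def script_of(ch: str) -> str:
    """Tag a symbol: 'e' for Latin (English), 'r' for Cyrillic (Russian), '?' otherwise."""
    name = unicodedata.name(ch, "")  # e.g. "CYRILLIC SMALL LETTER PE"
    if name.startswith("LATIN"):
        return "e"
    if name.startswith("CYRILLIC"):
        return "r"
    return "?"


def to_latin(text: str) -> str:
    """Lowercase the text and replace each Cyrillic symbol with its transcription."""
    out = []
    for ch in text.lower():
        if script_of(ch) == "r":
            # Symbols missing from the table are kept unchanged.
            out.append(CYRILLIC_TO_LATIN.get(ch, ch))
        else:
            out.append(ch)
    return "".join(out)


def same_name(a: str, b: str) -> bool:
    """Two spellings match if they agree after transcription to Latin."""
    return to_latin(a) == to_latin(b)


if __name__ == "__main__":
    print([script_of(ch) for ch in "weп"])  # language tag for each symbol
    print(to_latin("weп"))
    print(same_name("weп", "WEP"))
```

A search engine could apply the same normalization to both the indexed names and the query, so that the comparison always happens on the Latin form.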