regex, string, word-embedding

Embedding people's names as keys built from words, spaces, commas and quotes


I'm trying to figure out a good algorithm for embedding a name as a string of digits, using this mapping:
space = 0, word = 1, comma = 2, double quotation mark = 3

So "Bob Dylan" should embed as "101", while "Brown, Millie Bobby" should embed as "120101"
and "Dwayne "The Rock" Johnson" should embed as "103101301".


Solution

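  • I would suggest a very simple solution based on regex replacements, applied in this order:

    • Match every word with \w+ and replace it with 1. Do this step first: \w also matches digits, so running it later would overwrite the 0, 2 and 3 already inserted.
    • Then match each whitespace character with \s and replace it with 0.
    • Replace each comma , with 2.
    • Finally, replace each double quote " with 3.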
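
The steps above could be sketched in Python roughly as follows. This is only a minimal sketch: the function name encode_name is made up for illustration, and it assumes names contain nothing but word characters, whitespace, commas and straight double quotes. Any other character (a hyphen or an apostrophe, for example) is left in the output unchanged.

```python
import re


def encode_name(name: str) -> str:
    """Encode a name as digits: space = 0, word = 1, comma = 2, double quote = 3.

    Hypothetical helper, not part of any library.
    """
    # Words first: \w also matches digits, so doing this later would
    # replace the 0/2/3 digits produced by the other steps.
    encoded = re.sub(r"\w+", "1", name)
    # Each single whitespace character becomes one 0.
    encoded = re.sub(r"\s", "0", encoded)
    # Commas and double quotes are literal characters, so plain
    # str.replace is enough for them.
    encoded = encoded.replace(",", "2")
    encoded = encoded.replace('"', "3")
    return encoded


print(encode_name("Bob Dylan"))
print(encode_name("Brown, Millie Bobby"))
print(encode_name('Dwayne "The Rock" Johnson'))
```

Note that \s replaces every whitespace character individually, so two consecutive spaces would give 00. If runs of whitespace should count as a single space, \s+ could be used instead.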