
Convert a tokenized string into integers using JavaScript


I have seen this question, but it is about Python, so I would like to ask a similar one for JavaScript. Without using a library, how would I take a tokenized array of strings in this format:

[["hi","how","are", "you"], ["how", "are", "you", "doing"]] 

Given the dictionary shown below, how would I create an array with the same shape as the tokenized array, but with each string replaced by a single integer representing its position in the dictionary? Positions start at 1, and a word that is not in the dictionary becomes 0.

["how","hi","doing"]

So the output would look like this:

[[2,1,0,0],[1,0,0,3]]

Solution

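  • Use the map and indexOf methods. indexOf returns -1 for a word that is not in the array, so adding 1 turns unknown words into 0 and known words into their 1-based position.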

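    const arr = [
      ["hi", "how", "are", "you"],
      ["how", "are", "you", "doing"],
    ];

    // Note: the "dictionary" is a plain JavaScript array, not an object
    const keys = ["how", "hi", "doing"];

    // Map each sentence, then each word, to its 1-based index in keys (0 if absent)
    const res = arr.map((sentence) => sentence.map((word) => keys.indexOf(word) + 1));

    console.log(res); // [[2, 1, 0, 0], [1, 0, 0, 3]]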
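
With a large dictionary, indexOf rescans the whole array for every word. One possible alternative is to build a Map from word to position once and look each word up in it. The following is only a sketch under that assumption: the helper name encodeTokens and the variable names are made up for illustration and are not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer): encodes tokens with a
// Map lookup instead of a linear indexOf scan per word.
function encodeTokens(sentences, vocabulary) {
  // Store each word's 1-based position; 0 stays reserved for unknown words.
  const positions = new Map();
  vocabulary.forEach((word, i) => {
    // Keep the first occurrence so duplicate entries behave like indexOf.
    if (!positions.has(word)) positions.set(word, i + 1);
  });

  // Preserve the nested shape: one inner array of integers per sentence.
  return sentences.map((sentence) =>
    sentence.map((word) => positions.get(word) ?? 0)
  );
}

const sentences = [
  ["hi", "how", "are", "you"],
  ["how", "are", "you", "doing"],
];
const vocabulary = ["how", "hi", "doing"];

const encoded = encodeTokens(sentences, vocabulary);
console.log(JSON.stringify(encoded)); // [[2,1,0,0],[1,0,0,3]]
```

For a three-word dictionary the difference is negligible, so the indexOf version is perfectly reasonable there. The Map version mainly pays off when the vocabulary has thousands of entries.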