I would like to know what kind of technique in Machine Learning domain can solve the problem below? (For example: Classification, CNN, RNN, etc.)
Problem Description:
User would input a string, and I would like to decompose the string to get the information I want. For example:
There are not fix string length in product type, batch number, and place of origin. Like product type may be "R21" or "TT3S", meaning that product type may comprise 2 or 3 character.
Also sometimes the string may contain wrong input information, like the "X" in example 2 shown above.
I’ve tried to find related solution, but what I got the most related is this one: https://github.com/philipperemy/Stanford-NER-Python
However, the string I got is not a sentence. A sentence comprises spaces & grammar, but the string I got doesn’t fit this situation.
Your problem is not reasonnably solved with any ML since you have a defined list of product type etc, since there may not be any actual simple logic, and since typically you are never working in the continuum (vector space etc). The purpose of ML is to build a regression function from few pieces of data and hope/expect a good generalisation (the regression fits all the unseen examples, past present and future).
Basically you are trying to reverse engineer the input grammar and generation (which was done by an algorithm, including possibly a random number generator). But in order to assert that your classifier function is working properly you need all your data to be also groundtruth, which breaks the ML principle.
You want to list all your list of defined product types (ground truth), and scatter bits of your input (with or without a regex pattern) into different types (batch number, place of origin). The "learning" is actually building a function (or few, one per type), element by element, which is filling a map (c++) or a dictionary (c#), and using it to parse the input.