Search code examples
google-apps-scriptgoogle-sheetsgoogle-sheets-formulavlookup

How To Assign Same IDs to Aliases/partial identicals Strings in the same Column in Google Sheets?


In short, I need a script or a formula able to scan the range D2:D21 / D2:D for strings "partial" matches and output corresponding IDS to Basic Strings Cells and their partial match cells in E2:E21 / E2:E .

I have this minimal testing example:

IDs

Assign Same IDs to Aliases/partial identicals Strings

ORDERED DATA IDS WORKING DATASET EXPECTED RESULT
BYTE id_0 U SHORT (VAR 1) id_3
S BYTE id_1 S BYTE (VAR2) id_1
SHORT id_2 BYTE (VAR2) id_0
U SHORT id_3 BYTE id_0
INT id_4 INT id_4
BYTE (VAR 1) S BYTE id_1
S BYTE (VAR 1) INT (VAR 1) id_4
SHORT (VAR 1) S BYTE (VAR 1) id_1
U SHORT (VAR 1) U SHORT (VAR2) id_3
INT (VAR 1) SHORT (VAR2) id_2
BYTE (VAR2) S BYTE (VAR'3) id_1
S BYTE (VAR2) SHORT id_2
SHORT (VAR2) U SHORT id_3
U SHORT (VAR2) BYTE (VAR 1) id_0
INT (VAR2) U SHORT (VAR'3) id_3
BYTE (VAR'3) SHORT (VAR'3) id_2
S BYTE (VAR'3) INT (VAR2) id_4
SHORT (VAR'3) BYTE (VAR'3) id_0
U SHORT (VAR'3) INT (VAR'3) id_4
INT (VAR'3) SHORT (VAR 1) id_2
BASIC STRINGS IDS
BYTE id_0
S BYTE id_1
SHORT id_2
U SHORT id_3
INT id_4
BASIC STRINGS VARIANTS 1 IDS BASIC STRINGS VARIANTS 2 IDS BASIC STRINGS VARIANTS 3
BYTE (VAR 1) id_0 BYTE (VAR2) id_0 BYTE (VAR'3)
S BYTE (VAR 1) id_1 S BYTE (VAR2) id_1 S BYTE (VAR'3)
SHORT (VAR 1) id_2 SHORT (VAR2) id_2 SHORT (VAR'3)
U SHORT (VAR 1) id_3 U SHORT (VAR2) id_3 U SHORT (VAR'3)
INT (VAR 1) id_4 INT (VAR2) id_4 INT (VAR'3)

Basically, I need the same IDS from B2:B6 to be assigned to all aliases of A2:A6 source data strings, as in E2:E21.

Aliases are all cells having identical matches but ending with extra "attributes" in parentheses.

For example BYTE (VAR 1) or BYTE (VAR2) or BYTE (VAR'3) are all aliases of BYTE.

Or, S BYTE (VAR 1) or S BYTE (VAR2) or S BYTE (VAR'3) are all aliases of S BYTE.

The WORKING DATASET in D2:D21 is not ordered, and not even (there are instances of A2:A6 source data strings with 0, 1 or 2, or 3 (or ideally n) number of aliases.

The expected result in E2:E21 shows the same IDS being assigned to the source data cells and its aliases cells.

I've looked at VLOOKUP and SCAN functions but it seems those can only work with full matches or parital matches on numerical approaching matches but not with substrings partial matches. Or if they are, I'm not sure how to make it work.

I also considered manually removing the aliases attributes substrings, then using Vlookup, then adding back the attributes manually. Which is error prone and lenghty process. And doesn't work past the basic strings range (F2:L21 in below screenshots).

0

1

2

3

Any idea on a better approach is greatly appreciated.


Solution

  • You may try:

    =index(vlookup(regexreplace(D2:D21,"( \(.*\))",""),A2:B6,2,))
    

    enter image description here