I have 3 columns: an id column and 2 name columns. Sometimes the 2 name columns hold the same value, but it is upper case in one column and lower case in the other. How do I remove the rows where the values are the same (or share similar characters) but the casing is different?
Ex:
a = load txt file
a = foreach a generate id, name1, name2
current output:
id1, james, JAMES
id2, tom, Tom
id3, Jim, Bob
id4, Bill, billy
expected output: only this 1 result below
a = compare name1 and name2, and if any characters in name1 also appear in name2, filter these rows out
id3, Jim, Bob
Thanks for any help!
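Assuming you have loaded the data into relation A and the names are of type chararray, compare the lower-cased values and keep only the rows where they differ. $1 and $2 are the positional references for the second and third columns (name1 and name2).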
A = FILTER A BY LOWER($1) != LOWER($2);
DUMP A;
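Note that the case-insensitive comparison alone still keeps id4 (Bill, billy), because "bill" and "billy" are not equal. If "similar" is taken to mean that one lower-cased name contains the other, a rough sketch could look like the following. The file name 'names.txt', the comma delimiter, the aliases B and C, and the containment rule itself are assumptions made for illustration; they are not from the question.

```pig
-- Assumed input: a comma-separated file with id, name1, name2 on each line.
A = LOAD 'names.txt' USING PigStorage(',')
    AS (id:chararray, name1:chararray, name2:chararray);

-- Strip the spaces that follow the commas in the sample data.
B = FOREACH A GENERATE TRIM(id) AS id, TRIM(name1) AS name1, TRIM(name2) AS name2;

-- Keep a row only when neither lower-cased name occurs inside the other.
-- INDEXOF returns -1 when the search string is not found.
C = FILTER B BY (INDEXOF(LOWER(name1), LOWER(name2), 0) == -1)
            AND (INDEXOF(LOWER(name2), LOWER(name1), 0) == -1);

DUMP C;
```

On the four sample rows this should leave only id3 (Jim, Bob): james/JAMES and tom/Tom match exactly after lower-casing, and "billy" contains "bill". A stricter or looser notion of similarity (for example, shared prefixes of a minimum length) would need a different condition.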