Is there a fast way to get the unique elements, especially the strings, from a list or tuple of nested lists and tuples? Strings like 'min' and 'max' should be removed. The lists and tuples can be nested in any possible way. The only elements that always have the same form are the tuples at the core, like ('a',0,49), which contain the strings.
For example, a list or tuple like these:
lst1=[[(('a',0,49),('b',0,70)),(('c',0,49))],
[(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]]
tuple1=([(('a',0,49),('b',0,70)),(('c',0,49))],
[(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))])
Wanted Output:
uniquestrings = ['a','b','c','e']
What I tried so far:
flat_list = list(sum([item for sublist in x for item in sublist],()))
But this does not reach the "core" of the nested object.
This will get every string inside the given iterable, regardless of how deeply it is nested:
def isIterable(obj):
    # kudos: https://stackoverflow.com/a/1952481/7505395
    try:
        _ = iter(obj)
        return True
    except TypeError:
        # iter() raises TypeError for objects that are not iterable
        return False
# shortcut
isString = lambda x: isinstance(x,str)
def chainme(iterab):
    # strings are iterable too, so skip those from chaining
    if isIterable(iterab) and not isString(iterab):
        for a in iterab:
            yield from chainme(a)
    else:
        yield iterab
lst1=[[(('a',0,49),('b',0,70)),(('c',0,49))],
[(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]]
tuple1=([(('a',0,49),('b',0,70)),(('c',0,49))],
[(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))])
for k in [lst1,tuple1]:
    # use only strings
    l = [x for x in chainme(k) if isString(x)]
    print(l)
    print(sorted(set(l)))
    print()
Output:
['a', 'b', 'c', 'c', 'e', 'a', 'max', 'b'] # list
['a', 'b', 'c', 'e', 'max'] # sorted set of list
['a', 'b', 'c', 'c', 'e', 'a', 'max', 'b']
['a', 'b', 'c', 'e', 'max']
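The output above still contains 'max', which the question wants removed. Here is a minimal sketch of one way to drop such marker strings, assuming 'min' and 'max' are the only values to exclude and that the nesting consists only of lists and tuples. The names `iter_leaves`, `unique_strings` and the `exclude` parameter are made up for this example:

```python
def iter_leaves(obj):
    """Yield every non-container item of an arbitrarily nested list/tuple."""
    if isinstance(obj, (list, tuple)):
        for item in obj:
            yield from iter_leaves(item)
    else:
        yield obj


def unique_strings(nested, exclude=("min", "max")):
    """Return the sorted unique strings in `nested`, skipping those in `exclude`."""
    excluded = set(exclude)  # assumption: only these marker strings are unwanted
    return sorted({x for x in iter_leaves(nested)
                   if isinstance(x, str) and x not in excluded})


lst1 = [[(('a', 0, 49), ('b', 0, 70)), (('c', 0, 49))],
        [(('c', 0, 49), ('e', 0, 70)), (('a', 0, 'max'), ('b', 0, 100))]]

print(unique_strings(lst1))  # ['a', 'b', 'c', 'e']
```

Collecting into a set while iterating avoids building the intermediate list with duplicates. Checking `isinstance(obj, (list, tuple))` instead of calling `iter()` also sidesteps the special case for strings, at the cost of not descending into other iterable types.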