Main list:
data = [
["629-2, text1, 12"],
["629-2, text2, 12"],
["407-3, text9, 6"],
["407-3, text4, 6"],
["000-5, text7, 0"],
["000-5, text6, 0"],
]
I want to get a list comprised of unique lists like so:
data_unique = [
["629-2, text1, 12"],
["407-3, text9, 6"],
["000-5, text6, 0"],
]
I've tried using numpy.unique, but I need to pare the result down further: the final list should contain only one entry for each numerical designator at the beginning of the string, i.e. 629-2...
I've also tried using chain from itertools like this:
from itertools import chain

def get_unique(data):
    return list(set(chain(*data)))
But that only got me as far as numpy.unique did.
Thanks in advance.
Code
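from itertools import groupby

def get_unique(data):
    def designated_version(item):
        # The designator is the text before the first comma, e.g. '629-2'
        return item[0].split(',')[0]

    # groupby only merges adjacent items, so sort by the same key first
    return [list(v)[0]
            for _, v in groupby(sorted(data, key=designated_version),
                                designated_version)]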
Test
print(get_unique(data))
# Output (ordered by designator, because the data is sorted before grouping)
[['000-5, text7, 0'], ['407-3, text9, 6'], ['629-2, text1, 12']]
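Explanation
designated_version is the key used for both sorting and grouping; it is equivalent to lambda item: item[0].split(',')[0] and returns the designator before the first comma, e.g. '629-2'.
list(v)[0] takes the first item of each group, so exactly one row is kept per designator.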
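
Because the data is sorted before grouping, the rows come back ordered by designator rather than in their original order. If the original order matters, a dict keyed on the designator is one possible alternative. This is a minimal sketch and not part of the answer above; the name get_unique_keep_order is hypothetical, and it relies on dicts preserving insertion order (Python 3.7+).

```python
def get_unique_keep_order(data):
    """Keep the first row seen for each designator, in input order."""
    first_seen = {}
    for item in data:
        # Same key as above: the text before the first comma
        designator = item[0].split(',')[0]
        # setdefault stores the row only if the designator is new
        first_seen.setdefault(designator, item)
    return list(first_seen.values())


data = [
    ["629-2, text1, 12"],
    ["629-2, text2, 12"],
    ["407-3, text9, 6"],
    ["407-3, text4, 6"],
    ["000-5, text7, 0"],
    ["000-5, text6, 0"],
]

print(get_unique_keep_order(data))
# [['629-2, text1, 12'], ['407-3, text9, 6'], ['000-5, text7, 0']]
```

This variant makes a single pass with no sort, and it also works when rows with the same designator are not adjacent in the input.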