Python: looking for duplicates in list


I have a list of floats, and I want to know how many duplicates are in it.

I tried this:

p = t_gw.p(sma, m1, m2)       #p is a 1d numpy array
p_list = list(p)
dup = set([x for x in p_list if p_list.count(x) > 1])
print dup

I have also tried to use collections.Counter, but I always get the same error:

TypeError: unhashable type: 'numpy.ndarray'

I've looked at similar questions, but I don't understand what hashable means, why a list (or numpy array) is not hashable, or what type I should use instead.


Solution

  • Your numpy array is two-dimensional, so list(p) does not do what you expect: it produces a list of row arrays rather than floats, and arrays are not hashable. Use list(p.flat) instead.

    Or (mis)use numpy's histogram function:

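    import numpy
    # one bin per distinct value; the extra +inf edge closes the last bin
    cnt, bins = numpy.histogram(p, bins=sorted(set(p.flat)) + [float('inf')])
    dup = bins[:-1][cnt > 1]   # drop the +inf edge so the mask length matches cnt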
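
As a minimal sketch of the points above: the array below is a made-up stand-in for the asker's p (its (5, 1) shape is an assumption, chosen so that list(p) yields one-element row arrays). It shows why set() fails on the unflattened list, then finds the duplicates two ways, with collections.Counter over p.flat and with numpy.unique(..., return_counts=True).

```python
import numpy as np
from collections import Counter

# Hypothetical stand-in for the asker's array: two-dimensional, shape (5, 1).
p = np.array([[1.5], [2.0], [1.5], [3.25], [2.0]])

# list(p) yields row arrays, not floats; arrays are mutable and define
# no hash, so putting them in a set raises TypeError.
try:
    set(list(p))
    set_failed = False
except TypeError:
    set_failed = True

# Flattening first yields scalar values, which are hashable.
counts = Counter(p.flat)
dup_counter = {float(v) for v, c in counts.items() if c > 1}

# The same result with numpy alone: unique values plus how often each occurs.
values, occurrences = np.unique(p, return_counts=True)
dup_unique = values[occurrences > 1]

print(set_failed)
print(sorted(dup_counter))
print(dup_unique)
```

Both approaches compare floats for exact equality, so values that differ only by rounding error are not reported as duplicates.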