python · algorithm · optimization

How can I optimize finding duplicates?


The code is below. It finds duplicates in a generated array. How can it be optimized to handle large volumes of data faster?

Here are some examples of input and output to show how it should work:

Input: nums = [1,2,3,1]

Output: true

Input: nums = [1,2,3,4]

Output: false

Input: nums = [1,1,1,3,3,4,3,2,4,2]

Output: true

import random

# generate 10 random integers between -100 and 100
nums = [random.randint(-100, 100) for _ in range(10)]
b = False

print(nums)

# compare every pair of elements: O(n^2) time
for i in range(len(nums)):
    for j in range(i + 1, len(nums)):
        if nums[i] == nums[j]:
            b = True
            break  # note: this only exits the inner loop, not the outer one

print(b)

I can't even think of how it could be optimized.


Solution

  • If the length of your list is not equal to the length of a set constructed from that list, then it must contain duplicates.

    For example:

    import random

    # generate 10 random integers between 1 and 20
    nums = [random.randint(1, 20) for _ in range(10)]

    print(nums)

    # a set discards duplicates, so a length mismatch means the list had repeats
    print("Has duplicates" if len(set(nums)) != len(nums) else "No duplicates found")