pythonlistdictionaryfor-loop

# Python function to find the highest frequency number

I'm working on a Python function that is supposed to return the number with the highest frequency in an array. In cases where two numbers or more have the same frequency, it should return the larger number.

However, I'm facing an issue with my current implementation.

``````def highest_rank(arr):
count_num = {}
for i in arr:
if i not in count_num:
count_num[i] = 0
else:
count_num[i] = arr.count(i)
return max(count_num, key=count_num.get)
``````

In the this example `[9, 48, 1, 8, 44, 45, 32]` , I expect the function to return `48` but it return 9

Solution

• This is a very fast solution that finds the correct result (as tested with a few different inputs)

``````from collections import defaultdict

def highest_rank(arr):
count = defaultdict(int)
highest_rank = 0
for num in arr:
cnt = count[num]+1
count[num]=cnt
highest_rank_cnt = count[highest_rank]
if cnt > highest_rank_cnt or (cnt == highest_rank_cnt and num > highest_rank):
highest_rank = num
return highest_rank

print(highest_rank([9, 48, 1, 8, 44, 45, 32])) # Prints 48 as expected
``````

A quick comparison between my method and your solution (applied the fix from accepted solution):

``````import numpy as np
from collections import defaultdict

def highest_rank(arr):
count = defaultdict(int)
highest_rank = 0
highest_rank_cnt = 0
for num in arr:
cnt = count[num]+1
count[num]=cnt
if cnt > highest_rank_cnt or (cnt == highest_rank_cnt and num > highest_rank):
highest_rank = num
highest_rank_cnt = cnt
return highest_rank

def highest_rank_slow(arr):
count_num = {}
for i in arr:
if i not in count_num:
count_num[i] = 0
else:
count_num[i] = arr.count(i)
return max(count_num,key=lambda x:(count_num.get(x),x))

nums = list(np.random.randint(0,1000,10_000))
%timeit highest_rank(nums)          # 2.97 ms ± 393 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit highest_rank_slow(nums)     # 1.56 s ± 52.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
``````

In fact, my approach can do this for a million numbers in roughy 230ms which is still faster than your solution for 10K numbers since `.count()` is REPEATEDLY looping through the list instead of counting in one go.

Edit:

If you don't want to use the built-in `defaultdict`, you can simply use your `if num not in count` trick:

``````def highest_rank_no_defaultdict(arr):
count = {}
highest_rank = 0
highest_rank_cnt = 0
for num in arr:
if num not in count:
cnt=1
else:
cnt = count[num]+1
count[num]=cnt
if cnt > highest_rank_cnt or (cnt == highest_rank_cnt and num > highest_rank):
highest_rank = num
highest_rank_cnt = cnt
return highest_rank
``````

this is both more verbose and a bit slower than using `defaultdict` so unless you absolutely need to, stick with `defaultdict`.