python algorithm sorting complexity-theory heapq

Why is heap slower than sort for K Closest Points to Origin?

The coding task is here

The heap solution:

import heapq
class Solution:
    def kClosest(self, points: List[List[int]], K: int) -> List[List[int]]:
        return heapq.nsmallest(K, points, key = lambda P: P[0]**2 + P[1]**2)

The sort solution:

class Solution(object):
    def kClosest(self, points: List[List[int]], K: int) -> List[List[int]]:
        points.sort(key = lambda P: P[0]**2 + P[1]**2)
        return points[:K]

According to the explanation here, Python's heapq.nsmallest is O(n log(t)), and Python List.sort() is O(n log(n)). However, my submission results show that sort is faster than heapq. How did this happen? Theoretically, it's the contrary, isn't it?

Solution

As has been discussed, the fast implementation the sort using tim sort in python is one factor. The other factor here is that heap operations is not as cache-friendly as merge sort and insertion sort are(tim sort is the hybrid of these two).

Heap operations access data stored in distant indices.

Python use 0-indexed based array to implement its heap library. So for the kth value, its children nodes indices are k * 2 + 1 and k * 2 + 2.

Each time you are doing the percolate up/down operations after adding/removing an element to/from the heap, it tries to access to parent/children nodes that are far away from the current index. This is not cache-friendly. This is also why heap sort is generally slower than quick sort, albeit both of them are asymptotically the same.