Search code examples
algorithmskip-lists

Find k-th element in skiplist - explanation needed


We're learning about skiplists at my university and we have to find k-th element in the skiplist. I haven't found anything about this in the internet, since skiplist in not really a popular data structure. W. Pugh in his original article wrote:
Each element x has an index pos(x). We use this value in our invariants but do not store it. The index of the header is zero, the index of the first element is one and so on. Associated with each forward pointer is a measurement, fDistance, of the distance traversed by that pointer:
x→fDistance[i] = pos(x→forward[i]) – pos(x).
Note that the distance traversed by a level 1 pointer is always 1, so some storage economy is possible here at the cost of a slight increase in the complexity of the algorithms.

SearchByPosition(list, k)
if k < 1 or k > size(list) then return bad-index
    x := list→header
    pos := 0
    -- loop invariant: pos = pos(x)
    for i := list→level downto 1 do
        while pos + x→fDistance[i] ≤ k do
            pos := pos + x→fDistance[i]
            x := x→forward[i]
return x→value

The problem is, I still don't get what is going on here. How do we know positions of elements without storing them? How do we calculate fDistance from pos(x) if we don't store it? If we go from the highest level of the skiplist, how do we know how many nodes on level 0 (or 1, the lowest one anyway) we skip this way?


Solution

  • I'm going to assume you're referring to how to find the k-th smallest (or largest) element in a skip list. This is a rather standard assumption I think, otherwise you have to clarify what you mean.

    I'll refer to the GIF on wikipedia in this answer: https://en.wikipedia.org/wiki/Skip_list

    enter image description here

    Let's say you want to find the k = 5 smallest element.

    You start from the highest level (4 in the figure). How many elements would you skip from 30 to NIL? 6 (we also count the 30). That's too much.

    Go down a level. How many skipped from 30 to 50? 2: 30 and 40.

    So we reduced the problem to finding the k = 5 - 2 = 3 smallest element starting at 50 on level 3.

    How many skipped from 50 to NIL? 4, that's one too many.

    Go down a level. How many skipped from 50 to 70? 2. Now find the 3 - 2 = 1 smallest element starting from 70 on level 2.

    How many skipped from 70 to NIL? 2, one too many.

    From 70 to 90 on level 1? 1 (itself). So the answer is 70.

    So you need to store how many nodes are skipped for each node at each level and use that extra information in order to get an efficient solution. That seems to be what fDistance[i] does in your code.