Search code examples
rubyalgorithmsortingruntime-errorquickselect

interpreting a laconic ruby 'nil' error


I've been at this for a couple of days now, and I can't crack this error:

[3] pry(main)> my_list = (1..10).to_a.sample(10)
=> [3, 5, 9, 2, 7, 6, 10, 4, 1, 8]
[4] pry(main)> linear_select(my_list,4)
NoMethodError: undefined method `-' for nil:NilClass
from .../selector/select.rb:44:in `partition'

Background

I'm trying to implement a guaranteed linear-time SELECT function using more or less strict implementations of CLRS. Everything was going swimmingly with the randomized select, the median-pivot select, until I hit this one. Here is the code for partition:

def partition(my_list, part_start, part_end, pivot = my_list[part_end])
# From CLRS p. 171
# In-place rearrangement of subarrays to find rank of pivot. Does no rearrangement 
# if the array is already sorted.
# Returns the rank of the pivot, which is set by default to the end of the partition.

return(0) if my_list.length == 1

  sort_separator = part_start - 1
  for loop_ind in (part_start..(part_end-1)) # This is the offending line 44
    if my_list[loop_ind] <= my_list[part_end]
      sort_separator += 1
      my_list[sort_separator],my_list[loop_ind] = 
        my_list[loop_ind],my_list[sort_separator]
    end
  end
  my_list[sort_separator+1],my_list[part_end] = 
    my_list[part_end],my_list[sort_separator+1]

  return(sort_separator+1)
end

This is pretty much a word-for-word transcription of the CLRS pseudocode (with basically no error checking, accordingly), and it works when called by the other flavors of SELECT I wrote, so I think the problem's in the linear-time SELECT:

def linear_select(my_list, rank)
# From CLRS 9.3
# select algorithm in worst-case linear time

group_size = 5

# 1. Divide the elements of the input array into (n/5).floor(+1) groups
groups = my_list.each_slice(group_size).to_a

# 2. Sort, get medians of each group (the median method defined above includes
# sorting)
medians = groups.each.collect{|group| group.median}

# 3. Find median of medians using linear_select recursively
# median_of_medians = linear_select(medians,medians.length/2.floor-1) # doesn't work yet
median_of_medians = medians.median

# Partition input array around median of medians using partition with pivot
# argument -- where the pivot passes the array index
new_part = partition(my_list, 0, my_list.index(median_of_medians-1), median_of_medians)

# The rest of the algorithm follows the select() archetype.
pivot = new_part + 1

if rank == pivot
    return my_list[new_part] # -1 here for zero-indexing
  elsif rank < pivot
    return(linear_select(my_list[0..(pivot - 1)], rank))
  else
    return(linear_select(my_list[pivot..-1], rank - pivot -1 ))
  end

end

I traced it manually in the interpreter and didn't get any errors. (I haven't learned how to use the debugger yet, although I burned an hour or so looking at different packages like hammertime.) In fact, thanks to the shuffling that goes on before the error appears, it works if I run it again:

[5] pry(main)> linear_select(my_list,4)
=> 4
[6] pry(main)> my_list
=> [3, 2, 4, 5, 7, 6, 10, 9, 1, 8]

I thought the error was because the upper index (third argument in partition()) was out of bounds, but I'm not clear on how that's happening.

Any help in interpreting this error, or a push in the right direction towards spotting the cause would be greatly appreciated. I feel like I'm close.

edit: for reference, here is how I implemented the Array#median method (lower median):

class Array # extends Array to include median calculation
def median
    # returns floor-median of list of values
    self.sort[((self.length - 1)/2.0).floor()]
  end
end

Solution

  • The error undefined method '-' for nil:NilClass means that the left-hand side of a subtraction operation was the value nil instead of a number. In your case, it means that the part_end argument is nil. This will happen when my_list.index(median_of_medians-1) returns nil instead of a number, which means that median_of_medians-1 was not found in the array my_list. I'm not familiar with the algorithm that you're implementing, so this is about as far as I can take you, hope it helps.

    Edit: Getting the median of an array of numbers is guaranteed to return a number that's in that array, but you seem to be assuming that the number median_of_medians-1 will also exist in the array, which seems like a pretty questionable assumption to me.