I've seen some really beautiful examples of Ruby and I'm trying to shift my thinking to be able to produce them instead of just admire them. Here's the best I could come up with for picking a random line out of a file:
def pick_random_line
  random_line = nil
  File.open("data.txt") do |file|
    file_lines = file.readlines
    random_line = file_lines[Random.rand(0...file_lines.size)]
  end
  random_line
end
I feel like it's gotta be possible to do this in a shorter, more elegant way without storing the entire file's contents in memory. Is there?
You can do it without storing anything except the most recently read line and the current candidate for the returned random line:
def pick_random_line
  chosen_line = nil
  File.foreach("data.txt").each_with_index do |line, number|
    chosen_line = line if rand < 1.0 / (number + 1)
  end
  chosen_line
end
So the first line is chosen with probability 1/1 = 1; the second line is chosen with probability 1/2, so half the time it keeps the first one and half the time it switches to the second.
Then the third line is chosen with probability 1/3, so 1/3 of the time it picks it, and the other 2/3 of the time it keeps whichever one of the first two it picked. Since each of them had a 50% chance of being chosen as of line 2, they each wind up with a 2/3 * 1/2 = 1/3 chance of being chosen as of line 3.
And so on. At line N, each of lines 1 through N has an equal 1/N chance of being chosen, and that holds all the way through the file (as long as the file isn't so huge that 1/(number of lines in file) is less than epsilon :)). And you only make one pass through the file and never store more than two lines at once.
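If you want to convince yourself of the 1/N claim, one option is to run the same selection rule many times over a small in-memory list and count how often each item wins. This is only a rough sketch: `reservoir_pick`, the four sample items, and the trial count are hypothetical names and values made up for the illustration, and the seeded `Random` is there just to make runs repeatable.

```ruby
# Hypothetical helper: applies the same rule as pick_random_line to any
# enumerable, so it can be exercised without a file on disk.
def reservoir_pick(lines, rng = Random.new)
  chosen = nil
  lines.each_with_index do |line, index|
    # Replace the current candidate with probability 1/(index + 1).
    chosen = line if rng.rand < 1.0 / (index + 1)
  end
  chosen
end

# Count how often each of four items wins over many trials.
rng = Random.new(42)
items = %w[a b c d]
trials = 40_000
counts = Hash.new(0)
trials.times { counts[reservoir_pick(items, rng)] += 1 }

# Each share should land near 0.25 if the selection is uniform.
counts.sort.each do |item, count|
  puts format("%s: %.3f", item, count.to_f / trials)
end
```

Passing the generator in as an argument is just a convenience for testing; the original method's bare `rand` behaves the same way.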
EDIT: You're not going to get a really concise solution with this algorithm, but you can turn it into a one-liner if you want to:
def pick_random_line
  File.foreach("data.txt").each_with_index.reduce(nil) { |picked, pair|
    rand < 1.0 / (1 + pair[1]) ? pair[0] : picked }
end
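If the `pair[0]` / `pair[1]` indexing bothers you, the block arguments can destructure the pair instead. The following is a sketch of that variant, not a different algorithm; `pick_random_line_from`, its `path` parameter, and the temporary file are assumptions added so the example runs on its own.

```ruby
require "tempfile"

# Hypothetical variant of the one-liner: the path is a parameter, and the
# [line, index] pair is destructured in the block arguments.
def pick_random_line_from(path)
  File.foreach(path).each_with_index.reduce(nil) do |picked, (line, index)|
    rand < 1.0 / (index + 1) ? line : picked
  end
end

# Usage with a throwaway file so the example does not depend on data.txt.
sample_lines = ["one\n", "two\n", "three\n"]
picked = nil
Tempfile.create("lines") do |file|
  file.write(sample_lines.join)
  file.flush
  picked = pick_random_line_from(file.path)
end
puts picked
```

Note that the returned line still carries its trailing newline, as it does in the versions above; call `chomp` on it if that matters.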