
How do I get a random line from a file?


I'm trying to get a random line from a file:

extern crate rand;

use rand::Rng;
use std::{
    fs::File,
    io::{prelude::*, BufReader},
};

const FILENAME: &str = "/etc/hosts";

fn find_word() -> String {
    let f = File::open(FILENAME).expect(&format!("(;_;) file not found: {}", FILENAME));
    let f = BufReader::new(f);

    let lines: Vec<_> = f.lines().collect();

    let n = rand::thread_rng().gen_range(0, lines.len());
    let line = lines
        .get(n)
        .expect(&format!("(;_;) Couldn't get {}th line", n))
        .unwrap_or(String::from(""));

    line
}

This code doesn't work:

error[E0507]: cannot move out of borrowed content
  --> src/main.rs:18:16
   |
18 |       let line = lines
   |  ________________^
19 | |         .get(n)
20 | |         .expect(&format!("(;_;) Couldn't get {}th line", n))
   | |____________________________________________________________^ cannot move out of borrowed content

I tried adding .clone() before .expect(...) and before .unwrap_or(...) but it gave the same error.

Is there a better way to get a random line from a file that doesn't involve collecting the whole file in a Vec?


Solution

  • Use IteratorRandom::choose to randomly sample from an iterator using reservoir sampling. This scans through the entire file once and allocates a String for each line, but it never holds all of the lines in one giant vector:

    use rand::seq::IteratorRandom; // 0.7.3
    use std::{
        fs::File,
        io::{BufRead, BufReader},
    };
    
    const FILENAME: &str = "/etc/hosts";
    
    fn find_word() -> String {
        let f = File::open(FILENAME)
            .unwrap_or_else(|e| panic!("(;_;) file not found: {}: {}", FILENAME, e));
        let f = BufReader::new(f);
    
        let lines = f.lines().map(|l| l.expect("Couldn't read line"));
    
        lines
            .choose(&mut rand::thread_rng())
            .expect("File had no lines")
    }
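
To make the idea of reservoir sampling concrete, here is a minimal sketch for a sample size of one. It is not rand's actual implementation. The function name reservoir_choose and the pick parameter are hypothetical: pick(n) is assumed to return a uniformly random index in 0..n, so the random source can be swapped out.

```rust
/// Picks one item uniformly at random from `iter` in a single pass.
/// `pick(n)` is a hypothetical helper that must return a uniform index in `0..n`.
fn reservoir_choose<I, F>(iter: I, mut pick: F) -> Option<I::Item>
where
    I: Iterator,
    F: FnMut(usize) -> usize,
{
    let mut chosen = None;
    for (seen, item) in iter.enumerate() {
        // The (seen + 1)-th item replaces the current choice
        // with probability 1 / (seen + 1).
        if pick(seen + 1) == 0 {
            chosen = Some(item);
        }
    }
    // `None` only when the iterator was empty.
    chosen
}
```

With rand 0.7 you could pass something like |n| rand::thread_rng().gen_range(0, n) as pick. Only the currently chosen item is kept alive, which is why memory use does not grow with the number of lines.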
    

    Your original problem is that:

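    1. slice::get returns an Option<&T>, that is, an optional reference into the vector. unwrap_or takes its receiver by value, so calling it on that reference tries to move the element out of the borrowed vector.

      You can either clone the element or take ownership of it:

      let line = lines[n].clone();

      let line = lines.swap_remove(n);

      Cloning only works once the elements are Strings, because io::Result<String> does not implement Clone (io::Error is not Clone). swap_remove requires lines to be declared with let mut.

      Both of these panic if n is out-of-bounds, which is reasonable here as you know that you are in bounds.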

    2. BufRead::lines returns an iterator of io::Result<String>, not of String, so you have to handle that error case for each line.
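
Putting both points together, a minimal sketch of the Vec-based approach might look like the following. The function name random_line_from_vec and the pick parameter are hypothetical: pick(len) is assumed to return an index in 0..len, for example from gen_range.

```rust
use std::io::{self, BufRead};

/// Reads every line from `reader` into a vector, then moves one line out.
/// `pick(len)` is a hypothetical helper that must return an index in `0..len`.
fn random_line_from_vec<R, F>(reader: R, pick: F) -> io::Result<Option<String>>
where
    R: BufRead,
    F: FnOnce(usize) -> usize,
{
    // Collecting into io::Result<Vec<String>> stops at the first read error,
    // so the vector holds plain Strings rather than Results.
    let mut lines: Vec<String> = reader.lines().collect::<io::Result<_>>()?;
    if lines.is_empty() {
        return Ok(None);
    }
    let n = pick(lines.len());
    // swap_remove moves the String out of the vector without cloning it.
    Ok(Some(lines.swap_remove(n)))
}
```

This still reads the whole file into memory, so it only illustrates how to fix the borrow error. The iterator-based version above avoids the vector entirely.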

    Additionally, don't use format! with expect:

    expect(&format!("..."))
    

    This will unconditionally allocate memory. When there's no failure, that allocation is wasted. Use unwrap_or_else as shown.