Type mismatch when cloning the lines of the trait BufRead


To get better at Rust, I've decided to implement a simple lexer that analyzes documents line by line.

As I have to iterate at least two times over the lines of the trait BufRead, I am cloning the lines of my BufRead but I get the following error:

error[E0271]: type mismatch resolving `<std::io::Lines<T> as std::iter::Iterator>::Item == &_`
  --> <anon>:18:23
   |
18 |     let lines = lines.cloned();
   |                       ^^^^^^ expected enum `std::result::Result`, found reference
   |
   = note: expected type `std::result::Result<std::string::String, std::io::Error>`
   = note:    found type `&_`

error[E0271]: type mismatch resolving `<std::io::Lines<T> as std::iter::Iterator>::Item == &_`

I understand what the error is, but based on the following code, how can I tell the compiler what the Item of the Iterator should be so it can correctly cast the type?

use std::fmt::Write;
use std::io::{BufRead, BufReader, Lines, Read};

pub struct DocumentMetadata {
    language: String,
    // ...
}

pub fn analyze<T: BufRead>(document: T) -> Result<DocumentMetadata, ()> {
    let lines = document.lines();
    let language = guess_language(&lines);

    // Do more lexical analysis based on document language

    Ok(DocumentMetadata {
        language: language,
        // ...
    })
}

fn guess_language<T: BufRead>(lines: &Lines<T>) -> String {
    let lines = lines.cloned();
    for line in lines {
        let line = line.unwrap();
        // Try to guess language
    }
    "en".to_string()
}

#[test]
fn it_guesses_document_language() {
    let mut document = String::new();
    writeln!(&mut document, "# language: en").unwrap();
    let document = BufReader::new(document.as_str().as_bytes());

    match analyze(document) {
        Ok(metadata) => assert_eq!("en".to_string(), metadata.language),
        Err(_) => panic!(),
    }
}

For unit testing purposes, I am building the buffer from a String, but in normal usage I read it from a File.


Solution

  • Review the Iterator::cloned definition:

    fn cloned<'a, T>(self) -> Cloned<Self> 
        where Self: Iterator<Item=&'a T>, 
              T: 'a + Clone
    

    And the implementation of Iterator for io::Lines:

    impl<B: BufRead> Iterator for Lines<B> {
        type Item = Result<String>; // io::Result<String>, i.e. Result<String, io::Error>
    }
    

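    You cannot use cloned because it requires the iterator's items to be references (Item = &'a T), whereas Lines yields owned Result<String, io::Error> values. You cannot "tell" the compiler otherwise; the Item type is fixed by the Iterator implementation, and that's not how types work.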
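
As a minimal sketch of the difference, the two functions below contrast an iterator of owned items with an iterator of references. The helper names `read_lines` and `clone_lines` are hypothetical and exist only for this illustration.

```rust
use std::io::{self, BufRead};

/// Reads every line of `reader` into owned `String`s.
/// (`read_lines` is a hypothetical helper name.)
fn read_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    // `Lines<R>` yields `io::Result<String>`: the items are owned values,
    // so `cloned()` does not apply here. Collecting into a `Result`
    // stops at the first I/O error.
    reader.lines().collect()
}

/// Clones lines that are already held in memory.
/// (`clone_lines` is a hypothetical helper name.)
fn clone_lines(lines: &[String]) -> Vec<String> {
    // `iter()` on a slice yields `&String`, which is exactly the
    // `Item = &'a T` shape that `cloned()` requires.
    lines.iter().cloned().collect()
}
```

Since `&[u8]` implements `BufRead`, something like `read_lines("a\nb\n".as_bytes())` should be enough to try this without a file.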

    "As I have to iterate at least two times over the lines of the trait BufRead, I am cloning the lines of my BufRead"

    That doesn't really make sense. Cloning the lines of the reader wouldn't save anything. In fact, it would probably just make things worse. You'd be creating the strings once, not using them except for cloning them, then creating them a third time when you iterate again.

    If you wish to avoid recreating all the strings, collect all the strings into a Vec or other collection and then iterate over that multiple times:

    pub fn analyze<T: BufRead>(document: T) -> Result<DocumentMetadata, ()> {
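        // Read every line once; collecting into a Result stops at the first I/O error
        let lines: Result<Vec<_>, _> = document.lines().collect();
        let lines = lines.unwrap();
        // Pass a reference so the Vec can be iterated again afterwards
        let language = guess_language(&lines);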
    
        // Do more lexical analysis based on document language
    
        Ok(DocumentMetadata {
            language: language,
            // ...
        })
    }
    
    fn guess_language<'a, I>(lines: I) -> String 
        where I: IntoIterator<Item = &'a String>,
    {
        for line in lines {
            // Try to guess language
        }
        "en".to_string()
    }
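
For completeness, here is a self-contained sketch of the collect-then-iterate approach with two passes over the same lines. Several details are assumptions rather than part of the original code: the `line_count` field is hypothetical and only demonstrates a second pass, the `# language: xx` header convention is inferred from the test in the question, and `analyze` returns `io::Result` so that `?` can propagate read errors instead of calling `unwrap`.

```rust
use std::io::{self, BufRead};

pub struct DocumentMetadata {
    pub language: String,
    // Hypothetical field, present only to demonstrate a second pass.
    pub line_count: usize,
}

pub fn analyze<T: BufRead>(document: T) -> io::Result<DocumentMetadata> {
    // Read every line once; `?` returns the first I/O error to the caller.
    let lines = document.lines().collect::<io::Result<Vec<String>>>()?;

    // First pass: borrow the lines to guess the language.
    let language = guess_language(&lines);

    // Second pass: the vector is still owned here, so it can be reused.
    let line_count = lines.iter().filter(|line| !line.trim().is_empty()).count();

    Ok(DocumentMetadata { language, line_count })
}

// Assumed convention: a header line of the form `# language: xx`.
// Falls back to "en" when no such header is found.
fn guess_language<'a, I>(lines: I) -> String
where
    I: IntoIterator<Item = &'a String>,
{
    for line in lines {
        if let Some(rest) = line.strip_prefix("# language:") {
            return rest.trim().to_string();
        }
    }
    "en".to_string()
}
```

Because `guess_language` only needs `IntoIterator<Item = &String>`, passing `&lines` borrows the vector and leaves it available for any later analysis pass.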