
Binary attachments are corrupted when using mailparse's ParsedMail::get_body


I am trying to parse emails saved as MIME files using Rust. I am able to extract the body and also attachments. When the attachment is a CSV file, everything works fine. When the file is a PDF or XLSX file, the saved file is corrupted. I suspect that there is a problem with the encoding because when I inspect the headers I get

Content-Type = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"

This is my code which works for CSV but not for XLSX:

extern crate base64;
extern crate mailparse;

use mailparse::*;
use std::fs::File;
use std::string::*;
use std::io::prelude::*;

fn main() {
    let mut file = File::open("test_mail").unwrap();
    let mut contents = String::new();
    let _silent = file.read_to_string(&mut contents);
    let parsed = parse_mail(contents.as_bytes()).unwrap();

    // This is the attached file
    let attached_file = parsed.subparts[2].get_body().unwrap();

    // Write the file
    let mut out_file = File::create("out_file.xlsx").unwrap();
    out_file.write_all(attached_file.as_bytes()).unwrap();

    println!("Done")
}

I am using Rust 1.23.0, Cargo 0.24.0, and I'm running this on Debian.


Solution

  • ParsedMail::get_body only works for text data (data that can be converted to a Unicode string). Use ParsedMail::get_body_raw, which returns the decoded bytes, to access binary attachments:

    // get_body_raw returns the decoded attachment as raw bytes
    let attached_file = parsed.subparts[2].get_body_raw().unwrap();

    // Write the file
    let mut out_file = File::create("out_file.xlsx").unwrap();
    out_file.write_all(&attached_file).unwrap();
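
To see why a text accessor can mangle a PDF or XLSX while leaving a CSV intact, here is a minimal std-only sketch. It does not call mailparse. It uses String::from_utf8_lossy as a stand-in for a bytes-to-string conversion, which is an assumption about the kind of conversion involved, not a description of mailparse's internals. The helper name round_trip_through_string and the sample bytes are made up for illustration.

```rust
/// Hypothetical helper: push bytes through a `String` and back.
/// Invalid UTF-8 sequences are replaced with U+FFFD, as
/// `String::from_utf8_lossy` does.
fn round_trip_through_string(bytes: &[u8]) -> Vec<u8> {
    String::from_utf8_lossy(bytes).into_owned().into_bytes()
}

fn main() {
    // CSV content is plain ASCII, which is valid UTF-8, so it survives.
    let csv = b"id,name\n1,alice\n";
    assert_eq!(round_trip_through_string(csv), csv.to_vec());

    // "%PDF" followed by bytes that are not valid UTF-8 (illustrative sample).
    let binary: &[u8] = &[0x25, 0x50, 0x44, 0x46, 0xE2, 0xE3, 0xCF, 0xD3];
    let mangled = round_trip_through_string(binary);

    // The invalid bytes were replaced, so the output no longer matches.
    assert_ne!(mangled, binary.to_vec());
    println!("{} bytes in, {} bytes out", binary.len(), mangled.len());
}
```

Any byte sequence that is not valid UTF-8 is altered by a conversion like this, which fits the symptom in the question: the ASCII CSV is written out unchanged, while binary formats come out corrupted. Working with the raw bytes avoids the conversion entirely.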