I am trying to parse emails saved as MIME files using Rust. I am able to extract the body and also attachments. When the attachment is a CSV file, everything works fine. When the file is a PDF or XLSX file, the saved file is corrupted. I suspect that there is a problem with the encoding, because when I inspect the headers I get:
Content-Type = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
This is my code which works for CSV but not for XLSX:
extern crate base64;
extern crate mailparse;
use mailparse::*;
use std::fs::File;
use std::string::*;
use std::io::prelude::*;
fn main() {
    let mut file = File::open("test_mail").unwrap();
    let mut contents = String::new();
    let _silent = file.read_to_string(&mut contents);
    let parsed = parse_mail(contents.as_bytes()).unwrap();
    // This is the attached file
    let attached_file = parsed.subparts[2].get_body().unwrap();
    // Write the file
    let mut out_file = File::create("out_file.xlsx").unwrap();
    out_file.write_all(attached_file.as_bytes()).unwrap();
    println!("Done")
}
I am using Rust 1.23.0, Cargo 0.24.0, and I'm running this on Debian.
ParsedMail::get_body only works for text data (data that can be converted to a Unicode string). To access binary attachments, use ParsedMail::get_body_raw instead:
let attached_file = parsed.subparts[2].get_body_raw().unwrap();
// Write the file
let mut out_file = File::create("out_file.xlsx").unwrap();
out_file.write_all(&attached_file).unwrap();
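To see why CSV attachments survive while XLSX and PDF attachments do not, here is a minimal standard-library sketch of what can happen when binary data is forced through a string. The helper `roundtrip_through_string` and the sample bytes are made up for illustration, and the lossy conversion is only an assumption about the general failure mode, not a description of how mailparse converts bodies internally.

```rust
// Hypothetical helper (not part of mailparse): push bytes through a String,
// as a text-oriented API might, and take the resulting bytes back out.
// Invalid UTF-8 sequences are replaced with U+FFFD, which changes the data.
fn roundtrip_through_string(bytes: &[u8]) -> Vec<u8> {
    String::from_utf8_lossy(bytes).into_owned().into_bytes()
}

fn main() {
    // CSV content is plain ASCII, which is valid UTF-8, so it is unchanged.
    let csv: &[u8] = b"id,name\n1,Alice\n";
    assert_eq!(roundtrip_through_string(csv).as_slice(), csv);

    // XLSX files are ZIP archives. These made-up bytes start with the ZIP
    // "PK" signature and continue with bytes that are not valid UTF-8.
    let binary: &[u8] = &[0x50, 0x4B, 0x03, 0x04, 0xFF, 0xFE, 0x80];
    let mangled = roundtrip_through_string(binary);

    // The bytes that come back differ from the bytes that went in.
    assert_ne!(mangled.as_slice(), binary);
    println!("{} bytes in, {} bytes out", binary.len(), mangled.len());
}
```

Keeping the attachment as bytes from start to finish, as get_body_raw does, avoids this kind of conversion entirely.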