I'm trying to upload a byte vector to cloud storage.
This byte vector should be a compressed archive. To achieve this, I need to obtain a Vec<u8>
by reading the compressed archive that I have created. I know that gzipped files do not contain their size, and when I try to read the file normally I don't get all the bytes.
It seems that it only reads the header, because the resulting vector is 10 bytes long.
use std::io::Read;

fn main() {
    // Creates the archive and compresses it.
    let file = std::fs::File::create("example.tar.gz").unwrap();
    let encoder = flate2::write::GzEncoder::new(file, flate2::Compression::default());
    let mut archive = tar::Builder::new(encoder);
    archive.append_dir_all("example_dir", "path/to/example_dir").unwrap();
    archive.finish().unwrap();

    // I see that this does not work since it reads a wrong length.
    // But I don't know how to achieve it.
    let example_bytes: Vec<u8> = std::fs::read("example.tar.gz").unwrap();
    dbg!(example_bytes.len());

    // Corrupt
    std::fs::write("rewritten.tar.gz", example_bytes).unwrap();
}
If I try with BufReader,
use std::io::{Read, Seek}; // `rewind` comes from `Seek`

let file = std::fs::File::open("example.tar.gz").unwrap();
let mut file = std::io::BufReader::new(file);
let mut bytes = Vec::new();
file.rewind().unwrap();
file.read_to_end(&mut bytes).unwrap();

// Corrupt
// The resulting file is not 10 bytes this time, but it is
// 392 bytes shorter than the original.
// The corrupt file ends with the sequence
// FF D3 E5 FF 3B F6 5F A3 F8, if that means something.
std::fs::write("rewritten.tar.gz", bytes).unwrap();
Is there a way to get the raw bytes of this compressed archive so I can upload it to cloud storage?
archive.finish().unwrap();
That is not sufficient. The documentation for tar::Builder::finish says:
"This function should only be called when the archive has been written entirely and if an I/O error happens the underlying object still needs to be acquired."
All finish does is write out two empty records (which signal the end of the archive) and then set the finished flag on the Builder.
You need to call into_inner to get the GzEncoder back out of the Builder, then call finish() on it to complete the gzip stream, following which you need to flush / close the file itself.
In fact, calling tar::Builder::finish is unnecessary, since tar::Builder::into_inner does that for you if the archive is not yet finish-ed.