I have a method that compresses a byte array. For testing, I wrote one version that uses a MemoryStream and one that uses a FileStream. The result from the MemoryStream version is larger, even though the compression logic is the same. Can anyone explain why?
public byte[] DeflateCompress(byte[] data2Compress)
{
    using (FileStream _fileToCompress = File.Create("_deflatecompressed.bin"))
    {
        using (DeflateStream _compressionStream = new DeflateStream(_fileToCompress, CompressionMode.Compress))
        {
            _compressionStream.Write(data2Compress, 0, data2Compress.Length);
            _compressionStream.Close();
        }
    }
    return File.ReadAllBytes("_deflatecompressed.bin");
}
public byte[] DeflateCompress(byte[] data2Compress)
{
    using (MemoryStream _memStreamCompress = new MemoryStream())
    {
        using (DeflateStream _defalteStreamCompress = new DeflateStream(_memStreamCompress, CompressionMode.Compress))
        {
            _defalteStreamCompress.Write(data2Compress, 0, data2Compress.Length);
            _defalteStreamCompress.Close();
        }
        return _memStreamCompress.GetBuffer();
    }
}
If I write the returned byte array to a file, the one produced by the MemoryStream version is larger.
MemoryStream.GetBuffer() returns the full internal buffer, which can be larger than the data written to it. The buffer is resized in chunks as needed: when you exceed its capacity, the internal buffer size is doubled.
If you need to convert the MemoryStream to a byte array containing only the data, use MemoryStream.ToArray(). It creates a new array of exactly the right size and copies the relevant buffer contents into it.
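Applied to the method from the question, the change could look like the sketch below. The class name DeflateSketch and the shortened variable names are illustrative only and not part of the original code; the one relevant difference is ToArray() in place of GetBuffer().

```csharp
using System.IO;
using System.IO.Compression;

public static class DeflateSketch
{
    // Compresses the input and returns only the bytes that were actually written.
    public static byte[] DeflateCompress(byte[] data2Compress)
    {
        using (MemoryStream memStream = new MemoryStream())
        {
            using (DeflateStream deflateStream = new DeflateStream(memStream, CompressionMode.Compress))
            {
                deflateStream.Write(data2Compress, 0, data2Compress.Length);
            } // disposing the DeflateStream flushes the remaining compressed data

            // ToArray() still works after the MemoryStream has been closed
            // and copies exactly Length bytes, not the whole internal buffer.
            return memStream.ToArray();
        }
    }
}
```

With this version, the returned array should have the same length as the file written by the FileStream version.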
As MSDN puts it:
Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method; however, ToArray creates a copy of the data in memory.
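The difference described in the quote can be checked with a small sketch like the following. BufferLengthSketch and Sizes are hypothetical names introduced here for illustration, and the sketch assumes a MemoryStream created with the default constructor.

```csharp
using System.IO;
using System.Text;

public static class BufferLengthSketch
{
    // Writes the text into a fresh MemoryStream and returns three sizes:
    // [0] stream.Length, [1] GetBuffer().Length, [2] ToArray().Length.
    public static int[] Sizes(string text)
    {
        byte[] payload = Encoding.ASCII.GetBytes(text);
        using (MemoryStream stream = new MemoryStream())
        {
            stream.Write(payload, 0, payload.Length);
            return new int[]
            {
                (int)stream.Length,        // number of bytes actually written
                stream.GetBuffer().Length, // size of the internal buffer (the capacity)
                stream.ToArray().Length    // size of the trimmed copy, equal to Length
            };
        }
    }
}
```

Calling Sizes("test") should give 4 for the first and third values. The second value is the allocated capacity, which is 256 according to the quoted documentation, although the exact figure depends on the runtime's growth policy.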
The GetBuffer method is useful when you want to read a chunk from the buffer and don't care whether its size exactly matches the data. ToArray is slower, since it has to copy the whole buffer contents on each call, while GetBuffer simply returns a reference to the internal buffer.
For instance, GetBuffer can be useful with methods such as Stream.Write:
public abstract void Write(
    byte[] buffer,
    int offset,
    int count
)
Many places in the framework have overloads like this, which take a buffer but process only the chunk given by offset and count.
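As a rough sketch of that pattern, the valid part of the internal buffer can be passed on without an intermediate copy. BufferSliceSketch and CopyValidBytes are made-up names for illustration. The sketch assumes the MemoryStream was created with the default constructor (so its data starts at index 0 and GetBuffer is permitted) and that it is still open, because Length cannot be read from a closed stream.

```csharp
using System.IO;

public static class BufferSliceSketch
{
    // Writes only the valid bytes of the source's internal buffer to the destination.
    public static void CopyValidBytes(MemoryStream source, Stream destination)
    {
        // GetBuffer() returns a reference to the internal array without copying it.
        byte[] buffer = source.GetBuffer();

        // Length, not buffer.Length, marks where the real data ends.
        destination.Write(buffer, 0, (int)source.Length);
    }
}
```

For this particular case, MemoryStream.WriteTo(Stream) does the same job. The sketch only shows how the offset and count arguments keep the unused tail of the buffer out of the output.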