I have to process thousands of binary files (each 16 MB) by reading them in pairs and creating a bit-level data structure (usually a 1x134217728 array) so that I can process the data at the bit level.
Currently I am doing it the following way:
conv = @(c) uint8(bitget(c, 1:32));          % expand one 32-bit value into its 32 bits
measurement = NaN(1, sizeOfMeasurements*8);  % preallocate, size (1,134217728)
fid = fopen(fileName, 'rb');
byteContent = fread(fid, 'uint32');
fclose(fid);
bitRepresentation1 = arrayfun(conv, byteContent, 'UniformOutput', false);
measurement = [bitRepresentation1{:}];
Thus, I replaced fopen with memmapfile as below:
m = memmapfile(fileName,'Format',{'uint32', [4194304 1], 'byteContent'});
byteContent = m.data.byteContent;
byteContent = double(byteContent);
I printed timing information (using tic/toc) for the individual instructions, and it turns out that the bottleneck is:
bitRepresentation1 = arrayfun(conv, byteContent, 'UniformOutput', false); % see first line of code for conv
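For reference, a minimal sketch of the kind of tic/toc timing used (the exact variables measured and the print format are guesses, not my original script):
tic;
bitRepresentation1 = arrayfun(conv, byteContent, 'UniformOutput', false);
tConv = toc;
tic;
measurement = [bitRepresentation1{:}];
tConcat = toc;
fprintf('arrayfun: %.2f s, concatenation: %.2f s\n', tConv, tConcat);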
Are there more efficient ways of transforming byteContent into an array that stores one bit per index (i.e. a bit representation of byteContent)?
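As a toy illustration of the layout I am after (made-up value, not from my data): each 32-bit value expands into 32 consecutive entries, least-significant bit first, which is exactly what bitget(c, 1:32) produces:
c = 6;                           % binary ...0110
bits = uint8(bitget(c, 1:32));   % 1x32 vector: [0 1 1 0 0 0 ...]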
Let the looping over all the numbers be handled by bitget; you only loop over the 64 bits:
fid = fopen(fileName, 'rb');
bitContent = fread(fid, '*ubit64');            % read the file as unsigned 64-bit words
fclose(fid);
conv = @(ii) uint8(bitget(bitContent, ii));    % bit ii of every word at once, vectorized
bitRepresentation = arrayfun(conv, 1:64, 'UniformOutput', false);   % only 64 iterations
measurement = [bitRepresentation{:}]';         % 64-by-N matrix: one column of bits per word
measurement = measurement(:).';                % flatten to a single row, bits in file order
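To double-check the ordering: the first 64 entries of measurement should be the bits of the first word, least-significant bit first:
x = bitContent(1);
assert(isequal(measurement(1:64), uint8(bitget(x, 1:64))));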
EDIT: you can also try a direct loop:
fid = fopen(fileName, 'rb');
bitContent = fread(fid,'*ubit64');
fclose(fid);
sz = 64 * size(bitContent,1);
measurement3 = zeros(1, sz, 'uint8');
weave = 1:64:sz;                  % start index of each word's block of 64 bits
for ii = 1:64
    measurement3(weave + ii - 1) = uint8(bitget(bitContent, ii));
end
but on my system, that is (surprisingly) slower than the arrayfun version... but then, my MATLAB version is from the stone age, so your mileage may differ. Give it a try.
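If you want to compare the two variants on your own data, a rough timing harness along these lines should do (fileName is assumed to point at one of your 16 MB files; the rest is just the code from above wrapped in tic/toc):
fid = fopen(fileName, 'rb');
bitContent = fread(fid, '*ubit64');
fclose(fid);

tic;                                             % arrayfun-over-bits version
conv = @(ii) uint8(bitget(bitContent, ii));
bitRepresentation = arrayfun(conv, 1:64, 'UniformOutput', false);
measurement = [bitRepresentation{:}]';
measurement = measurement(:).';
tArrayfun = toc;

tic;                                             % direct-loop version
sz = 64 * size(bitContent,1);
measurement3 = zeros(1, sz, 'uint8');
weave = 1:64:sz;
for ii = 1:64
    measurement3(weave + ii - 1) = uint8(bitget(bitContent, ii));
end
tLoop = toc;

fprintf('arrayfun: %.3f s, loop: %.3f s, results equal: %d\n', ...
    tArrayfun, tLoop, isequal(measurement, measurement3));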