I am a Perl beginner. I am trying to build a 2D array at run time from a binary file, and I am getting an "out of memory" error. I am using Perl 5.16.3 on Windows 7. My input file is ~4.2MB. My system has 4GB of physical memory; when I run this code, usage climbs to 90% and then the out of memory error appears.
I have tried a lot of ways to debug this. The code only runs successfully if I reduce the b32 to b16 or less, and even then the error shows up again once the file size grows beyond 4MB. Watching physical memory usage in Task Manager while the code executes, I can see it keeps increasing.
My friend suspects a memory leak, but I could not confirm that. I need help fixing this.
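#!/usr/bin/perl
use strict;
use warnings;

# Three-argument open with a lexical filehandle; :raw replaces the separate binmode call
open( my $fh, '<:raw', 'debug.bin' ) or die "Unable to open: $!";
my ( $data, $n );
my @2dmatrix;
# Read 4 bytes at a time; each row holds the 32 bits as separate one-character strings
while ( $n = read $fh, $data, 4 ) {
    push @2dmatrix, [ split( '', unpack( 'b32', $data ) ) ];
}
print scalar(@2dmatrix), "\n";
print "completed reading\n";
close($fh);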
Just to clarify the requirement: from the 2D array I build, I need to extract the contents of a column A wherever a particular pattern (11111111000000001111111100000000) appears in column B. This needs to be done on 4 sets of columns, with a file size of 500MB.
It's not a memory leak; your program is just very inefficient with memory use.
For every 4 bytes you read in, you do an unpack 'b32', which creates a 32-character string; split // it, which turns it into 32 1-character strings; make an arrayref of the resulting list; and push the arrayref on @2dmatrix. That results in, among other per-row costs:
32 string buffers of at least 2 bytes each ("0\0" or "1\0"), although perl might decide to use more to avoid reallocations if the strings grow: 64 bytes.
One slot in @2dmatrix's array body: 4 bytes on 32-bit, 8 bytes on 64-bit.
With a result of 1136 bytes per 4 bytes (284x multiplication) on 32-bit and 1672 bytes per 4 bytes (418x multiplication) on 64-bit, not accounting for constant factors and the fact that perl might choose to use larger string bodies (on two versions of perl I tested here, I got either 10 or 16 bytes, not 2). As such, your program will use upwards of 1.1GB of memory for a 4.2MB input on a 32-bit system, and upwards of 1.7GB of memory for a 4.2MB input on a 64-bit system.
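To make the arithmetic behind those totals explicit, here is a small sketch that multiplies the per-row figures quoted above by the number of 4-byte rows in a 4.2MB file. The helper name estimate_gb is made up for illustration, and 1MB is assumed to mean 10**6 bytes; real usage will differ between perl builds.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Per-row costs quoted in the estimate above (assumed, not measured here).
my %bytes_per_row = ( '32-bit' => 1136, '64-bit' => 1672 );
my $input_bytes   = 4.2e6;               # ~4.2MB input, assuming 1MB = 10**6 bytes
my $rows          = $input_bytes / 4;    # one row is pushed per 4 bytes read

# Hypothetical helper: total memory in GB for a given per-row cost.
sub estimate_gb {
    my ($per_row) = @_;
    return $rows * $per_row / 1e9;
}

for my $arch ( sort keys %bytes_per_row ) {
    printf "%s: %dx blow-up, roughly %.2f GB\n",
        $arch, $bytes_per_row{$arch} / 4, estimate_gb( $bytes_per_row{$arch} );
}
```

Under these assumptions the totals land just above the 1.1GB and 1.7GB lower bounds given above.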
The solution here is to store and access the data in a more efficient way, but I can't give any specific advice because you haven't said what you're actually trying to do with @2dmatrix once you have it.
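As one possible illustration of "a more efficient way" (a sketch, not necessarily the right design for your task), the whole file can be kept as a single packed string and individual bits looked up with Perl's built-in vec, which uses the same bit ordering as unpack 'b'. The names slurp_bits and bit_at are hypothetical helpers, and the demo bytes are invented test data rather than anything from debug.bin.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a binary file into one scalar,
# which costs about one byte of memory per input byte.
sub slurp_bits {
    my ($path) = @_;
    open( my $fh, '<:raw', $path ) or die "Unable to open $path: $!";
    local $/;              # undefined record separator: read the whole file
    my $buf = <$fh>;
    close($fh);
    return $buf;
}

# Hypothetical helper: bit at ($row, $col), where each row is one 4-byte
# group and columns follow the same order as unpack('b32', ...).
# The buffer is passed by reference so a large string is not copied.
sub bit_at {
    my ( $buf_ref, $row, $col ) = @_;
    return vec( $$buf_ref, $row * 32 + $col, 1 );
}

# Demo on invented in-memory data: two rows of 4 bytes each.
my $demo = pack( 'C*', 0x01, 0x80, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x02 );
my $row0 = join '', map { bit_at( \$demo, 0, $_ ) } 0 .. 31;
print "row 0: $row0\n";
print "matches unpack: ", ( $row0 eq unpack( 'b32', substr( $demo, 0, 4 ) ) ? "yes" : "no" ), "\n";
```

With a real file you would call bit_at( \$buf, $row, $col ) after my $buf = slurp_bits('debug.bin'). A column is then read by looping over rows with a fixed $col, without ever creating one scalar per bit. For inputs too large to hold in memory at once, the same lookup could be applied to fixed-size chunks read in a loop.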