Problem converting a text file into a binary file using Perl


I would like to convert some data to binary using Perl. The data needs to be output as 8-bit binary values, one byte per number. The original data comes in this format:

137.0000
136.0000
133.0000
136.0000
10.0000
134.0000
0.0000
132.0000
132.0000

To do so, I strip the ".0000" suffix from each line, then call the pack function with the C* template (according to the documentation, this format corresponds to an "unsigned character (usually 8 bits)"). I called this file txt2bin.pl:

my $file = $ARGV[0];
my $fileout = $file.".bin";

if($file eq "-h" or $file eq "-help" or $file eq "")
{  
    print "Usage : txt2bin.pl file_in\n";
    print "        file_out = file_in.bin\n";
    print "This script converts a txt file to a binary file\n";

}

else
{
  print "File in = $file\n";
  print "File out = $fileout\n";

  open(FILE,"<$file") or die $!;
  open(FILEOUT,">$fileout") or die $!;
    binmode FILEOUT;
    while(defined(my $line=<FILE>))
    {
      chomp($line);
      $line =~ s/\.0000//;
      syswrite FILEOUT, pack("C*",$line);
    }
  close(FILE);
  close(FILEOUT);
}

I also need to be able to do the reverse operation, so I created another file, bin2txt.pl:

my $file = $ARGV[0];
my $fileout = $file.".txt";

if($file eq "-h" or $file eq "-help" or $file eq "")
{
    print "Usage : bin2txt.pl file_in\n";
    print "        file_out = file_in.txt\n";
    print "This script converts a binary file to a txt file\n";
}

else
{
  print "File in = $file\n";
  print "File out = $fileout\n";

  my $file = "<$file";

  # undef $/ to read whole file in one go
  undef $/;

  open(FILE,$file) or die $!;
  open(FILEOUT,">$fileout") or die $!;

  # binmode FILE to suppress conversion of line endings
  binmode FILE;

  my $data = <FILE>;
  $data =~ s/(.{1})/unpack("C*",$1).".0000 \n"/eg;
  syswrite FILEOUT, $data;
}

However, when I run the first program, txt2bin.pl, and then the second, I should get:

137.0000
136.0000
133.0000
136.0000
10.0000
134.0000
0.0000
132.0000
132.0000

Instead, I get this:

137.0000
136.0000
133.0000
136.0000

134.0000
0.0000
132.0000
132.0000

The 10.0000 does not show up. Does anyone have any idea why? Thanks for helping.


Solution

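  • The value 10 is stored as byte 10, which is the newline character, and by default . does not match a newline. You need to add the s modifier to the regexp substitution so that . matches that byte too: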

    $data =~ s/(.{1})/unpack("C*",$1).".0000 \n"/seg;
    

    From perldoc perlre:

    s
    Treat the string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.
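
To make the effect of the s modifier concrete, here is a minimal sketch that builds the same bytes txt2bin.pl would write for the sample input and counts how many of them . matches with and without /s. It also shows one possible way to decode the bytes without a regex, by handing the whole string to unpack. The helper name bytes_to_lines is hypothetical and not part of the original scripts; it is only an illustration, not the one correct way to write bin2txt.pl.

```perl
use strict;
use warnings;

# The bytes txt2bin.pl would write for the sample input.
my @values = (137, 136, 133, 136, 10, 134, 0, 132, 132);
my $data   = pack("C*", @values);

# Without /s, "." does not match the newline byte (value 10).
my $without_s = () = $data =~ /./g;

# With /s, "." matches every byte, including the newline.
my $with_s = () = $data =~ /./sg;

# Hypothetical helper (not in the original scripts): decode every byte
# with a single unpack call, so no regex is involved at all.
sub bytes_to_lines {
    my ($bytes) = @_;
    return map { sprintf("%d.0000", $_) } unpack("C*", $bytes);
}

my @lines = bytes_to_lines($data);

print "matches without /s: $without_s\n";
print "matches with /s:    $with_s\n";
print "$_\n" for @lines;
```

The first count comes out as 8 and the second as 9, which is exactly the missing 10.0000 line. Decoding with unpack avoids the problem entirely, because unpack never treats byte 10 specially.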