In the quest to make my data more accessible, I want to store my tabulated data in a complex hash. I am trying to grow a 'HoHoHoA' as the script loops over my data. As per the guidelines in 'perldsc':
push @ { $hash{$column[$i]}{$date}{$hour} }, $data[$i];
The script compiles and runs without a problem, but does not add any data to the hash:
print $hash{"Frequency Min"}{"09/07/08"}{"15"};
prints nothing even though the keys should exist. Running 'exists' on that key shows that it does not exist.
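For reference, here is a minimal sketch of how a push like the one above autovivifies each level, and how 'exists' behaves when the lookup key is spelled differently from the stored one. The column name, dates and values are made up for illustration; they are not taken from the real script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash;

# Hypothetical sample values; the real script takes these from the data file.
my ($column, $date, $hour) = ("COLUMN1", "09/06/2008", "06");

# Each push autovivifies the missing hash levels and the array itself.
push @{ $hash{$column}{$date}{$hour} }, 56.23;
push @{ $hash{$column}{$date}{$hour} }, 56.73;

# The leaf is an array reference, so dereference it to see the values.
print "stored: @{ $hash{$column}{$date}{$hour} }\n";

# A key spelled differently from the stored one simply does not exist.
# Check the outer level first so the test does not autovivify it.
my $other_date = "09/06/08";
if (exists $hash{$column} && exists $hash{$column}{$other_date}) {
    print "found $other_date\n";
} else {
    print "no entry for $other_date\n";
}
```

Printing the leaf without dereferencing it would show something like ARRAY(0x...) rather than the values, so a lookup that prints nothing at all usually means one of the keys does not match what was stored.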
The data file that I am reading looks like this:
DATE TIME COLUMN1 COLUMN2 COLUMN3...
09/06/2008 06:12:56 56.23 54.23 56.35...
09/06/2008 06:42:56 56.73 55.28 54.52...
09/06/2008 07:12:56 57.31 56.79 56.41...
09/06/2008 07:42:56 58.24 57.30 58.86...
.
.
.
I want to group together the values of each column in an array for any given date and hour, hence the three hashes for {COLUMN}, {DATE} and {HOUR}.
The resultant structure will look like this:
%monthData = (
    "COLUMN1" => {
        "09/06/2008" => {
            "06" => [56.23,56.73...],
            "07" => [57.31,58.24...]
        }
    },
    "COLUMN2" => {
        "09/06/2008" => {
            "06" => [54.23,55.28...],
            "07" => [56.79,57.30...]
        }
    },
    "COLUMN3" => {
        "09/06/2008" => {
            "06" => [56.35,54.52...],
            "07" => [56.41,58.86...]
        }
    }
);
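Once the data has this shape, reading one bucket back is a matter of dereferencing the innermost array. The sketch below uses the COLUMN1 sample values from the structure above, with the elided values left out, and a hypothetical helper named hour_average that is not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum);

# Sample data copied from the structure above (elided values omitted).
my %monthData = (
    "COLUMN1" => {
        "09/06/2008" => {
            "06" => [56.23, 56.73],
            "07" => [57.31, 58.24],
        },
    },
);

# hour_average is a hypothetical helper, named here for illustration only.
# It returns undef when the requested bucket is missing or empty, and it
# checks each level in turn so that a failed lookup creates no new keys.
sub hour_average {
    my ($data, $column, $date, $hour) = @_;
    return unless exists $data->{$column}
               && exists $data->{$column}{$date}
               && exists $data->{$column}{$date}{$hour};
    my @values = @{ $data->{$column}{$date}{$hour} };
    return @values ? sum(@values) / @values : undef;
}

my $avg = hour_average(\%monthData, "COLUMN1", "09/06/2008", "06");
printf "%.2f\n", $avg if defined $avg;
```

Returning undef for a missing bucket avoids a division by zero when the date or hour key does not match anything that was stored.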
Take a look at my code:
use feature 'switch';
open DATAFILE, "<", $fileName or die "Unable to open $fileName !\n";
my %monthData;
while ( my $line = <DATAFILE> ) {
    chomp $line;
    SCANROWS: given ($row) {
        when (0) { # PROCESS HEADERS
            @headers = split /\t\t|\t/, $line;
        }
        default {
            @current = split /\t\t|\t/, $line;
            my $date = $current[0];
            my ($hour,$min,$sec) = split /:/, $current[1];
            # TIMESTAMP FORMAT: dd/mm/yyyy\t\thh:mm:ss
            SCANLINE: for my $i (2 .. $#headers) {
                push @{ $monthData{$headers[$i]}{$date}{$hour} }, $current[$i];
            }
        }
    }
}
close DATAFILE;
foreach (@{ $monthData{"Active Power N Avg"}{"09/07/08"}{"06"} }) {
    $sum += $_;
    $count++;
}
$avg = $sum/$count; # $sum and $count are not initialized to begin with.
print $avg; # hence $avg is also not defined.
Hope my need is clear enough. How can I append values to an array inside these sub-hashes?
This should do it for you.
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw/sum/;

# Arithmetic mean of the arguments.
sub avg { sum(@_) / @_ }

my $fileName = shift;
open my $fh, "<", $fileName
    or die "Unable to open $fileName: $!\n";

my %monthData;

# The first line of the file holds the column names.
chomp(my @headers = split /\t+/, <$fh>);

while (<$fh>) {
    chomp;
    # Map each header to the matching field of this row.
    my %rec;
    @rec{@headers} = split /\t+/;
    my ($hour) = split /:/, $rec{TIME}, 2;
    # Every column except DATE and TIME holds data.
    for my $key (grep { not /^(DATE|TIME)$/ } keys %rec) {
        push @{ $monthData{$key}{$rec{DATE}}{$hour} }, $rec{$key};
    }
}

# Print the average for every column, date and hour.
for my $column (keys %monthData) {
    for my $date (keys %{ $monthData{$column} }) {
        for my $hour (keys %{ $monthData{$column}{$date} }) {
            my $avg = avg @{ $monthData{$column}{$date}{$hour} };
            print "average of $column for $date $hour is $avg\n";
        }
    }
}
Things to pay attention to: