I'm trying to implement a multi-dimensional table with headers.
Here's an example for 2D:
< dimension1 >
/\ 'column0' 'column1'
dimension0 'row0' data00 data10
\/ 'row1' data01 data11
The headers for rows and columns are text, and the data is anything. I want to be able to do something like this (syntax can be different, I'm beginner in Perl):
my $table = new table(2); # 2 is the number of dimensions
# the following line creates a new row/column if it didn't exist previously
$table['row0']['column0'] = data00;
$table['row0']['column1'] = data01;
$table['row1']['column0'] = data10;
$table['row1']['column1'] = data11;
# the following line returns the headers of the specified dimension
$table->headers(0);
=> ('row0', 'row1')
First question: Is there something like this already done in CPAN? (before you ask I did search for a significant amount of time and I didn't find anything like it)
Second question: Here's my try, I know it's ugly and probably wrong. Any Perl expert out there care to review my code?
package table;
sub new {
my $class = shift;
my $dimensions = shift;
my $self = bless({}, $class);
$self->{dimensions} = $dimensions;
$self->{data} = [];
$self->{headers} = [];
return $self;
}
sub get_dimensions {
my $self = shift;
return $self->{dimensions};
}
# This function creates a header or return its index if it already existed.
# Headers are encoded as an array of hashes so that this is O(1) amortized.
sub header {
my $self = shift;
my $dimension = shift;
my $header = shift;
my $headers = $self->{headers}[$dimension];
if(!defined($headers)) {
$headers = $self->{headers}[$dimension] = {};
}
if(!defined($headers->{$header})) {
$headers->{$header} = scalar keys %$headers;
}
return $headers->{$header};
}
# This function returns the list of headers. Because the headers are
# stored as a hash (`header=>index`), I need to retrieve the keys
# and sort them by value.
sub get_headers {
my $self = shift;
my $dimension = shift;
my $headers = $self->{headers}[$dimension];
return [sort { $headers->{$a} cmp $headers->{$b} } keys %$headers];
}
# This last function stores/retrieves data from the table.
sub data {
my $self = shift;
my $data = $self->{data};
my $dimensions = $self->{dimensions};
for(my $i = 0; $i < $dimensions-1; ++$i) {
my $index = $self->header($i, shift);
if(!defined($data->[$index])) {
$data->[$index] = [];
}
$data = $data->[$index];
}
my $index = $self->header($dimensions-1, shift);
my $value = shift;
if(defined($value)) {
$data->[$index] = $value;
}
return $data->[$index];
}
You want a structure for an "N" dimensional table. I doubt there's a CPAN module that can do this because it's just not that common a situation.
The problem is that the data structure grows quite rapidly and so does the complexity.
You can store an N dimensional table in a single list by using a bit of mathematics to transform the N dimensional array into a single dimension. Let's say that X represents the X dimension and X' represents the length of that dimension. For a two dimensional table, you could get the value by doing:
X * Y` + Y.
For a 3 dimensional table X, Y, Z, the answer would be:
X * (Y' * Z') + Y * Z' + Z
For a 4 dimensional table W, X, Y, Z, the answer would be:
W * (X' * Y' * Z') + X * (Y' + Z') + Y * Z' + Z'
(I hope the math is right).
Therefore, I can imagine a structure like this for an N dimensional table. It would involve two different classes: One represents the dimensional information and the other represents the actual data (including all of the dimensions).
You can get the number of dimensions by looking at:
my $numOfDimensions = scalar @{$ntable->{DIMENSIONS}};
And, you can get the heading of dimension $x
by looking at:
my xDimensionHeading = $ntable->{DIMENSION}->[$x]->{HEADING};
And, the size of that dimension by looking at:
my xDimensionSize = $ntable->{DIMENSION}->[$x]->{SIZE};
Of course, you'd do this with true object oriented calls, and not bare references, but this gives you an idea how the structure would work.
Now, you need a way of transforming a list of integers that would represent a cell's location into a cell's location along a single dimensional array, and you'll have a way of getting and retrieving your data.
Would this be what you're looking for?
Close to it, but I actually resize the table dimensions a lot (I can't determine their size in advance) and if I understood your solution doesn't accomodate for this.
This adds a lot of complication...
We need to throw out the Size in the Dimension class. And, we can't use a single dimensional array to store our data.
I hope you don't change the table dimensionality.
We could do something like this:
The {DATA} list is a link of lists depending on the depth of the table. For example:
my data_3D = $table_3D->{DATA}->[$x]->[$y]->[$z];
my data_2D = $table_2D->{DATA}->[$x]->[$y];
The number of dimensions is scalar @{$table->{DIMENSION}}
.
The question is how do I access the data in a way that's dimensional neutral. I could require 2, 3, 4, or more dimensions, and I have to have someway of structuring my address to pull it out.
We could have some sort of looping mechanism. We get a list of coordinates in @coordinates
, and then look at each coordinate. The last will point to data. The rest will simply be another reference to another array.
my $data = pop @coordinates; #First Coordinate
$data = $table->[$data]; #Could be data if 1D table, could be a reference
foreach my $coordinate (@coordinates) {
die qq(Not enough coordinates) if ref $data ne 'ARRAY';
$data = $data->[$coordinate]; #Could be data, could be a reference
}
# Cell value is in $data
It also may be possible to build a list of coordinates, and then evaluating it. Again completely untested:
$coordinates = "[" . join ("]->[" => @coordinates . "]";
If there were three coordinates, this would be
$coordinates = "[$x]->[$y]->[$z]";
I'm not sure how a 1 dimensional array would work...
From there, you could build a statement and use eval
on it and get the data.
You'll have to have several methods.
This is more a brain dump, but it I think this might work. You don't have any set table dimensions and it might work for any N-dimensional table.