Search code examples
regexperltabular

Tabulation of data in perl


I have the input format as below, I want to create a tabular format for these data.

CELL    = "abc"
        "model"         "abc"
        "description"   "qwerty+keypad with slide"
**tech**
        size    (big \$l \$w m)
        termOrder         (x y z)
        namePrefix        "S"
        prop       (nil \$l l \$w w)
    **spec**
        term      (nil C \:1 B \:2 E \:3)
        termOrder         (x y z)
***********************************************************
 CELL    = "efg"
        "model"        "efg"
        "description"  "touchscreen+qwerty no slide"
**tech**
        size    (small \$l \$w m)
        termOrder         (x y z)
        namePrefix        "S"
        prop       (nil \$l l \$w w)
 **spec**
        term          (nil x \:1 y \:2 z \:3)
        termOrder         (x y z)

I want a table with names on left as headers and the data on the right to be its values.

.                                         tech                            spec   
CELL   model   description   size   termOrder   namePrefix  prop  termOrder Term       

These are the headers and I want the corresponding values below these headers. I tried using this code which I had used for another kind of tabulation:

my $pr      = "%-12s";  
my @headers = qw/............../;  
my %names;

while (<DATA>) {          

    chomp;          
    my $line = <DATA>;          
    %{$names{$_}} = split /=|\s+/, $line;  
} 

printf $pr x @headers . "\n", @headers;  

for (keys %names) {

    my @ds = ($_);          
    for my $k (@headers[1 .. $#headers]) {     

        my $v = $names{$_}->{$k};                 
        push @ds, $v ? $v : '-';
    }         
    printf $pr x @ds . "\n", @ds;
 } 

This doesn't yield a required result, so kindly help me out with this.


Solution

  • This is really not the easiest task and like always, there's more than one way to do it. Here's one. If there are any questions, feel free to ask because it's really too much code to explain everything.

    However, if it was my task, I would have chosen HTML as the output format to get rid of all these width calculations - also there are comfortable JS tools to sort those tables. If you really want to do things like this with text only, maybe "good old formats" are for you. ;)

    Code

    #!/usr/bin/env perl
    
    use strict;
    use warnings;
    use feature 'switch';
    use List::Util 'sum';
    
    # preparations
    my @blocks; # array for all data block
    my $block;  # the data block we're working with
    my $part;   # the data block part we're working with
    
    # read things and decide what to do
    for (<DATA>) {
        chomp;
    
        # start of a new data block, first part: main
        when (/CELL\s*=\s*"?([^"]+)"?/) {
            $part   = 'main';
            $block  = {
                $part   => {CELL => $1},
                tech    => {},
                spec    => {},
            };
            push @blocks, $block;
            next;
        }
    
        # start a new part
        when (/\*\*(tech|spec)\*\*/) {
            $part = $1;
            next;
        }
    
        # fill parts
        when (/"?(\w+)"?\s+"?([^"]+)"?/) {
            $block->{$part}{$1} = $2;
            next;
        }
    }
    
    # prepare output
    my %columns = (
        main => [
            {name => 'CELL',        length =>  5},
            {name => 'model',       length =>  5},
            {name => 'description', length => 30},
        ],
        tech => [
            {name => 'size',        length => 20},
            {name => 'termOrder',   length => 10},
            {name => 'namePrefix',  length => 10},
            {name => 'prop',        length => 20},
        ],
        spec => [
            {name => 'term',        length => 30},
            {name => 'termOrder',   length => 10},
        ],
    );
    
    # part legend
    foreach my $part (qw(main tech spec)) {
        my $width = sum map {$_->{length} + 2} @{$columns{$part}};
        print $part . ' ' x ($width - length $part);
    }
    print "\n";
    
    # column legend
    foreach my $part (qw(main tech spec)) {
        foreach my $column (@{$columns{$part}}) {
            my ($name, $length) = @{$column}{qw(name length)};
            print $name . ' ' x ($length - length($name) + 2);
        }
    }
    print "\n";
    
    # print each block in columns
    foreach my $block (@blocks) {
        foreach my $part (qw(main tech spec)) {
            foreach my $column (@{$columns{$part}}) {
                my $value = $block->{$part}{$column->{name}};
                print $value . ' ' x ($column->{length} - length($value) + 2);
            }
        }
        print "\n";
    }
    
    __DATA__
    CELL    = "abc"
            "model"         "abc"
            "description"   "qwerty+keypad with slide"
    **tech**
            size    (big \$l \$w m)
            termOrder         (x y z)
            namePrefix        "S"
            prop       (nil \$l l \$w w)
        **spec**
            term      (nil C \:1 B \:2 E \:3)
            termOrder         (x y z)
    ***********************************************************
     CELL    = "efg"
            "model"        "efg"
            "description"  "touchscreen+qwerty no slide"
    **tech**
            size    (small \$l \$w m)
            termOrder         (x y z)
            namePrefix        "S"
            prop       (nil \$l l \$w w)
     **spec**
            term          (nil x \:1 y \:2 z \:3)
            termOrder         (x y z)
    

    Output

    main                                          tech                                                                spec                                        
    CELL   model  description                     size                  termOrder   namePrefix  prop                  term                            termOrder   
    abc    abc    qwerty+keypad with slide        (big \$l \$w m)       (x y z)     S           (nil \$l l \$w w)     (nil C \:1 B \:2 E \:3)         (x y z)     
    efg    efg    touchscreen+qwerty no slide     (small \$l \$w m)     (x y z)     S           (nil \$l l \$w w)     (nil x \:1 y \:2 z \:3)         (x y z)