Get results to write to CSV using Perl


The following Perl script currently reads in an HTML file and strips out what I don't need. It also opens a CSV document, which is blank.

My problem is that I want to write the stripped-down results into the CSV's three fields, using Name as field 1, Lives in as field 2 and Commented as field 3.

The results are getting displayed in the cmd prompt but not in the CSV.

use warnings; 
use strict;  
use DBI;
use HTML::TreeBuilder;  
use Text::CSV;

open (FILE, 'file.htm'); 
open (F1, ">file.csv") || die "couldn't open the file!";


my $csv = Text::CSV->new ({ binary => 1, empty_is_undef => 1 }) 
    or die "Cannot use CSV: ".Text::CSV->error_diag (); 

open my $fh, "<", 'file.csv' or die "ERROR: $!"; 
$csv->column_names('field1', 'field2', 'field3'); 
while ( my $l = $csv->getline_hr($fh)) { 
    next if ($l->{'field1'} =~ /xxx/); 
    printf "Field1: %s Field2: %s Field3: %s\n", 
           $l->{'field1'}, $l->{'field2'}, $l->{'field3'};
} 
close $fh; 

my $tree = HTML::TreeBuilder->new_from_content( do { local $/; <FILE> } ); 

for ( $tree->look_down( 'class' => 'postbody' ) ) {
    my $location = $_->look_down
    ( 'class' => 'posthilit' )->as_trimmed_text;     

    my $comment  = $_->look_down( 'class' => 'content' )->as_trimmed_text;
    my $name     = $_->look_down( '_tag'  => 'h3' )->as_text;     

    $name =~ s/^Re:\s*//;
    $name =~ s/\s*$location\s*$//;      

    print "Name: $name\nLives in: $location\nCommented: $comment\n";
} 

An example of the html is:

<div class="postbody">
    <h3><a href="foo">Re: John Smith <span class="posthilit">England</span></a></h3>
    <div class="content">Is C# better than Visual Basic?</div>
</div>

Solution

  • You never actually write anything to the CSV file. It also isn't clear why you open the file for writing and then again for reading: opening it with ">" truncates it, so the read loop sees an empty file and its body never runs. After that you parse the HTML and print the fields you want, but only to the screen.

    Surely you will need to write to the CSV file somewhere if you want data to appear in it!

    Also, it's best to avoid bareword filehandles (FILE, F1) in favour of lexical ones (my $fh) if you want to pass them to Text::CSV.

    Maybe you need something like:

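    # eol => "\n" makes print() terminate each row; without it rows run together
    my $csv = Text::CSV->new({ binary => 1, eol => "\n" })
        or die "Cannot use CSV: " . Text::CSV->error_diag();
    # column_names only matters when reading with getline_hr; optional here
    $csv->column_names('field1', 'field2', 'field3');
    # Lexical filehandle, opened once for writing
    open my $fh, ">", "file.csv" or die "file.csv: $!";
    ...
    # As you handle the HTML
    $csv->print($fh, [$name, $location, $comment]);
    ...
    close $fh or die "file.csv: $!";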
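
Putting the pieces together, here is a minimal end-to-end sketch of that approach, not a definitive rewrite of the original script. It makes a few assumptions: the sample HTML (based on the example in the question) is inlined so the script runs on its own, whereas the real script would read it from file.htm; `extract_posts` is a hypothetical helper name introduced only for illustration; and posts missing any of the three elements are simply skipped. `eol => "\n"` is set because Text::CSV's `print` does not terminate rows unless an end-of-line string is configured.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use Text::CSV;

# Sample input based on the question's example. In the real script this
# string would come from file.htm instead.
my $html = <<'HTML';
<div class="postbody">
    <h3><a href="foo">Re: John Smith <span class="posthilit">England</span></a></h3>
    <div class="content">Is C# better than Visual Basic?</div>
</div>
HTML

# Hypothetical helper: returns one [name, location, comment] array ref
# for every element with class "postbody".
sub extract_posts {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($content);
    my @rows;

    for my $post ( $tree->look_down( class => 'postbody' ) ) {
        # Assumption: skip posts that lack any of the three elements.
        my $loc_el  = $post->look_down( class => 'posthilit' ) or next;
        my $body_el = $post->look_down( class => 'content' )   or next;
        my $head_el = $post->look_down( _tag  => 'h3' )        or next;

        my $location = $loc_el->as_trimmed_text;
        my $comment  = $body_el->as_trimmed_text;
        my $name     = $head_el->as_trimmed_text;

        $name =~ s/^Re:\s*//;                  # drop the "Re:" prefix
        $name =~ s/\s*\Q$location\E\s*$//;     # drop the trailing location

        push @rows, [ $name, $location, $comment ];
    }

    $tree->delete;    # release the parse tree
    return @rows;
}

my $csv = Text::CSV->new( { binary => 1, eol => "\n" } )
    or die "Cannot use CSV: " . Text::CSV->error_diag();

# Open the CSV once, for writing only, with a lexical filehandle.
open my $out, '>', 'file.csv' or die "file.csv: $!";
$csv->print( $out, $_ ) for extract_posts($html);
close $out or die "file.csv: $!";
```

The `\Q...\E` around `$location` is a small change from the question's regex: it stops any regex metacharacters in a location name from being interpreted as a pattern.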