Separating an output with a Tab / Space


I am working with three text files. The first one is the main input (Input 1); each line holds a word and its word type (Noun, Verb, etc.), separated by a tab.

Input 1

John    N
goes    V
to      P
school  N
.       S
Mary    N
comes   V
from    P
home    N
.       S

The second and third input text files look like this:

Input 2

John
Mary

Input 3

to
from

My objective is to compare and match the second and third text files with the main input and get an output like this:

Expected output:

John    N   N
goes    V
to      P   P
school  N
.       S
Mary    N   N
comes   V
from    P   P
home    N
.       S

All three columns should be separated by a tab or a space. However, I am getting output like this:

John N  
N
goes    
V
to P    
P
school  
N
.   
S
Mary N  
N
comes   
V
from P  
P
home    
N
.   
S

I believe this happens because I read each line of the first text file into an array and print its elements one by one. Please suggest a way to get the desired output.

The code I used is below:

#!/usr/bin/perl

use warnings;
use strict;

my @file = ('Input 1.txt');

open my $word_fh, '<', 'Input 2.txt' or die $!;
open my $word2_fh, '<', 'Input 3.txt' or die $!;

my %words_to_match = map {chomp $_; $_ => 0} <$word_fh>;
my %words_to_match2 = map {chomp $_; $_ => 0} <$word2_fh>;

close $word_fh;
close $word2_fh;

check($_) for @file;

sub check {
    my $file = shift;

    open my $fh, '<', $file or die $!;

    while (<$fh>){
        chomp;
        my @words_in_line = split;

        for my $word (@words_in_line){
            $word =~ s/[(\.,;:!)]//g;
            $word .= '  N' if exists $words_to_match{$word};
            $word .= '  P' if exists $words_to_match2{$word};

            print "$word\n";
        }
        print "\n";
    }
}

Again, the objective is to produce output with all three columns separated by a tab or a space.


Solution

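  • You are outputting unnecessary newlines, and you are constructing your new output line incorrectly. Your inner loop splits each line into its fields and prints every field followed by "\n", so the word and its type end up on separate lines, and the extra print "\n" then adds a blank line. There is also no need to search your hashes for the "type" column: only the word has to be looked up. Keep the whole input line, append the tag to it, and print once per line. The following produces the desired output: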

    use warnings;
    use strict;
    
    my @file = ('Input 1.txt');
    
    open my $word_fh,  '<', 'Input 2.txt' or die $!;
    open my $word2_fh, '<', 'Input 3.txt' or die $!;
    
    my %words_to_match  = map { chomp $_; $_ => 0 } <$word_fh>;
    my %words_to_match2 = map { chomp $_; $_ => 0 } <$word2_fh>;
    
    close $word_fh;
    close $word2_fh;
    
    check($_) for @file;
    
    sub check {
        my $file = shift;
        open my $fh, '<', $file or die $!;
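        while (<$fh>) {
            chomp;
            # Only the first field (the word) is needed for the lookups
            my ($word) = split;
            # Start from the whole input line so the word and its type stay together
            my $line = $_;
            $line .= '  N' if exists $words_to_match{$word};
            $line .= '  P' if exists $words_to_match2{$word};
            # Print exactly once per input line
            print "$line\n";
        }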
    }
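
For experimenting without the three files on disk, the same idea can be sketched with in-memory data. This is only an illustration under a few assumptions: tag_lines is a hypothetical helper (not part of the code above), the sample lines are copied from the question, and the columns are joined with a tab rather than two spaces. Blank lines are passed through unchanged instead of being looked up.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a reference to the input lines plus any number
# of [ TAG => \@words ] pairs, and returns the tagged lines instead of
# printing them.
sub tag_lines {
    my ($lines, @lists) = @_;

    # Build one lookup hash per word list, paired with its tag
    my @lookups;
    for my $list (@lists) {
        my ($tag, $words) = @$list;
        my %seen = map { $_ => 1 } @$words;
        push @lookups, [ $tag, \%seen ];
    }

    my @out;
    for my $line (@$lines) {
        # Only the first field (the word) is used for the lookups
        my ($word) = split ' ', $line;
        my @cols = ($line);
        if (defined $word) {    # undefined for blank lines
            for my $lookup (@lookups) {
                my ($tag, $seen) = @$lookup;
                push @cols, $tag if exists $seen->{$word};
            }
        }
        # Tab-separated columns (an assumption; the code above uses two spaces)
        push @out, join "\t", @cols;
    }
    return @out;
}

# Sample data copied from the question
my @input1 = ("John\tN", "goes\tV", "to\tP", "school\tN", ".\tS");
my @input2 = ('John', 'Mary');
my @input3 = ('to', 'from');

print "$_\n" for tag_lines(\@input1, [ N => \@input2 ], [ P => \@input3 ]);
```

Returning the lines rather than printing inside the helper keeps the matching logic separate from file handling, so it can be fed from the real files later by reading them into arrays with chomp.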