Hi, how do I remove punctuation? I already tried [:punct:],
but it does not work for all punctuation. It only removes the dot (.);
the other punctuation is still there. My task is to remove the paragraph breaks, remove punctuation, and change all text to lower case.
This is my text file, snuker.txt:
snuker berjaya menarik perhatian kbs.
19981230
Sam Chong"" kiri dan ooi Chin Kay memberi sumbangan besar kepada pembangunan snuker tanah air
dengan merangkul pingat' emas sukan asia ti'ga belas tahun sembi'lan belas sembilan puluh lapan membuka
lembaran baru snuker dan biliard tanah air apabila mereka kian disegani dan berjaya menukar tanggapan.
negatif masyarakat tempatan terhadap sukan itu
And this is my Perl script:
#!/usr/bin/perl
use utf8;
if(! open(INPUT, '< snuker.txt'))
{
die "cannot open input file: $!";
}
if(! open(OUTPUT, '> output.txt'))
{
die "cannot open output file: $!";
}
select OUTPUT;
while($lines = <INPUT>)
{
if($lines =~ s/[\s[:punct:]]+$/ /g)
{
print "$lines";
}
}
close INPUT;
close OUTPUT;
close STDOUT;
The output looks like this. The other punctuation is still there; only the dots (.)
are gone:
snuker berjaya menarik perhatian kbs 19981230 Sam Chong"" kiri dan ooi Chin Kay memberi sumbangan besar kepada pembangunan snuker tanah air dengan merangkul pingat' emas sukan asia ti'ga belas tahun sembi'lan belas sembilan puluh lapan membuka lembaran baru snuker dan biliard tanah air apabila mereka kian disegani dan berjaya menukar tanggapan negatif masyarakat tempatan terhadap sukan itu
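Your pattern ends with $, so it only matches whitespace and punctuation at the end of each line. That is why only the trailing dots disappear. Drop the anchor and remove punctuation everywhere in the line:
#!/usr/bin/perl
use strict;
use warnings;

# Three-argument open with lexical filehandles.
# 'test_file' is the input used here; replace it with snuker.txt.
open(my $in,  '<', 'test_file')  or die "cannot open input file: $!";
open(my $out, '>', 'output.txt') or die "cannot open output file: $!";

while (my $line = <$in>)
{
    $line =~ s/\n/ /g;            # join lines: each newline becomes a space
    $line =~ s/[[:punct:]]//g;    # no $ anchor, so all punctuation is removed
    print {$out} lc($line);       # lower-case the line and write it out
}

close $in;
close $out;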
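To see the effect of the $ anchor in isolation, here is a minimal sketch that runs both patterns on one sample line taken from snuker.txt. The helper name clean_text and the variable names are hypothetical, chosen only for this illustration; the two substitutions are the ones from the scripts above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original scripts):
# remove all ASCII punctuation, turn whitespace runs into one space,
# and lower-case the result.
sub clean_text {
    my ($text) = @_;
    $text =~ s/[[:punct:]]//g;   # unanchored: matches anywhere in the string
    $text =~ s/\s+/ /g;          # newline and repeated spaces become one space
    return lc $text;
}

# One line from snuker.txt, with a trailing dot added for the comparison.
my $sample = qq{Sam Chong"" kiri dan ooi Chin Kay.\n};

# Pattern from the question: the trailing $ restricts the match to the
# run of whitespace/punctuation at the very end of the string.
(my $anchored = $sample) =~ s/[\s[:punct:]]+$/ /g;

print "anchored:   [$anchored]\n";
print "unanchored: [", clean_text($sample), "]\n";
```

With the anchored pattern the quotes after "Chong" survive and only the final dot and newline are replaced; the unanchored version removes both the quotes and the dot.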