I have the following code:
my $str = 'Uploaded 07-02▒05:14, Size 212.14▒MiB, ULed by someone';
print "Pre:".$str."\n";
my $str =~ s/^[a-zA-z0-9,]//g;
print "Post:".$str."\n";
My aim was to remove those special characters and spaces so that I could split the string for further processing.
With the regex above, I was trying to remove all characters except alphanumeric characters and comma. Unfortunately I am getting a blank line. I'm a beginner to regex and would like to know what is wrong with my expression.
You have three errors conspiring to break your program. If you had use strict and use warnings at the top of your code, as you should have, then Perl would have printed messages to alert you.
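As a rough illustration of those messages, the sketch below runs a cut-down version of the faulty lines with both pragmas enabled and captures whatever Perl reports. The sample string 'abc' and the @warnings array are my own choices for the demonstration, and the exact wording of the messages can differ between Perl versions.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Compile and run the faulty snippet at run time so that both its
# compile-time and run-time warnings reach the handler above.
eval q{
    my $str = 'abc';                  # hypothetical sample text
    my $str =~ s/^[a-zA-z0-9,]//g;    # second "my" declares a new, undefined $str
    1;
};

print "Perl warned: $_" for @warnings;
```

On the Perl versions I am aware of, this reports one warning about the second declaration masking the first and another about using an uninitialized value in the substitution.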
You have declared a second $str by writing my on the substitution line. The substitution therefore operates on the new variable, which is undef and is printed as an empty string.
You have the caret outside the character class, so it is acting as a start-of-string anchor instead of negating the class. It needs to be the first character inside the square brackets.
You have [a-zA-z0-9] as your character class. A-z includes the characters [, \, ], ^, _, and ` as well as the upper and lower case alphabet. You need [a-zA-Z0-9] instead.
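The last two points can be checked with a short sketch. The sample text 'abc, def!' and the variable names are only for illustration and are not from your program.

```perl
use strict;
use warnings;

# Caret outside the class: it anchors to the start of the string,
# so only a single leading character can ever be removed.
(my $anchored = 'abc, def!') =~ s/^[a-zA-Z0-9,]//g;
print "$anchored\n";    # prints: bc, def!

# Caret inside the class: it negates the class, so every character
# that is NOT alphanumeric or a comma is removed.
(my $negated = 'abc, def!') =~ s/[^a-zA-Z0-9,]//g;
print "$negated\n";     # prints: abc,def

# List the characters that the range A-z matches but that are not letters.
my @extras = grep { /[A-z]/ && !/[A-Za-z]/ } map { chr($_) } ord('A') .. ord('z');
print join(' ', @extras), "\n";    # prints: [ \ ] ^ _ `
```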
Here is some working code. Your text string contains the Unicode character U+2592 Medium Shade, so I've had to use utf8 to mark the code as being encoded in UTF-8, and use open to set STDOUT to accept UTF-8 encoding.
use utf8;
use strict;
use warnings;
use open qw/ :std :encoding(utf-8) /;
my $str = 'Uploaded 07-02▒05:14, Size 212.14▒MiB, ULed by someone';
print "Pre: $str\n";
$str =~ s/[^a-zA-Z0-9,]//g;
print "Post: $str\n";
Pre: Uploaded 07-02▒05:14, Size 212.14▒MiB, ULed by someone
Post: Uploaded07020514,Size21214MiB,ULedbysomeone
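Since you mentioned wanting to split the string afterwards, here is a minimal sketch of that next step. It assumes you want to split on the commas that the substitution keeps. The names $clean and @fields are my own, and the U+2592 character is written as an escape so that the snippet does not depend on the source file's encoding.

```perl
use strict;
use warnings;

# "\x{2592}" is U+2592 Medium Shade, the special character in the original text.
my $str = "Uploaded 07-02\x{2592}05:14, Size 212.14\x{2592}MiB, ULed by someone";

# Copy the string, then strip everything except alphanumerics and commas from the copy.
(my $clean = $str) =~ s/[^a-zA-Z0-9,]//g;

# Split the cleaned string on commas (assumed to be the field separator).
my @fields = split /,/, $clean;
print "$_\n" for @fields;
```

This prints the three fields Uploaded07020514, Size21214MiB, and ULedbysomeone on separate lines. Note that the substitution also removes the hyphen, colon, and full stop, so the date, time, and size lose their punctuation. If you need those intact, you could add the characters to the class.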