I have a string referred to as ($date) that I am trying to split into two parts using Perl.
$date= (June 25, 2018–July 1, 2018)
From what I have read, the proper way to split this string into the two separate dates is to use the Perl split() function with the hyphen as a delimiter, store the result in a new array, and then assign the array elements to my StartDate/EndDate variables like this:
@dates = split(/-/, $date);
$StartDate = @dates[0];
$EndDate = @dates[1];
print "Effective Date: ($date)\n";
print "($StartDate)";
print "\n";
print "($EndDate)";
However, this does not work as I expected.
Please keep in mind that the code above is only a small section of the source code.
Current Output (Incorrect)
Effective Date: (June 25, 2018–July 1, 2018)
(June 25, 2018–July 1, 2018)
()
Expected Output (Correct)
Effective Date: (June 25, 2018–July 1, 2018)
(June 25, 2018)
(July 1, 2018)
Looking for any advice on how to achieve my goal.
The problem here is that you're trying to split on - (U+002D HYPHEN-MINUS), but your string contains – (U+2013 EN DASH).
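When two characters look this similar, it can help to print the code points rather than trust the rendering. The following is a minimal sketch, assuming the date string is a literal in a UTF-8 source file; the variable name @non_ascii is only illustrative.

```perl
use strict;
use warnings;
use utf8;    # the string literal below contains a non-ASCII character

my $date = 'June 25, 2018–July 1, 2018';

# Capture every character outside the ASCII range and format its code point.
my @non_ascii = map { sprintf('U+%04X', ord($_)) } $date =~ /([^\x00-\x7F])/g;

binmode STDOUT, ':encoding(UTF-8)';
print "Non-ASCII characters: @non_ascii\n";    # prints "Non-ASCII characters: U+2013"
```

A plain hyphen would produce an empty list here, since U+002D is within the ASCII range.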
There are a couple of ways you can specify this character in a regex:
use utf8;
...
my ($StartDate, $EndDate) = split /–/, $date;
use utf8 tells perl that your source code is in UTF-8, so you can use Unicode characters literally.
my ($StartDate, $EndDate) = split /\x{2013}/, $date;
Or you can use a hex character code.
my ($StartDate, $EndDate) = split /\N{EN DASH}/, $date;
Or a named character reference (on perls older than 5.16 this also needs use charnames ':full';).
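All three patterns match the character U+2013, so they only work if $date holds decoded characters rather than raw UTF-8 bytes. Whether that applies depends on where the string comes from, which the question does not show. The sketch below assumes the data arrives as undecoded bytes (simulated here with escape sequences) and uses the core Encode module to decode it first.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Simulated raw input: the UTF-8 bytes of the date range, as they would look
# when read from a file or STDIN without an encoding layer (an assumption).
my $bytes = "June 25, 2018\xE2\x80\x93July 1, 2018";

# On raw bytes the en dash is three separate bytes, so the pattern never matches.
my @undecoded = split /\x{2013}/, $bytes;

# After decoding, the three bytes become the single character U+2013.
my $chars   = decode('UTF-8', $bytes);
my @decoded = split /\x{2013}/, $chars;

printf "undecoded: %d part(s)\n", scalar @undecoded;    # 1 part
printf "decoded:   %d part(s)\n", scalar @decoded;      # 2 parts
```

If the string is read from a file, opening the handle with an :encoding(UTF-8) layer achieves the same thing without an explicit decode call.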
If you don't necessarily want to split on EN DASH but any dash-like character, you can use a character class based on the "Dash" property:
my ($StartDate, $EndDate) = split /\p{Dash}/, $date;
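Putting the pieces together, here is a complete sketch that reproduces the expected output from the question. The helper name split_date_range is hypothetical, and the optional whitespace around the dash plus the limit of 2 are additions that are not part of the original one-liner.

```perl
use strict;
use warnings;
use utf8;    # the string literal below contains an en dash

# Hypothetical helper: split a "start–end" range on the first dash-like
# character, discarding any whitespace around it.
sub split_date_range {
    my ($range) = @_;
    my ($start, $end) = split /\s*\p{Dash}\s*/, $range, 2;
    return ($start, $end);
}

my $date = 'June 25, 2018–July 1, 2018';
my ($StartDate, $EndDate) = split_date_range($date);

binmode STDOUT, ':encoding(UTF-8)';    # encode the en dash on output
print "Effective Date: ($date)\n";
print "($StartDate)\n";
print "($EndDate)\n";
```

Because the Dash property also covers the ordinary hyphen, this approach would split in the wrong place if the dates themselves contained hyphens (for example 2018-06-25). In that case, matching the en dash specifically is the safer choice.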
Note that @dates[0] will trigger a warning (if use warnings is enabled, which it should be) because a single element of an array @foo is spelled $foo[0] in Perl. The syntax @array[ LIST ] is used for array slices, i.e. extracting multiple elements by their indices.
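As a short illustration of the difference, using an array that already holds the two dates (the names $first and $second are only for the example):

```perl
use strict;
use warnings;

my @dates = ('June 25, 2018', 'July 1, 2018');

# Single element: scalar sigil, one index.
my $StartDate = $dates[0];
my $EndDate   = $dates[1];

# Slice: array sigil, a list of indices, returns a list.
my ($first, $second) = @dates[0, 1];

print "($StartDate) ($EndDate)\n";
print "($first) ($second)\n";
```

Both print statements output the same pair of dates. The slice form is the one to use when several elements are needed at once.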