I retrieve data from the net containing real geodesic expressions, by which I mean degrees, minutes and seconds written with the Unicode symbols U+00B0, U+2032 and U+2033 (named Degree, Prime and Double Prime). Example:
my $Lat = "48° 25′ 43″ N";
My objective is to convert such an expression first to decimal degrees and then to radians, for use in a Perl module I am writing that implements the Vincenty inverse formula to calculate ellipsoidal great-circle distances. All my code objectives have been met with pseudo geodesics such as "48:25:43 N", but that is hand-entered test data, not real-world data. I am struggling to craft a regular expression that splits the real data the way I currently split the pseudo data:
my ($deg, $min, $sec, $dir) = split(/[\s:]+/, $_[0], 4); # this works
I have tried many regular expressions including
/[°′″\s]+/ and
/[\x{00B0}\x{2032}\x{2033}\s]+/
all with dismal results, such as $deg = "48?", $min = "?", $sec = "25′43″ N" and $dir = undef. I have enclosed the code in a block { } and included use utf8; and use feature 'unicode_strings'; within that scope, all to no avail.
Input data example:
my $Lat = "48° 25′ 43″ N";
Expected output:
$deg = 48, $min = 25, $sec = 43 and $dir = "N"
You may try this regex to split the string:
[^\dNSEW.]+
Sample source:
my $str = '48° 25′ 43″ N';
my $regex = qr/[^\dNSEW.]+/p;
my ($deg, $min, $sec, $dir) = split $regex, $str;
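Since the stated goal is decimal degrees and then radians, here is a minimal sketch of how the split above could feed that conversion. The helper names dms_to_degrees and degrees_to_radians are hypothetical, and the sign convention (south and west negative) is an assumption; adjust it to whatever your Vincenty code expects. Because the pattern only names ASCII characters (digits, N, S, E, W and the dot), it should behave the same whether the input is a decoded character string or still raw UTF-8 bytes, which may be why it avoids the "?" results described in the question.

```perl
use strict;
use warnings;
use utf8;    # this source file contains literal °, ′ and ″ characters

# Hypothetical helper: "48° 25′ 43″ N" -> signed decimal degrees.
sub dms_to_degrees {
    my ($dms) = @_;

    # Split on every run of characters that is not a digit, a dot,
    # or one of the hemisphere letters N, S, E, W.
    my ($deg, $min, $sec, $dir) = split /[^\dNSEW.]+/, $dms;

    my $degrees = $deg + $min / 60 + $sec / 3600;

    # Assumed convention: south and west are negative.
    return $dir =~ /^[SW]$/ ? -$degrees : $degrees;
}

# Hypothetical helper: decimal degrees -> radians.
sub degrees_to_radians {
    my ($degrees) = @_;
    my $pi = 4 * atan2(1, 1);
    return $degrees * $pi / 180;
}

my $lat_deg = dms_to_degrees("48° 25′ 43″ N");
my $lat_rad = degrees_to_radians($lat_deg);
printf "%.6f degrees = %.6f radians\n", $lat_deg, $lat_rad;
```

The sketch does no validation: a string with missing fields would leave some of $deg, $min, $sec or $dir undefined, so real-world input probably deserves a check before the arithmetic.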