I have access to a dataset that contains latitude and longitude pairs, but they are ill-formatted and not properly represented in the data set.
For example, a lat-long pair might look like this: 31333445, 105530865 when it should be 31.333445, -105.530865. Given the data set I am working with, I know that the min value for latitude is 31.0 and the max is 37.0, and that longitude ranges from -109 to -103.
If I were given a piece of paper and a pencil I could easily correct these myself, but the correction needs to happen on the fly when we receive input from a different program. We have no control over how the data is formatted until it hits our system; only then can we make changes and corrections. The lat-long pairs all arrive in the integer format shown above rather than as floats.
What would be the best way to go about manually correcting this error? I am using PHP for our processing system.
If they're always the same length, then just divide by 1000000 and make negative where needed:
echo $lat / 1000000;
echo -$lon / 1000000;
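As a rough sketch, the division approach could be wrapped in a small helper so the scaling and the sign are handled in one place. The function name `scaleCoordinate` and its `$negative` flag are made up for illustration, and it assumes every value carries exactly six decimal digits:

```php
<?php
// Hypothetical helper (not from the original answer): scales an
// integer-encoded coordinate by 1,000,000 and applies the sign.
// Assumes the input always has exactly six decimal digits.
function scaleCoordinate(int $raw, bool $negative = false): float
{
    $value = abs($raw) / 1000000;
    return $negative ? -$value : $value;
}

echo scaleCoordinate(31333445), "\n";        // 31.333445
echo scaleCoordinate(105530865, true), "\n"; // -105.530865
```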
If the lengths vary, take the leading digits that form the whole-number part (2 for latitude and 3 for longitude here), making it negative if needed, then insert a decimal point followed by the remaining digits:
echo substr($lat, 0, 2) . '.' . substr($lat, 2);
echo -substr($lon, 0, 3) . '.' . substr($lon, 3);
You can use floatval() on the results if needed.
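Since the question gives the expected ranges (31.0 to 37.0 for latitude, -109 to -103 for longitude), the substring approach could also check its own result. The following is only a sketch under those assumptions; `fixCoordinate` and its parameters are hypothetical names, not part of any library:

```php
<?php
// Hypothetical helper: inserts a decimal point after the first
// $intDigits digits, applies the sign, and returns null when the
// result falls outside the expected [$min, $max] range.
// $raw may be an int or a numeric string.
function fixCoordinate($raw, int $intDigits, float $min, float $max, bool $negative = false): ?float
{
    // Work on the digits only; the sign is decided by $negative.
    $digits = ltrim((string) $raw, '-');
    if (!ctype_digit($digits) || strlen($digits) < $intDigits) {
        return null; // not a usable digit string
    }

    $value = (float) (substr($digits, 0, $intDigits) . '.' . substr($digits, $intDigits));
    if ($negative) {
        $value = -$value;
    }

    return ($value >= $min && $value <= $max) ? $value : null;
}

$lat = fixCoordinate(31333445, 2, 31.0, 37.0);
$lon = fixCoordinate(105530865, 3, -109.0, -103.0, true);
var_dump($lat, $lon);
```

Unlike plain division, this also copes with inputs that have fewer decimal digits (for example 3133 becomes 31.33), and a null return flags values that should be logged or rejected rather than silently stored.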