
Unintentionally adding keys to hash while iterating


I'm iterating through a cache stored as a hash of hashes: latitude keys point to hashes whose keys are longitudes and whose values are cities. I'm trying to find approximate matches, that is, latitudes/longitudes close enough to a pair that has already been looked up and is in the hash.

I'm doing it like this:

    foreach my $lat_key ( keys %$lookup_cache_latlonhash ) {

        if ( ($lat > ($lat_key - .5)) && ($lat < ($lat_key + .5)) ) {

            foreach my $lon_key ( keys %{ $lookup_cache_latlonhash->{$lat_key}} ) {

                if ( ($lon > ($lon_key - .5)) && ($lon < ($lon_key + .5)) ) {

                    $country = $$lookup_cache_latlonhash{$lat_key}{$lon_key};
                    print "Approx match found: $lat_key $lon_key $country\n";
                    return $country;
                }
            }
        }
    }

The code does find lat/lon pairs within the range. However, for each latitude it loops over that is in range (the first nested condition), it adds that latitude to the hash (presumably via keys %{ $lookup_cache_latlonhash->{$lat_key} }), which is not intended and leaves useless, empty keys in the hash:

    $VAR1 = {
          '37.59' => {},
          '37.84' => {},
          '37.86' => {},
          '37.42' => {
                       '126.44' => 'South Korea/Jung-gu'
                     },
          '37.92' => {},
          '37.81' => {},
          '38.06' => {
                       '-122.53' => 'America/Novato'
                     },
          '37.8' => {},
          '37.99' => {},
          '37.61' => {},
           ...

What's the clever, or at least sane, way to do this lookup so that I'm not unintentionally adding keys to the hash just by looking them up?


Solution

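  • What you're experiencing is autovivification. It's a feature of Perl that makes working with nested structures a little easier.

    Whenever an undefined value is dereferenced in a context that may modify it, Perl automatically creates the hash or array reference it needs. Calling keys on a dereferenced value is one such context, and reading a nested element such as $hash->{a}{b} likewise creates the intermediate $hash->{a}.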

    use Data::Dumper;
    my $hash = {};
    if ($hash->{'a'}) {}      # no autovivification: you're only checking the value
    keys %{ $hash->{'b'} };   # autovivification: you're acting on the value (getting its keys), so $hash->{b} is created
    print Dumper($hash);
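
    The same thing happens on a plain read of a nested element: the intermediate level is created even though nothing is assigned. A minimal sketch using only core Perl (the %cache contents and the 40.00/-74.00 coordinates are made up for illustration):

    ```perl
    use strict;
    use warnings;

    # Illustrative data only; not the asker's real cache.
    my %cache = ( '37.42' => { '126.44' => 'South Korea/Jung-gu' } );

    # Reading through a missing first-level key creates that key as an empty hash.
    my $miss = $cache{'38.06'}{'-122.53'};
    print exists $cache{'38.06'} ? "vivified\n" : "untouched\n";   # prints "vivified"

    # Testing the outer level with exists first leaves the hash alone.
    my $guarded = exists $cache{'40.00'} ? $cache{'40.00'}{'-74.00'} : undef;
    print exists $cache{'40.00'} ? "vivified\n" : "untouched\n";   # prints "untouched"
    ```

    Checking exists on the outer key before going deeper is what keeps the second lookup from touching the hash.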
    

    There are a couple of ways to avoid this:

    1. Put no autovivification; (a pragma from the CPAN autovivification module, not part of core Perl) in the scope where you want to disable this behavior
    2. Check that the item you're accessing is defined or exists (and is of the type you need) before dereferencing it

    I recommend the second one because it helps build the habit of checking your code for correct data structuring and makes debugging much easier.

    foreach my $lat_key (keys %$lookup_cache_latlonhash) {
        if (($lat > ($lat_key - .5)) 
            && ($lat < ($lat_key + .5)) 
            && ref($lookup_cache_latlonhash->{$lat_key}) eq 'HASH')  #expecting a hash here - undefined or any non-hash value will skip the foreach
        {
            foreach my $lon_key (keys %{ $lookup_cache_latlonhash->{$lat_key}}) {
                if (($lon > ($lon_key - .5)) && ($lon < ($lon_key + .5))) {
                    $country = $$lookup_cache_latlonhash{$lat_key}{$lon_key};
                    print "Approx match found: $lat_key $lon_key $country\n";
                    return $country;
                }
            }
        }
    }
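
    For a version that can be run on its own, here is a rough sketch of the same check wrapped in a subroutine. The name find_approx, the $tol parameter and the sample data (including the undef entry for 37.59) are assumptions made for illustration, and abs() is used as a shorthand for the paired > and < comparisons:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: returns the first cached value within +/- $tol
    # degrees of ($lat, $lon), or undef when nothing is close enough.
    sub find_approx {
        my ($cache, $lat, $lon, $tol) = @_;
        $tol //= 0.5;

        for my $lat_key (keys %$cache) {
            next unless abs($lat - $lat_key) < $tol;

            # Copy the value and check its type before dereferencing, so an
            # undef or non-hash entry is skipped instead of being vivified.
            my $by_lon = $cache->{$lat_key};
            next unless ref($by_lon) eq 'HASH';

            for my $lon_key (keys %$by_lon) {
                return $by_lon->{$lon_key} if abs($lon - $lon_key) < $tol;
            }
        }
        return;
    }

    # Sample data for illustration only.
    my $cache = {
        '37.42' => { '126.44'  => 'South Korea/Jung-gu' },
        '38.06' => { '-122.53' => 'America/Novato' },
        '37.59' => undef,    # a latitude stored without any longitudes
    };

    my $hit = find_approx($cache, 37.5, 126.3);
    print defined $hit ? "$hit\n" : "no match\n";
    print defined $cache->{'37.59'} ? "vivified\n" : "still undef\n";
    ```

    Because the inner hash is copied into $by_lon and tested with ref() first, the 37.59 entry stays undef after the search instead of turning into an empty hash.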