Tags: python, html, selenium, gis, pokemon-go

How to scrape GIS coordinates from a map (e.g. Pokevision) using python and selenium?


I'd like to scrape Pokevision so that I can get the latitude and longitude of every Pokemon being displayed.

The URL of the webpage contains the latitude and longitude of the flag marker. For example, the following URL contains 39.95142302373031,-75.17986178398132: https://pokevision.com/#/@39.95142302373031,-75.17986178398132
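As a minimal sketch of reading those coordinates back out of such a URL (the helper name `coords_from_url` is hypothetical, and it assumes the fragment always has the `#/@lat,lon` shape shown above):

```python
from urllib.parse import urlparse


def coords_from_url(url):
    """Return (latitude, longitude) as floats from a '#/@lat,lon' URL fragment."""
    fragment = urlparse(url).fragment  # e.g. "/@39.95,-75.17"
    if not fragment.startswith("/@"):
        raise ValueError("no '/@lat,lon' fragment in {0!r}".format(url))
    # Split only on the first comma: latitude first, then longitude
    lat_text, lon_text = fragment[2:].split(",", 1)
    return float(lat_text), float(lon_text)


url = "https://pokevision.com/#/@39.95142302373031,-75.17986178398132"
print(coords_from_url(url))
```

Converting to floats at this point, rather than keeping the split strings, makes any later arithmetic on the coordinates straightforward.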

In the source code, the flag marker has the following div:

<div class="leaflet-marker-pane"><img class="leaflet-marker-icon leaflet-zoom-animated leaflet-clickable" src="/asset/image/leaflet//marker-icon.png" style="margin-left: -12px; margin-top: -41px; width: 25px; height: 41px; transform: translate(324px, 135px); z-index: 135;" tabindex="0"/>

I've also noticed that every pokemon being displayed has a div like the following:

<div class="leaflet-marker-icon-wrapper leaflet-zoom-animated leaflet-clickable" style="margin-left: 0px; margin-top: 0px; transform: translate(215px, 113px); z-index: 113;" tabindex="0"><img class="leaflet-marker-icon " src="//ugc.pokevision.com/images/pokemon/116.png" style="margin-left: 0px; margin-top: 0px; width: 48px; height: 48px;"/><span class="leaflet-marker-iconlabel home-map-label" style="margin-left: 10px; margin-top: 26px;">08:15</span></div>

I'm assuming the pixel position of each pokemon and of the flag marker can be found in the style attribute, specifically right after the text "transform: translate(".
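A small sketch of pulling those two pixel values out of a style string with a regular expression. The names `TRANSLATE_RE` and `pixel_position` are made up for illustration, and the pattern only handles the two-argument `translate(Xpx, Ypx)` form shown above; a browser that reports `translate3d(...)` instead would need a different pattern.

```python
import re

# Matches "translate(215px, 113px)" and captures the two numbers
TRANSLATE_RE = re.compile(
    r"translate\(\s*(-?\d+(?:\.\d+)?)px,\s*(-?\d+(?:\.\d+)?)px\s*\)"
)


def pixel_position(style):
    """Return the (x, y) pixel offsets from a style string, or None if absent."""
    match = TRANSLATE_RE.search(style)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


flag_style = ("margin-left: -12px; margin-top: -41px; width: 25px; height: 41px; "
              "transform: translate(324px, 135px); z-index: 135;")
pokemon_style = ("margin-left: 0px; margin-top: 0px; "
                 "transform: translate(215px, 113px); z-index: 113;")

print(pixel_position(flag_style))
print(pixel_position(pokemon_style))
```

With Selenium, the style string would come from `element.get_attribute("style")` on each marker element.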

Considering that we know both the pixel position and the latitude and longitude of the flag, as well as the pixel position of the pokemon, I believe I should be able to get the latitude and longitude of the pokemon.

For example, the flag marker is always at 324px, 135px, and we know that the GIS coordinates of the flag marker are 39.95142302373031,-75.17986178398132. We also know the pixel position of a pokemon (e.g. 215px, 113px). However, I can't seem to figure out how to get the latitude and longitude of the pokemon.
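One possible way to do that conversion is sketched below. It rests on assumptions the page itself does not confirm: that the map uses the standard Web Mercator projection with 256 px tiles (the default for Leaflet maps), that the `translate(...)` values of both markers are in the same pixel space, and that the current zoom level is known. The zoom value of 15 in the example is only a guess, and all function names are hypothetical.

```python
import math

TILE_SIZE = 256  # assumption: standard 256 px web map tiles


def latlon_to_world_px(lat, lon, zoom):
    """Project WGS84 degrees to Web Mercator pixel coordinates at a zoom level."""
    world = TILE_SIZE * 2 ** zoom
    x = (lon + 180.0) / 360.0 * world
    # asinh(tan(lat)) equals ln(tan(lat) + sec(lat)), the Mercator y term
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * world
    return x, y


def world_px_to_latlon(x, y, zoom):
    """Inverse of latlon_to_world_px."""
    world = TILE_SIZE * 2 ** zoom
    lon = x / world * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / world))))
    return lat, lon


def offset_to_latlon(ref_latlon, ref_px, target_px, zoom):
    """Estimate the lat/lon of target_px from a reference point on the same map."""
    ref_x, ref_y = latlon_to_world_px(ref_latlon[0], ref_latlon[1], zoom)
    # Shift the reference point by the on-screen pixel difference
    x = ref_x + (target_px[0] - ref_px[0])
    y = ref_y + (target_px[1] - ref_px[1])
    return world_px_to_latlon(x, y, zoom)


flag_latlon = (39.95142302373031, -75.17986178398132)
# zoom=15 is a guess; read the real zoom level from the map before trusting this
print(offset_to_latlon(flag_latlon, (324, 135), (215, 113), zoom=15))
```

Pixel y grows downward while latitude grows upward, so a pokemon drawn above and to the left of the flag comes out slightly north and west of it. The result is only as good as the zoom level fed in, since each zoom step doubles the number of pixels per degree.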


Solution

  • If you click the map, the URL updates with the coordinates of that point. You can therefore find all visible pokemon on the map, click each one, and parse the coordinates from the updated URL. Example code:

    from selenium import webdriver
    from selenium.common.exceptions import WebDriverException
    from selenium.webdriver.common.by import By

    # Maps the number in a pokemon's image file name to its name
    poke_names = {
        21: "Spearow",
        23: "Ekans",
        39: "Jigglypuff",
        98: "Krabby",
        129: "Pidgey",
    }

    driver = webdriver.Chrome()
    try:
        driver.get("https://pokevision.com/#/@39.95142302373031,-75.17986178398132")

        # Zoom out once
        zoom_css = "a.leaflet-control-zoom-out"
        driver.find_element(By.CSS_SELECTOR, zoom_css).click()

        # Find all pokemon in the source
        poke_css = "div.leaflet-marker-pane div.leaflet-marker-icon-wrapper"
        pokemon = driver.find_elements(By.CSS_SELECTOR, poke_css)
        print("Found {0} pokemon".format(len(pokemon)))

        # Filter for only the ones that are displayed on screen
        on_screen_pokemon = [p for p in pokemon if p.is_displayed()]
        print("There are {0} pokemon on screen".format(len(on_screen_pokemon)))

        # Click each pokemon, which moves the marker and thus updates the URL with
        # the coords of that pokemon
        coords = []
        for poke in on_screen_pokemon:
            try:
                poke.click()
                # Example URL: https://ugc.pokevision.com/images/pokemon/21.png
                img_url = poke.find_element(By.CSS_SELECTOR, "img").get_attribute("src")
                img_num = int(img_url.split('.png')[0].split('/')[-1])
            except WebDriverException:
                # Some are hidden by other elements, move on
                continue
            else:
                # Example
                # https://pokevision.com/#/@39.95142302373031,-75.17986178398132
                poke_coords = driver.current_url.split('#/@')[1].split(',')
                poke_name = poke_names.get(img_num, "Unknown")
                coords.append((poke_name, poke_coords))

        print("Found coordinates for {0} pokemon".format(len(coords)))
        for poke_name, poke_coords in coords:
            print("Found {0} pokemon at coordinates {1}".format(poke_name, poke_coords))

    finally:
        driver.quit()
    

    Output:

    (.venv35) ➜  stackoverflow python pokefinder.py
    Found 103 pokemon
    There are 85 pokemon on screen
    Found coordinates for 27 pokemon
    Found Unknown pokemon at coordinates ['39.95481970299595', '-75.18772602081299']
    Found Spearow pokemon at coordinates ['39.952878764070974', '-75.18424987792967']
    Found Spearow pokemon at coordinates ['39.95625069896077', '-75.18845558166504']
    Found Unknown pokemon at coordinates ['39.95685927437669', '-75.18216848373413']
    Found Unknown pokemon at coordinates ['39.95174378273782', '-75.17852067947388']
    Found Unknown pokemon at coordinates ['39.9509706687274', '-75.17377853393555']
    Found Unknown pokemon at coordinates ['39.95241819420643', '-75.17523765563965']
    Found Unknown pokemon at coordinates ['39.95409596949794', '-75.17422914505005']
    Found Unknown pokemon at coordinates ['39.95131610372689', '-75.17277002334595']
    Found Unknown pokemon at coordinates ['39.95276362189558', '-75.17313480377197']
    Found Unknown pokemon at coordinates ['39.95254978591276', '-75.17257690429688']
    Found Unknown pokemon at coordinates ['39.95319129185564', '-75.17094612121582']
    Found Unknown pokemon at coordinates ['39.95488549657055', '-75.17195463180542']
    Found Unknown pokemon at coordinates ['39.95488549657055', '-75.17096757888794']
    Found Unknown pokemon at coordinates ['39.9571224404468', '-75.17251253128052']
    Found Unknown pokemon at coordinates ['39.95633293919831', '-75.17088174819946']
    Found Spearow pokemon at coordinates ['39.94958891128449', '-75.1890778541565']
    Found Pidgey pokemon at coordinates ['39.94958891128449', '-75.18671751022339']
    Found Unknown pokemon at coordinates ['39.94769717428357', '-75.18306970596313']
    Found Unknown pokemon at coordinates ['39.948174225938324', '-75.18070936203003']
    Found Unknown pokemon at coordinates ['39.94458803200817', '-75.17658948898315']
    Found Unknown pokemon at coordinates ['39.94689111392826', '-75.174400806427']
    Found Unknown pokemon at coordinates ['39.948322275775425', '-75.1739501953125']
    Found Ekans pokemon at coordinates ['39.94749977262573', '-75.17088174819946']
    Found Unknown pokemon at coordinates ['39.94842097548884', '-75.17317771911621']
    Found Unknown pokemon at coordinates ['39.94934216594682', '-75.17180442810059']
    Found Unknown pokemon at coordinates ['39.948075525868894', '-75.17107486724852']
    

    This code is problematic for a couple of reasons, chief among them the overly broad and careless exception handling: catching WebDriverException silently skips every pokemon whose click fails, whatever the cause. You should, however, be able to adapt the concept into a more robust solution.
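As one hedged step toward that more robust version, the image-number parsing can be pulled out of the `try` block into a helper that returns `None` for anything unexpected, so that only the click itself needs exception handling. The helper name `pokemon_id_from_src` is hypothetical, and the pattern assumes image URLs end in `/<number>.png` as in the example above.

```python
import re

# Captures the number in a URL path ending in "/<number>.png"
IMG_ID_RE = re.compile(r"/(\d+)\.png(?:\?.*)?$")


def pokemon_id_from_src(src):
    """Return the numeric id from an image URL like '.../pokemon/21.png', else None."""
    # get_attribute("src") can return None, so fall back to an empty string
    match = IMG_ID_RE.search(src or "")
    return int(match.group(1)) if match else None


print(pokemon_id_from_src("https://ugc.pokevision.com/images/pokemon/21.png"))
print(pokemon_id_from_src("/asset/image/leaflet//marker-icon.png"))
```

With the parsing handled separately, the `except` clause around `click()` could be narrowed to the specific failures you expect, for example `ElementClickInterceptedException` or `ElementNotInteractableException` from `selenium.common.exceptions` in recent Selenium releases, so that unrelated errors are no longer swallowed.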