python, web-scraping, zipcode

List of all US ZIP Codes using uszipcode


I've been trying to fetch all US ZIP codes for a web scraping project for my company. I'd like to use the uszipcode library to do it automatically rather than scraping them manually from the website I'm interested in, but I can't figure out how.

This is my manual attempt:

from bs4 import BeautifulSoup
import requests

url = 'https://www.unitedstateszipcodes.org'
headers = {'User-Agent': 'Chrome/50.0.2661.102'}
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.text, 'html.parser')

hrefs = []
all_zipcodes = []

# Extract the link to each state's page, skipping anchors that have no href
for data in soup.find_all('div', class_='state-list'):
    for a in data.find_all('a'):
        href = a.get('href')
        if href is not None:
            hrefs.append(href)



def get_zipcode_list():
    """
    Visit each state's page and collect every ZIP code linked on it.
    :return: list of ZIP code strings.
    """
    for state in hrefs:
        state_url = url + state
        state_page = requests.get(state_url, headers=headers)
        states_soup = BeautifulSoup(state_page.text, 'html.parser')
        div = states_soup.find(class_='list-group')
        if div is None:
            continue  # no ZIP code list found on this page
        for a in div.find_all('a'):
            if str(a.string).isdigit():
                all_zipcodes.append(str(a.string))
    return all_zipcodes

This takes a lot of time, because it makes one request per state page. I would like to know how to do the same thing more efficiently using uszipcode.


Solution

  • If this is for one-time use and you don't need the extra metadata that uszipcode attaches to each ZIP code, you can download the list of ZIP codes from the official source and parse it yourself.

    The uszipcode library also ships a second, much larger database, which should have all the data you need.

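    from uszipcode import SearchEngine

    # simple_zipcode=False selects the larger, more detailed database
    # (this argument is from older uszipcode releases; newer ones may differ)
    zipSearch = SearchEngine(simple_zipcode=False)
    # An empty pattern matches every ZIP code; keep `returns` above the total count
    allZipCodes = zipSearch.by_pattern('', returns=200000)
    print(len(allZipCodes))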
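
For the download-and-parse route, here is a minimal sketch using only the standard library. It assumes the downloaded file is a CSV with a header row and a column named `zip`; both the column name and the helper `load_zip_codes` are made up for illustration, so adjust them to whatever the file you download actually contains.

```python
import csv
import io


def load_zip_codes(csv_file, column="zip"):
    """Return the sorted, unique 5-digit ZIP codes found in an open CSV file.

    `column` is an assumed header name, not something a real data file guarantees.
    """
    codes = set()
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        # Keep only purely numeric values of at most five digits
        if value.isdigit() and len(value) <= 5:
            # Pad to five digits in case leading zeros were dropped
            codes.add(value.zfill(5))
    return sorted(codes)


# A tiny in-memory sample standing in for the downloaded file
sample = io.StringIO(
    "zip,city\n"
    "00501,Holtsville\n"
    "90210,Beverly Hills\n"
    "501,Holtsville\n"
    "N/A,Unknown\n"
)
zip_codes = load_zip_codes(sample)
print(zip_codes)
```

With a real file you would pass an open file handle instead of the sample, for example `open("zip_codes.csv", newline="")`, where the file name is again just a placeholder.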