
Socrata: Find all datasets from a domain


I would like to get a list of all data available at the Socrata website for Mesa, AZ using an API from Python. Based on the web search on the site, there are 1176 total results and 305 datasets.

I have tried using the Socrata Open Data Network as described in this answer. However, that returns only 41 results, not the 1176 expected:

https://api.us.socrata.com/api/catalog/v1?domains=data.mesaaz.gov&offset=0

A web search of opendatanetwork gives the same result.

I have also tried using the datasets function in the sodapy Python library. That returns the same 41 results. When I look under the hood of sodapy, it appears to be making an API request specific to Mesa:

https://data.mesaaz.gov/api/catalog/v1?domains=data.mesaaz.gov

In fact, with the domains filter removed, the api.us.socrata.com and data.mesaaz.gov queries both return the same result, which includes datasets outside Mesa. The data.mesaaz.gov hostname appears to be misleading: the request just searches the Open Data Network, not data.mesaaz.gov specifically.
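One way to check whether the domains filter is actually being honored is to tally the returned results by the domain each one reports. This is only a minimal sketch: it assumes each catalog result carries its domain under a "metadata" object with a "domain" key (an assumption about the response shape, not something confirmed above), and count_by_domain is a hypothetical helper name.

```python
from collections import Counter


def count_by_domain(results):
    """Tally catalog results by the domain reported in their metadata.

    Assumes each result looks like {"metadata": {"domain": "..."}, ...};
    results without that field are counted under "unknown".
    """
    return Counter(
        item.get("metadata", {}).get("domain", "unknown") for item in results
    )
```

Passing in the "results" list from either query should show at a glance whether anything outside data.mesaaz.gov came back.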

I do not see anything in the Socrata API documentation for querying the available datasets. There appear to be tools only for querying individual datasets.


Solution

  • mesaaz.gov gets its data from two domains: data.mesaaz.gov and citydata.mesaaz.gov.

    This returns 305 results:
    https://api.us.socrata.com/api/catalog/v1?only=dataset&domains=data.mesaaz.gov,citydata.mesaaz.gov

    Get the list of domains (this is how I found citydata.mesaaz.gov):
    https://api.us.socrata.com/api/catalog/v1/domains
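Since the question asks for a Python approach, the two requests above could be wrapped roughly as follows. This is a sketch under stated assumptions, not a definitive client: it uses only the standard library, it assumes the catalog accepts limit and offset parameters and reports a total under "resultSetSize", and it assumes each entry of the /domains listing has a "domain" key. The helper names (build_catalog_url, fetch_json, list_datasets, find_domains) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CATALOG_URL = "https://api.us.socrata.com/api/catalog/v1"


def build_catalog_url(domains, only="dataset", limit=100, offset=0):
    """Build a catalog query restricted to the given domains."""
    params = {
        "domains": ",".join(domains),
        "only": only,
        "limit": limit,
        "offset": offset,
    }
    # Keep commas literal so the URL matches the comma-separated form above.
    return CATALOG_URL + "?" + urlencode(params, safe=",")


def fetch_json(url):
    """Fetch a URL and decode its JSON body (performs a network request)."""
    with urlopen(url) as response:
        return json.load(response)


def list_datasets(domains, fetch=fetch_json, page_size=100):
    """Page through the catalog until every matching result is collected."""
    results = []
    while True:
        url = build_catalog_url(domains, limit=page_size, offset=len(results))
        page = fetch(url)
        batch = page.get("results", [])
        results.extend(batch)
        total = page.get("resultSetSize")  # assumed field name for the total
        # Stop on an empty page, or once the reported total has been reached.
        if not batch or (total is not None and len(results) >= total):
            return results


def find_domains(keyword, fetch=fetch_json):
    """Return domain names from the /domains listing that contain keyword."""
    listing = fetch(CATALOG_URL + "/domains")
    # Assumption: each entry looks like {"domain": "...", "count": N}.
    return [
        entry["domain"]
        for entry in listing.get("results", [])
        if keyword in entry.get("domain", "")
    ]
```

Calling list_datasets(["data.mesaaz.gov", "citydata.mesaaz.gov"]) should correspond to the 305-result query above, and find_domains("mesaaz") is one way to discover the second domain. The fetch argument is injectable so the paging logic can be exercised without touching the network.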