
How do I get random data from the Socrata API?


How can I get a random data sample from the Socrata API? Specifically, I'm trying to query https://health.data.ny.gov/resource/s8d9-z734.json, but at the moment I'd prefer not to download the whole dataset, as it is very large.


Solution

  • For performance and caching reasons (imagine the impact of a bunch of clients calling $order=rand() over and over...), we don't have any sort of rand() or sampling functions, but you can create your own sample set with a little bit of work:

    1. Perform a $select=count(*) query to determine how large the set is
    2. Use rand() locally to come up with some offsets
    3. Use $limit and $offset in conjunction with a stable $order to pick out individual records. Ex: $order=facility_id&$limit=1&$offset=<some rand() number>

    Unfortunately, getting a sample of 1000 records this way takes 1001 calls to the API: one count query plus one request per record. Make sure you sign up for an app token...
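
    The three steps above could be sketched in Python roughly as follows. This is a minimal sketch, not an official client: the helper names (soda_get, parse_count, pick_offsets, record_params, sample_records) are made up for illustration, the facility_id column is taken from the example above and is assumed to exist in the dataset, and the token is assumed to be sent in the X-App-Token header. If facility_id is not unique, ties could make the order less stable than intended, so a unique column would be a safer choice for $order.

```python
import json
import random
import urllib.parse
import urllib.request

BASE_URL = "https://health.data.ny.gov/resource/s8d9-z734.json"


def soda_get(params, app_token=None, base_url=BASE_URL):
    """Issue one GET request and return the decoded JSON rows."""
    url = base_url + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url)
    if app_token:
        # Assumption: the app token is passed as an X-App-Token header.
        request.add_header("X-App-Token", app_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def parse_count(rows):
    """Step 1 helper: read the row count from a $select=count(*) response."""
    # The name of the count column may vary, so take the only value in the row.
    return int(next(iter(rows[0].values())))


def pick_offsets(total, sample_size, rng=random):
    """Step 2: choose distinct random offsets locally."""
    return rng.sample(range(total), min(sample_size, total))


def record_params(offset, order_by="facility_id"):
    """Step 3: query parameters that fetch exactly one record at an offset."""
    return {"$order": order_by, "$limit": 1, "$offset": offset}


def sample_records(sample_size, app_token=None, get=soda_get):
    """Return up to sample_size randomly chosen rows (one request per row)."""
    total = parse_count(get({"$select": "count(*)"}, app_token))
    rows = []
    for offset in pick_offsets(total, sample_size):
        rows.extend(get(record_params(offset), app_token))
    return rows
```

    Calling sample_records(1000, app_token="YOUR_TOKEN") would then issue the 1001 requests described above. The get parameter exists only so the network call can be swapped out, for example to try the logic without touching the API.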