
Sending millions of data points to a Flask API


This question might be a little too subjective, but I am looking for an optimal way to send millions of data points to a Flask API.

My current approach is essentially as follows:

  • Send a list of data points as JSON objects, along with some information that applies to all of them, such as the person the data was collected on and the date it was collected.
  • This updates two tables: a Use table that records the person, date, etc., and a Data table that associates the data points with a given use. All of this happens in a single POST request to the Use endpoint.

I'm afraid that with this approach the request might time out when sending millions of data points.

I'm looking for a way to avoid this. Some options I have been considering are:

  • Sending an initial POST request to create the Use, then sending the data points in batches, either as a PATCH to the same endpoint or as a POST to a new data endpoint
  • Sending a CSV file in a POST request and then parsing the CSV on the server
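
Both options come down to handling the data in bounded pieces instead of one huge payload. As a rough sketch only (the function names, the batch size, and the `/uses/<use_id>/data` endpoint mentioned in the comments are all hypothetical, not part of any existing API), batching on the client and batch-wise CSV parsing on the server could look like this:

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, List


def iter_batches(points: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most batch_size data points."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(points)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def iter_csv_batches(stream, batch_size: int) -> Iterator[List[dict]]:
    """Read CSV rows lazily from a text stream and yield them in batches.

    Values stay strings; converting them to numbers is left to the caller.
    """
    reader = csv.DictReader(stream)
    yield from iter_batches(reader, batch_size)


if __name__ == "__main__":
    # Client side: each batch would become one request, for example a POST
    # to a hypothetical /uses/<use_id>/data endpoint (assumed, not a real API).
    points = ({"t": i, "value": i * 0.5} for i in range(10))
    for number, batch in enumerate(iter_batches(points, 4)):
        print("batch", number, "has", len(batch), "points")

    # Server side: parse an uploaded CSV a few rows at a time, so each
    # batch can be inserted into the Data table separately.
    uploaded = io.StringIO("t,value\n0,1.5\n1,2.5\n2,3.5\n")
    for batch in iter_csv_batches(uploaded, 2):
        print(batch)
```

Because both helpers are generators, neither side needs to hold all of the data points in memory at once, and a failed request only means resending one batch rather than the whole upload.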

I haven't been able to find any similar questions online, so I'm looking to see whether there is an industry standard or best practice for something like this.


Solution

  • Whether you receive the data as JSON or CSV, it is still a lot of data. You could shorten your JSON keys or change your JSON data types so that the payload takes up less space.

    It also depends on who uses the API. If it only serves your own website, you could split the data up in JavaScript and send several AJAX requests, which prevents timeouts on slower connections. If you want others to use your API, then you might want to have a look at the last answer on this question
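
To make the point about key size concrete, here is a small sketch. The field names and the key map are made up for illustration, and the column-oriented layout is my own addition, not something the answer proposes. It compares three ways of encoding the same data points:

```python
import json


def shorten_keys(points, key_map):
    """Rename the keys of every data point using key_map (long name -> short name)."""
    return [{key_map[k]: v for k, v in p.items()} for p in points]


def to_columnar(points):
    """Turn same-shaped dicts into {key: [values...]} so each key is sent only once."""
    if not points:
        return {}
    keys = list(points[0])
    return {k: [p[k] for p in points] for k in keys}


def from_columnar(columns):
    """Inverse of to_columnar: rebuild the list of per-point dicts."""
    keys = list(columns)
    return [dict(zip(keys, row)) for row in zip(*(columns[k] for k in keys))]


if __name__ == "__main__":
    # Hypothetical field names, used only to have something to measure.
    points = [{"timestamp": i, "measurement": i * 2} for i in range(1000)]
    short = shorten_keys(points, {"timestamp": "t", "measurement": "m"})
    columnar = to_columnar(points)

    print("original keys:", len(json.dumps(points)), "characters")
    print("short keys:   ", len(json.dumps(short)), "characters")
    print("columnar:     ", len(json.dumps(columnar)), "characters")
```

Either change shrinks the payload, but neither removes the need to split very large uploads, since the server still has to accept and process every request within its timeout.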