I have the following dataframe in Structured Streaming:
TimeStamp | Room | Temperature
00:01:29  |  1   |  55
00:01:34  |  2   |  51
00:01:36  |  1   |  56
00:02:03  |  2   |  49
I am trying to detect when the temperature falls below a certain threshold (50 in this case). I have that part of the query working. Now, I need to pass this information to an API endpoint via a POST call to '/api/lowTemperature/' with the timestamp and the temperature in the body of the request. So, in the above case, I need to send:
POST /api/lowTemperature/2
BODY: { "TimeStamp": "00:02:03", "Temperature": "49" }
Any idea how I can achieve this using PySpark?
One way I thought of doing this was with a custom streaming sink, but I can't seem to find any documentation on implementing one in Python.
Good news: Python support has recently been added for the ForeachWriter. I created one in Python for REST and Azure Event Grid and it's rather straightforward. The (basic) documentation can be found here: https://docs.databricks.com/spark/latest/structured-streaming/foreach.html#using-python
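As a rough sketch of that approach: in PySpark you pass an object with `open`/`process`/`close` methods to `writeStream.foreach(...)`, and fire the POST inside `process`. The endpoint URL, column names, and threshold below are assumptions taken from the question, not a tested production setup:

```python
# Sketch of a foreach sink that POSTs low-temperature rows to a REST
# endpoint. Assumes the stream has TimeStamp, Room and Temperature
# columns, and that the API base URL below is correct for your service.
import json
import urllib.request

API_BASE = "http://localhost:8080/api/lowTemperature"  # assumed endpoint


def build_request(row):
    """Build (url, json_body) for one low-temperature row."""
    url = "%s/%s" % (API_BASE, row["Room"])
    body = json.dumps({"TimeStamp": row["TimeStamp"],
                       "Temperature": str(row["Temperature"])})
    return url, body


class RestWriter:
    """ForeachWriter-style handler: called per partition of each batch."""

    def open(self, partition_id, epoch_id):
        # No per-partition setup (e.g. connection pool) needed here.
        return True

    def process(self, row):
        url, body = build_request(row)
        req = urllib.request.Request(
            url,
            data=body.encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST")
        urllib.request.urlopen(req)  # consider retries/timeouts in practice

    def close(self, error):
        # Nothing to release; error (if any) is surfaced by Spark.
        pass


# Wiring it into the already-filtered stream (lowTempDF is assumed to be
# the DataFrame where Temperature < 50):
# query = lowTempDF.writeStream.foreach(RestWriter()).start()
```

One POST is issued per matching row; if the endpoint can't handle that rate, batching the calls in `foreachBatch` instead is worth considering.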