I need some senior advice here. I want to create an API using JS, but run all the ML functionality in Python. I don't want to give up the awesome JS libraries like GraphQL, but I don't want to sacrifice the Python performance either. I know I can use TensorFlow.js, but as I said, in terms of performance, Python is way better.
What I have in mind is something like deploying an ML model to the cloud using Python and then fetching the predictions from my JS API, or something like that.
Another idea is to build and train the model in Python, save it as .h5 or .json, and then load it directly with TensorFlow.js in my API.
##### LOCAL #####
inputs = Input(shape=(trainX.shape[1], trainX.shape[2], trainX.shape[3]))
...
Conv2D
Conv2D
Conv2D
...
model = Model(inputs=inputs, outputs=predictions)
model.compile(...)
model.fit(...)
model.save('my-model.h5') # I don't think I can save the weights in .json format directly from Python
##### API #####
https.get('SOMEURL', async (resp) => { // callback must be async to use await
  const model = await tf.loadLayersModel('file://path/to/my-model/model.json');
  const { data } = resp;
  return model.predict(data);
}).on('error', (err) => {
  console.log('Error: ' + err.message);
});
I don't really know if this could ever work, or whether there is a better way to do it (or if it is even possible). All ideas and advice are appreciated. Thank you.
You have pointed out the two methods you can use to perform predictions with your ML/DL model. I will list the steps needed for each, along with my own recommendations.
Method 1: TensorFlow.js
Here you would build and train the model using TensorFlow and Python. To use the model in your web application you would then need to convert it to the correct format using tfjs-converter. For example, you would get back a model.json file and a group1-shard1of1.bin weight file, which you can then use to make predictions on data from the client side. To improve performance you can quantize the model when converting it.
An added benefit of predicting on the client side is that the user's data never has to leave their device, which helps with regulations such as the General Data Protection Regulation (GDPR), which really complicates things. It also makes online-learning possible (training the model on new data it sees and improving it on the fly).

Method 2: Python REST API
Here you would have to use some sort of library for building REST APIs. I would recommend FastAPI, which is quite easy to get up and running. You would create routes that you make POST requests to; these requests receive the data from the client side, the model performs predictions on it, and the predictions are sent back in the response body. The API and the prediction code would have to be hosted somewhere for you to query them from the client side; you could use Heroku for this. This article goes over the entire process.
I don't want to get rid of the awesome JS libraries like GraphQL, but I don't want to sacrifice the Python performance. I know I can use TensorFlow.js, but as I said, in terms of performance, Python is way better.
One thing I would like to point out is that the prediction speed of the model is going to be the same regardless of whether you call it from Python or JavaScript, since all you are doing is making predictions with an already-trained model. The main way to improve it is quantization, which reduces the model size while also improving CPU and hardware-accelerator latency, with little degradation in model accuracy. Unless you are sending huge amounts of data to the endpoint from an area with slow internet speeds, the difference between the two methods is negligible.