Extract Kaggle files using a Flask API in Python


I am creating an API with Python and Flask. Below is the Kaggle URL I am extracting data from: https://www.kaggle.com/datasets/alphiree/cardiovascular-diseases-risk-prediction-dataset?select=CVD_cleaned.csv

Below is the script I am using. I pass parameters through the API to extract the public Kaggle CSV file.

from flask import Flask, request, jsonify
import pandas as pd
from kaggle.api.kaggle_api_extended import KaggleApi

app = Flask(__name__)

api = KaggleApi()
api.authenticate()

def get_file_from_kaggle(dataset_value, select):
    dataset = str(dataset_value)
    file_name = str(select)
    path = 'Assessment/files/'
    api.dataset_download_files(dataset, file_name, path, unzip=True)
    csv_file = f'./{select}'
    return pd.read_csv(csv_file)

@app.route("/get-kaggle-data/<dataset_value>", methods=["GET"])
def get_kaggle_data(dataset_value):
    try:
        select = request.args.get("select")
        if select:
            df = get_file_from_kaggle(dataset_value, select)
            return df.to_json(orient='records')
        else:
            return jsonify({'error': 'Missing "select" parameter'}), 400
    except Exception as e:
        return jsonify({'error': str(e)}), 500

if __name__ == "__main__":
    app.run(debug=True)

Below is the URL I am testing: http://127.0.0.1:5000/get-kaggle-data/alphiree/cardiovascular-diseases-risk-prediction-dataset?select=CVD_cleaned.csv

But I get the following error: 404 Not Found. The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.


Solution

  • Normally, the / character is used to separate components of the request path. That means that a request for /one/two/three indicates a different resource than /one/two/four.

    In your code, the Kaggle dataset name includes a / character, but a plain <dataset_value> parameter only matches a single path component with no / in it. Flask therefore looks for a route that matches /get-kaggle-data/alphiree/..., and no such route exists.

    If you want a parameter to consume everything after a particular path component, even if it includes /, you need to tell Flask by using the path converter:

    @app.route("/get-kaggle-data/<path:dataset_value>", methods=["GET"])
    def get_kaggle_data(dataset_value):
    

    With the above code, a request for /get-kaggle-data/one/two/three will match this function and will result in dataset_value having the value one/two/three.
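
The difference between the two parameter types can be checked without starting a server. The following is a minimal sketch, not the asker's application: the /single and /multi routes and the probe helper are hypothetical names introduced only for illustration, and the Kaggle download is left out entirely so that nothing but the routing behaviour is exercised.

```python
from flask import Flask, request

app = Flask(__name__)


# Default (string) converter: matches one path component, never a "/".
# Hypothetical route, used only to reproduce the 404.
@app.route("/single/<dataset_value>", methods=["GET"])
def single_segment(dataset_value):
    return f"{dataset_value}|{request.args.get('select')}"


# path converter: like the default one, but it also accepts "/" characters.
# Hypothetical route, used only to show the fix.
@app.route("/multi/<path:dataset_value>", methods=["GET"])
def multi_segment(dataset_value):
    return f"{dataset_value}|{request.args.get('select')}"


def probe(url):
    """Send a GET request through Flask's test client; return (status, body text)."""
    response = app.test_client().get(url)
    return response.status_code, response.get_data(as_text=True)


if __name__ == "__main__":
    dataset = "alphiree/cardiovascular-diseases-risk-prediction-dataset"
    query = "?select=CVD_cleaned.csv"
    # The extra "/" in the dataset name prevents a match: status 404.
    print(probe(f"/single/{dataset}{query}")[0])
    # The path converter captures the whole dataset name: status 200.
    print(probe(f"/multi/{dataset}{query}"))
```

The query string (?select=...) is not part of the path that Flask matches against routes, so it is still read with request.args.get("select") in both cases.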