Retrieving CSV file from URL


I’m trying to read a CSV file from a URL using PyScript (to eventually load into pandas, not shown in my example). Following the example here, https://docs.pyscript.net/latest/guides/http-requests.html, I’m able to use pyfetch to retrieve the example payload, which appears to be JSON. However, I can’t seem to use it to retrieve a CSV file (or any other non-JSON payload).

An example is below. The first download works; the second does not.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width,initial-scale=1" />
    <link rel="stylesheet" href="https://pyscript.net/latest/pyscript.css" />
    <script defer src="https://pyscript.net/latest/pyscript.js"></script>
  </head>
  <body>
    <py-script>
        import asyncio
        import json
        
        from pyodide.http import pyfetch

        
        async def main():
            # This works
            url = "https://jsonplaceholder.typicode.com/posts/2"
            response = await pyfetch(url)
            print(f"status:{response.status}")

            # This does not work
            url = "https://people.math.sc.edu/Burkardt/datasets/csv/turtles.csv"
            response = await pyfetch(url)
            print(f"status:{response.status}")


        asyncio.ensure_future(main())
    </py-script>
  </body>
</html>

I should note that I did find a couple of PyScript tutorials on CSV files specifically; however, they appear to use deprecated approaches.


Solution

  • PyScript has a built-in feature, [[fetch]] configurations, for loading remote files into the in-browser filesystem. These remove the need to rely on open_url directly.

    A fetch configuration can take several keys. In your case, the from key fetches the content from a given URL and places it in a folder in the in-browser filesystem (which Python has access to), in a file named after the last part of that URL's path (in this case, turtles.csv).

    You can also use the to_file key to specify a different filename, or the to_folder key to place that file within a different folder in the in-browser filesystem.

    <py-config>
        packages = [
            "numpy",
            "pandas",
            "jinja2"
        ]
        [[fetch]]
        from = 'https://raw.githubusercontent.com/fomightez/pyscript_test/main/turtles.csv'
    </py-config>
    
    <py-script>
        import pandas as pd
    
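        # turtles.csv was placed in the in-browser filesystem by the [[fetch]] entry above
        df = pd.read_csv('turtles.csv')
        # Assumes the page contains elements with the ids "pandas-output" and "pandas-output-inner"
        Element("pandas-output").element.style.display = "block"
        # append must be the boolean False; the string "False" is truthy
        display(df.head().style.format(precision=2), target="pandas-output-inner", append=False)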
    </py-script>
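
To make the naming rule concrete, here is a rough sketch in plain Python of how the target path follows from the from, to_file and to_folder keys as described above. The helper name fetch_target_path is hypothetical and is not part of PyScript; it only mirrors the behaviour described in this answer (default filename taken from the last segment of the URL's path), and PyScript's own implementation may differ in details.

```python
from posixpath import basename, join
from urllib.parse import urlparse


def fetch_target_path(url, to_folder="", to_file=None):
    """Hypothetical helper: estimate where a [[fetch]] entry places its file."""
    # By default the filename is the last segment of the URL's path.
    filename = to_file or basename(urlparse(url).path)
    # An empty to_folder is treated here as "the folder Python runs in".
    return join(to_folder, filename) if to_folder else filename


url = "https://raw.githubusercontent.com/fomightez/pyscript_test/main/turtles.csv"
print(fetch_target_path(url))                        # turtles.csv
print(fetch_target_path(url, to_file="data.csv"))    # data.csv
print(fetch_target_path(url, to_folder="datasets"))  # datasets/turtles.csv
```

Whatever path results is the one to pass to pd.read_csv, so with to_folder = "datasets" the read would presumably become pd.read_csv('datasets/turtles.csv').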