
Read from generated csv file in memory without writing to hard disk using `scan_csv`


NumPy supports reading from StringIO, so I can build a string representation of a CSV and feed it to NumPy.

How do I do the same in Polars (formerly pypolars)?

I learned that I can use read_csv() followed by lazy(), but I want to use scan_csv to get the benefits of lazy evaluation. How do I go about that?

EDIT: Updated question to specify scan_csv.


Solution

  • As of Polars 1.7.0, `scan_csv` accepts in-memory input directly, either raw bytes or a file-like object such as io.BytesIO.

    # bytes

    import polars as pl

    pl.scan_csv(b"a,b,c\n1,2,3").collect()
    
    shape: (1, 3)
    ┌─────┬─────┬─────┐
    │ a   ┆ b   ┆ c   │
    │ --- ┆ --- ┆ --- │
    │ i64 ┆ i64 ┆ i64 │
    ╞═════╪═════╪═════╡
    │ 1   ┆ 2   ┆ 3   │
    └─────┴─────┴─────┘
    

    # io.BytesIO

    import io

    import polars as pl
    
    f = io.BytesIO(b"a,b,c\n1,2,3")
    
    pl.scan_csv(f).collect()
    
    shape: (1, 3)
    ┌─────┬─────┬─────┐
    │ a   ┆ b   ┆ c   │
    │ --- ┆ --- ┆ --- │
    │ i64 ┆ i64 ┆ i64 │
    ╞═════╪═════╪═════╡
    │ 1   ┆ 2   ┆ 3   │
    └─────┴─────┴─────┘