Tags: azure, dask, fastparquet, intake

Moving data from a database to Azure blob storage


I'm able to use dask.dataframe.read_sql_table to read the data, e.g.:

    df = dd.read_sql_table(table='TABLE', uri=uri, index_col='field', npartitions=N)

What would be the next (best) steps to saving it as a parquet file in Azure blob storage?

From my limited research, there appear to be a couple of options.


Solution

  • $ pip install adlfs

    # adlfs registers the 'abfs://' protocol with fsspec, so dask can write directly to Blob Storage.
    dd.to_parquet(
        df=df,
        path='abfs://{BLOB}/{FILE_NAME}.parquet',
        storage_options={'account_name': 'ACCOUNT_NAME',
                         'account_key': 'ACCOUNT_KEY'},
    )
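
For completeness, a minimal end-to-end sketch is below. The connection URI, table name, index column, partition count, and credentials are all placeholders, and the keyword names (table=, uri=) follow the older dask signature used in the question; newer dask releases rename these to table_name= and con=.

    import dask.dataframe as dd

    # Placeholder SQLAlchemy-style connection string for the source database.
    uri = 'mssql+pyodbc://USER:PASSWORD@SERVER/DATABASE?driver=ODBC+Driver+17+for+SQL+Server'

    # Lazily read the table into N partitions, indexed on a sortable column.
    df = dd.read_sql_table(table='TABLE', uri=uri, index_col='field', npartitions=8)

    # Write the partitions as parquet files to the Blob container via adlfs.
    dd.to_parquet(
        df=df,
        path='abfs://{BLOB}/{FILE_NAME}.parquet',
        storage_options={'account_name': 'ACCOUNT_NAME',
                         'account_key': 'ACCOUNT_KEY'},
    )

If you'd rather not pass the account key directly, adlfs also accepts other authentication options in storage_options (for example a connection string or SAS token); check the adlfs documentation for the exact option names supported by your version.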