I want to read a big zarr store from my MinIO (S3) server. However, all three methods I have tried crash:
import xarray as xr
import zarr
import hydrodata.configs.config as conf

# Method 1
# crash log: https://pastebin.com/vkM1M3VV
zarr_path = await conf.FS.open_async('s3://datasets-origin/usgs_streamflow_nldas_hourly.zarr')
zds = xr.open_dataset(zarr_path, engine='zarr')

# Method 2
# crash log: https://pastebin.com/fKKECf3U
zarr_path = conf.FS.get_mapper('s3://datasets-origin/usgs_streamflow_nldas_hourly.zarr')
wrapped_store = zarr.storage.KVStore(zarr_path)
zds = xr.open_zarr(wrapped_store)

# Method 3
# raises AttributeError: __enter__
with conf.FS.open_async('s3://datasets-origin/usgs_streamflow_nldas_hourly.zarr') as zarr_path:
    zds = xr.open_dataset(zarr_path)
And this is conf.FS:
FS = s3fs.S3FileSystem(
    client_kwargs={"endpoint_url": MINIO_PARAM["endpoint_url"]},
    key=MINIO_PARAM["key"],
    secret=MINIO_PARAM["secret"],
    use_ssl=False,
)
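As a sanity check, listing the store with the same filesystem shows whether the credentials and endpoint work at all, independently of xarray and zarr (a minimal sketch using the same store path as above):

import hydrodata.configs.config as conf

# If this listing fails, the problem is in the S3/MinIO configuration,
# not in xarray or zarr.
print(conf.FS.ls('s3://datasets-origin/usgs_streamflow_nldas_hourly.zarr'))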
So how can I fix these methods and read the data correctly?
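For reference, the pattern I understand to be the usual one with s3fs and xarray is to pass the mapper from get_mapper directly to xr.open_zarr, without the KVStore wrapper (a minimal sketch using the same store path and conf.FS as above):

import xarray as xr
import hydrodata.configs.config as conf

# fsspec mappers are valid zarr stores, so no extra wrapping should be needed.
store = conf.FS.get_mapper('s3://datasets-origin/usgs_streamflow_nldas_hourly.zarr')
zds = xr.open_zarr(store)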
———————————————————————————————————
This is my crash report for Method 2:
name = 'xarray.core.daskmanager'
import_ = <function _gcd_import at 0x7fe2aabbb400>
> ???
E ModuleNotFoundError: No module named 'xarray.core.daskmanager'
However, I had already run pip install xarray[complete] and conda install -c conda-forge xarray dask netCDF4 bottleneck beforehand, so where is the problem?
This is my pip list: https://pastebin.com/BUbcNqtT
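In case it helps, a quick way to check whether the installed xarray actually provides that module and which xarray/dask versions are in use (a small diagnostic sketch; xarray.core.daskmanager only exists in some xarray releases, so a mixed or partially upgraded pip/conda environment can leave it missing):

import importlib.util
import dask
import xarray as xr

print("xarray:", xr.__version__)
print("dask:", dask.__version__)
# Returns None if the installed xarray does not ship this module.
print(importlib.util.find_spec("xarray.core.daskmanager"))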
Update: I repeated the whole procedure on another computer and could not reproduce the problem there, so it appears to be specific to my original environment. I am closing the question.