Tags: python, dask, chunking, rapids, cudf

Explain Dask-cuDF behavior


I am trying to read and process an 8 GB CSV file using cudf. Reading the whole file at once fits neither in GPU memory nor in my RAM, which is why I use the dask_cudf library. Here is the code:

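import dask_cudf as dcf
import dask.dataframe as dd
import pandas as pd  # needed for pd.to_datetime below

exceptions = ["a", "b", "t", "c"]  # columns to drop after processing
# Read the CSV lazily in 256 MiB blocks (one partition per block)
x = dcf.read_csv("./data/all.csv", blocksize="256 MiB", sep=',', header=0, decimal='.', skip_blank_lines=True)
x["timestamp"] = dd.to_datetime(x["t"])
x = x.set_index("timestamp", sorted=False)
# Keep only rows from 2020-01-01 onwards
x = x.loc[pd.to_datetime("2020-01-01 00:00:00"):]
x = x.drop(columns=exceptions)
# Writes one output file per partition into ./data/
x.to_csv("./data/")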

It works and produces a bunch of CSV files such as 00.part, 01.part and so on. While the data is being processed, I can see in the task manager that GPU memory is in use. However, a cufile.log file appears in the project root directory with the following content:

25-01-2024 12:18:54:703 [pid=11847 tid=11862] ERROR 0:140 unable to load, liburcu-bp.so.6

25-01-2024 12:18:54:703 [pid=11847 tid=11862] ERROR 0:140 unable to load, liburcu-bp.so.1

25-01-2024 12:18:54:704 [pid=11847 tid=11862] WARN 0:168 failed to open /proc/driver/nvidia-fs/devcount error: No such file or directory

25-01-2024 12:18:54:704 [pid=11847 tid=11862] NOTICE cufio-drv:727 running in compatible mode

25-01-2024 12:18:54:704 [pid=11847 tid=11862] ERROR cufio-plat:98 cannot open path /sys/bus/pci/devices/0000:01:00.0/resource No such file or directory

25-01-2024 12:18:54:705 [pid=11847 tid=11862] WARN cufio-plat:431 GPU index 0 NVIDIA GeForce GTX 1060: Model Not Supported

25-01-2024 12:18:54:705 [pid=11847 tid=11862] WARN cufio-udev:168 failed in udev device create: /sys/bus/pci/devices/0000:01:00.0

25-01-2024 12:18:54:899 [pid=11847 tid=11862] ERROR cufio-topo-udev:431 no device entries present in platform topology

I have

Windows 10 22H2 (19045.3930)
WSL 2.0.9.0 with Ubuntu 22.04.3 LTS
Cuda compilation tools, release 12.0, V12.0.76 Build cuda_12.0.r12.0/compiler.31968024_0 (from windows console)
NVIDIA-SMI 546.65 (from wsl distro terminal)
Driver Version: 546.65 (from wsl distro terminal)
NVIDIA GeForce GTX 1060 6144MiB
cudf 23.12.01
dask-cudf 23.12.01

The CPU is also heavily loaded while the job runs. So is my GPU actually being used, or did it fall back to the CPU? Running this code took 16 minutes. The log says that it was running in some kind of compatibility mode and that my GPU is not supported. However, the docs state the requirement as:

NVIDIA Pascal™ or better with compute capability 6.0+

which is the case for the GTX 1060.


Solution

  • It sounds like your GPU is being utilized by dask-cudf, based on the GPU memory usage you mentioned.

    The log messages you're seeing are specific to GPUDirect Storage (GDS), a feature that lets the system accelerate I/O operations and avoid some CPU-GPU copying in favor of direct memory access from storage to the GPU. However, GDS is only supported on "Data Center and Quadro (desktop) cards with compute capability >6" (https://docs.nvidia.com/gpudirect-storage/release-notes/index.html#support-matrix). Your GeForce GTX 1060 has compute capability 6.1 (according to this chart: https://developer.nvidia.com/cuda-gpus#compute), but it is not supported for GDS because it is not a Data Center or Quadro card. The "running in compatible mode" notice means the cuFile library fell back to regular file I/O, with data staged through CPU memory. You'll still get GPU speedup for the computations, but not direct memory transfers from storage to GPU memory.

    The lack of GDS support should not be too much of a problem for performance, but the small memory size (6 GB) of your GTX 1060 card may be a significant limitation for the data sizes you can process. dask-cudf is a good way to chunk the problem into small enough sizes to fit in your GPU memory, but keep that in mind as you decide on the file/chunk sizes to process if you experience out-of-memory errors.
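
As a rough aid for picking a chunk size, here is a minimal sketch in plain Python (no GPU needed) that estimates how many partitions a given blocksize produces and whether one partition plausibly fits on the card. The helper name plan_partitions and the expansion factor of 4 are assumptions made up for illustration, not part of dask-cudf; the real peak memory depends on your column types and on operations such as set_index, so treat the result as a starting point and measure.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def plan_partitions(file_bytes, blocksize_bytes, gpu_bytes, expansion=4.0):
    """Estimate partition count and whether one partition fits on the GPU.

    `expansion` is an assumed rule-of-thumb multiplier for how much device
    memory a partition may need relative to its on-disk CSV size (parsed
    columns plus temporaries). It is a guess, not a measured value.
    """
    n_partitions = math.ceil(file_bytes / blocksize_bytes)
    est_peak_bytes = int(blocksize_bytes * expansion)
    return {
        "n_partitions": n_partitions,
        "est_peak_bytes": est_peak_bytes,
        "fits": est_peak_bytes <= gpu_bytes,
    }


# Numbers from the question, treating "8 GB" as 8 GiB and the card as 6 GiB.
plan = plan_partitions(8 * GIB, 256 * MIB, 6 * GIB)
print(plan["n_partitions"], plan["fits"])
```

If the estimate says a partition does not fit, or you hit out-of-memory errors in practice, try a smaller value for the blocksize argument of dcf.read_csv, as in the code from the question.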