I'm trying to diagnose a memory allocation error thrown by ibv_reg_mr() in software that I use. I suspect it is related to a known problem with some Mellanox InfiniBand cards, where the default maximum amount of memory that can be registered is about 2 GB (see FAQ #18 at http://www.open-mpi.org/faq/?category=openfabrics).
I would like to confirm unequivocally whether this is the case so that I can quickly negotiate a solution with my system administrators. Being unfamiliar with RDMA and InfiniBand, could someone suggest either (a) a simple program that registers arbitrary amounts of memory, so that I can trigger the error at the maximum allowed value, or (b) a way to determine how InfiniBand is currently configured, given that I do not have root access?
Thanks everyone!
Jason
You can read the parameters for the Mellanox InfiniBand HCA drivers from sysfs, and you don't need root access to do so. The parameters for module <modname> are found in /sys/module/<modname>/parameters/. Each parameter is exposed as a text pseudofile there, and its value can be read by simply reading the content of the file. You can even do that using standard Unix command-line tools.
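As a minimal sketch of reading those pseudofiles programmatically, the following Python snippet collects every readable parameter of a module into a dictionary. The function name read_module_params and its sysfs_root argument are illustrative choices, not part of any driver or library API; the only thing taken from the description above is the /sys/module/<modname>/parameters/ layout.

```python
from pathlib import Path


def read_module_params(modname, sysfs_root="/sys/module"):
    """Return {parameter_name: text_value} for a kernel module.

    An empty dict means the module is not loaded or exposes no parameters.
    `sysfs_root` is a hypothetical knob so the function can be pointed at
    a test directory instead of the real sysfs.
    """
    params_dir = Path(sysfs_root) / modname / "parameters"
    params = {}
    if not params_dir.is_dir():
        return params
    for entry in sorted(params_dir.iterdir()):
        try:
            # Each pseudofile holds the value as text, newline-terminated.
            params[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may not be readable by unprivileged users.
            continue
    return params


if __name__ == "__main__":
    for name, value in read_module_params("mlx4_core").items():
        print(f"{name} = {value}")
```

Run as a script on a node with the mlx4_core driver loaded, this should list the module's parameters without requiring root; on a machine without the module it prints nothing.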
For the mlx4_core module, the maximum amount of registrable memory is determined using the following formula:
max_reg = (1 << log_num_mtt) * (1 << log_mtts_per_seg) * PAGE_SIZE
For the ib_mthca module, the formula is:
max_reg = (num_mtt - fmr_reserved_mtts) * (1 << log_mtts_per_seg) * PAGE_SIZE
where:
num_mtt is the maximum number of memory translation table (MTT) segments per HCA;
log_num_mtt is the binary logarithm of num_mtt;
fmr_reserved_mtts is the number of MTT segments reserved for FMR;
log_mtts_per_seg is the binary logarithm of the number of MTT entries per segment;
PAGE_SIZE is the system page size, usually 4 KiB on most current platforms.

Each of these parameters (except PAGE_SIZE) can be read from its corresponding module directory in sysfs.
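The two formulas can be written out as a short Python sketch. The function names are made up for illustration, and the numbers in the example run are hypothetical values chosen to make the arithmetic easy to follow, not driver defaults; substitute the values you read from sysfs. With a 4 KiB page, log_num_mtt = 20 and log_mtts_per_seg = 3 give 2^20 * 2^3 * 2^12 = 2^35 bytes, i.e. 32 GiB.

```python
import os


def mlx4_max_reg(log_num_mtt, log_mtts_per_seg, page_size):
    """Maximum registrable memory in bytes for mlx4_core."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


def mthca_max_reg(num_mtt, fmr_reserved_mtts, log_mtts_per_seg, page_size):
    """Maximum registrable memory in bytes for ib_mthca."""
    return (num_mtt - fmr_reserved_mtts) * (1 << log_mtts_per_seg) * page_size


if __name__ == "__main__":
    # PAGE_SIZE is not a module parameter; ask the OS for it instead.
    page_size = os.sysconf("SC_PAGE_SIZE")

    # Hypothetical parameter values, for illustration only.
    limit = mlx4_max_reg(log_num_mtt=20, log_mtts_per_seg=3, page_size=page_size)
    print(f"mlx4_core limit: {limit / 2**30:.1f} GiB")
```

Comparing the result with the physical RAM of the node shows whether the registration limit, rather than the amount of installed memory, is the likely cause of the ibv_reg_mr() failure.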
It is possible that both modules are loaded. In this case, just do what Open MPI does: look for mlx4_core first and ib_mthca second.
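That lookup order can be sketched in a few lines of Python. The helper name pick_hca_module and the sysfs_root argument are assumptions made for this example; the only behaviour taken from the text above is the order of the two module names.

```python
from pathlib import Path


def pick_hca_module(sysfs_root="/sys/module",
                    candidates=("mlx4_core", "ib_mthca")):
    """Return the first candidate module with a parameters directory.

    Candidates are checked in order, so mlx4_core wins when both are
    loaded. Returns None if neither module is present.
    """
    for modname in candidates:
        if (Path(sysfs_root) / modname / "parameters").is_dir():
            return modname
    return None


if __name__ == "__main__":
    print(pick_hca_module())
```

A return value of None suggests that neither driver is loaded on the node, in which case the formulas above do not apply to that machine.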