In the past few weeks I received a report from a team using SPDK that was
experiencing a failure ("EAL: Cannot init memory") during early memory
initialization, resulting from a failed mmap() call. This occurred on a system with a
large amount of memory (> 180GB). With many thanks to Jim Harris, we learned that
bumping up Linux's vm.max_map_count value to a (much) larger value resolved the
problem. I'd like to inquire what other types of resource limits others in the SPDK
community are adjusting upward to account for large memory use. Has anyone adjusted such
parameters in conf files, e.g. /etc/sysctl.d/ or /etc/security/limits.d/ for the systems
in the SPDK CI's test pools, and if so, which parameters and respective values are you
using?
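
For anyone who wants to compare numbers, here is a minimal Python sketch that prints the current vm.max_map_count and the number of mappings a process holds. It assumes a Linux host with procfs mounted at /proc; the helper names (parse_max_map_count, count_mappings) are my own and are not part of SPDK or DPDK.

```python
from pathlib import Path

# Standard Linux procfs locations (assumption: procfs is mounted at /proc).
MAX_MAP_COUNT = Path("/proc/sys/vm/max_map_count")
SELF_MAPS = Path("/proc/self/maps")


def parse_max_map_count(text):
    """Parse the single integer found in /proc/sys/vm/max_map_count."""
    return int(text.strip())


def count_mappings(maps_text):
    """Count mappings: /proc/<pid>/maps has one non-empty line per mapping."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


if __name__ == "__main__":
    if MAX_MAP_COUNT.exists() and SELF_MAPS.exists():
        limit = parse_max_map_count(MAX_MAP_COUNT.read_text())
        used = count_mappings(SELF_MAPS.read_text())
        print(f"vm.max_map_count = {limit}")
        print(f"mappings held by this process = {used}")
    else:
        print("procfs entries not found; this sketch assumes Linux")
```

To inspect a running SPDK application rather than the script itself, the same count can be taken from /proc/<pid>/maps of that process and compared against the limit.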
Related to this topic, I've also heard reports of, and observed, ibv_reg_mr() failures on
large memory systems when launching SPDK's nvmf_tgt, and discovered that those
systems had relatively small memlock limits. I'm not sure how, if at all, this
impacts ibv_reg_mr() calls when the associated regions are backed by hugepages, which by
their nature are already pinned. If anyone here has any greater knowledge on this
specific topic, I'd be very grateful to learn more about this.
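
As a way to check what memlock limit a process actually sees, a small sketch using Python's standard resource module follows. It only reports the soft and hard RLIMIT_MEMLOCK values inherited by the process that runs it; it makes no claim about whether hugepage-backed registrations count against that limit, which is the open question above. The function names are hypothetical.

```python
import resource


def format_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"memlock soft limit: {format_limit(soft)}")
    print(f"memlock hard limit: {format_limit(hard)}")
```

Running this from the same shell or service environment that launches nvmf_tgt should show the limits the target would inherit, since rlimits are passed from parent to child.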
thanks much,
--
Lance Hartmann
lance.hartmann(a)oracle.com