Okay, so clearly this needs a kernel-side, NVMe-specific allocator
and locking so users don't step on each other.
Yup, ideally. That's why device dax isn't ideal for this application: it
doesn't provide any way to prevent users from stepping on each other.
Or as Christoph says some kind of general mechanism to get these
Yeah, I imagine a general allocate from BAR/region system would be very
Ah, I see.
As a first draft I'd stick with some kind of API built into the
/dev/nvmeX that backs the filesystem. The user app would fstat the
target file, open /dev/block/MAJOR(st_dev):MINOR(st_dev), do some
ioctl to get a CMB mmap, and then proceed from there.
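
A rough userspace sketch of that flow, to make the steps concrete. Only
the fstat and /dev/block/MAJOR:MINOR lookup uses existing interfaces.
The ioctl request, its argument struct, and the mmap offset convention
(everything with "hypothetical" in the name) are made up for
illustration; no such NVMe ioctl exists today.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor() on Linux/glibc */
#include <sys/types.h>
#include <unistd.h>

/* HYPOTHETICAL: argument for an imagined "allocate CMB" ioctl. */
struct cmb_alloc_hypothetical {
    uint64_t len;          /* in:  bytes of CMB requested */
    uint64_t mmap_offset;  /* out: offset to pass to mmap() */
};

/* HYPOTHETICAL: request number invented for this sketch only. */
#define CMB_ALLOC_HYPOTHETICAL _IOWR('N', 0x7f, struct cmb_alloc_hypothetical)

/*
 * Step 1 and 2: fstat the target file and build the path of the block
 * device that backs its filesystem. Returns 0 on success, -1 on error.
 */
int backing_blockdev_path(int file_fd, char *out, size_t outlen)
{
    struct stat st;
    int n;

    assert(out != NULL);
    if (fstat(file_fd, &st) != 0)
        return -1;
    n = snprintf(out, outlen, "/dev/block/%u:%u",
                 major(st.st_dev), minor(st.st_dev));
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Step 3 and 4: open the block device, ask for a CMB region through the
 * HYPOTHETICAL ioctl, and mmap it. On a current kernel the ioctl simply
 * fails and MAP_FAILED is returned.
 */
void *map_cmb_hypothetical(const char *blkdev, uint64_t len)
{
    struct cmb_alloc_hypothetical req;
    void *p = MAP_FAILED;
    int fd = open(blkdev, O_RDWR);

    if (fd < 0)
        return MAP_FAILED;
    memset(&req, 0, sizeof(req));
    req.len = len;
    if (ioctl(fd, CMB_ALLOC_HYPOTHETICAL, &req) == 0)
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                 fd, (off_t)req.mmap_offset);
    close(fd);  /* an established mapping outlives the descriptor */
    return p;
}
```

A caller would open the target file, pass the fd to
backing_blockdev_path(), and hand the resulting path to
map_cmb_hypothetical(). Whether the allocation should live as long as
the mapping or as long as the open descriptor is a design question this
sketch does not answer.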
When that is all working kernel-side, it would make sense to look at a
more general mechanism that could be used unprivileged?
That makes a lot of sense to me. I suggested mmapping the char device
because it's really easy, but I can see that an ioctl on the block
device does seem more general and device agnostic.
This is similar to the GPU issues too. On NVMe you don't need to pin
the pages, you just need to lock that VMA so it doesn't get freed from
the NVMe CMB allocator while the IO is running.
Probably in the long run the get_user_pages() call is going to have to
be pushed down into drivers. Future MMU coherent IO hardware also does
not need the pinning or other overheads.