On 23/11/16 01:33 PM, Jason Gunthorpe wrote:
On Wed, Nov 23, 2016 at 02:58:38PM -0500, Serguei Sagalovitch wrote:
> We do not want to have "highly" dynamic translation due to
> performance cost. We need to support "overcommit" but would
> like to minimize impact. To support RDMA MRs for GPU/VRAM/PCIe
> device memory (which is a must) we need to either globally force
> pinning for the scope of "get_user_pages()" / "put_pages" or have
> special handling for RDMA MRs and similar cases.
As I said, there is no possible special handling. Standard IB hardware
does not support changing the DMA address once an MR is created. Forget
about doing that.
Yeah, that's essentially the point I was trying to make. Not to mention
all the other unrelated hardware that can't DMA to an address that might
Only ODP hardware allows changing the DMA address on the fly, and it
works at the page table level. We do not need special handling for
I am aware of ODP but, as noted by others, it doesn't provide a general
solution to the points above.
Like I said, this is the direction the industry seems to be moving
so any solution here should focus on VMAs/page tables as the way to link
the peer-peer devices.
Yes, this was the appeal to us of using ZONE_DEVICE.
To me this means at least items #1 and #3 should be removed from
It's also worth noting that #4 makes use of ZONE_DEVICE (#2) so they are
really the same option. iopmem is really just one way to get BAR
addresses to user-space while inside the kernel it's ZONE_DEVICE.