On Thu, Nov 24, 2016 at 12:40:37AM +0000, Sagalovitch, Serguei wrote:
> On Wed, Nov 23, 2016 at 02:11:29PM -0700, Logan Gunthorpe wrote:
> > Perhaps I am not following what Serguei is asking for, but I
> > understood the desire was for a complex GPU allocator that could
> > migrate pages between GPU and CPU memory under control of the GPU
> > driver, among other things. The desire is for DMA to continue to work
> > even after these migrations happen.
>
> The main issue is how to solve use cases where p2p is
> requested/initiated via CPU pointers that could point to a
> non-system memory location, e.g. VRAM.

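For concreteness, the use case being described looks roughly like this
from user space. This is only an illustrative sketch: "/dev/mygpu", the
mmap offset, and the output file are made-up stand-ins for however a
real GPU driver exposes a VRAM-backed mapping.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 1 << 20;
	int gpu = open("/dev/mygpu", O_RDWR);	/* hypothetical device node */
	int file = open("data.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
	void *vram;

	if (gpu < 0 || file < 0) {
		perror("open");
		return 1;
	}

	/* CPU pointer that actually refers to VRAM, not system RAM. */
	vram = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, gpu, 0);
	if (vram == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/*
	 * The application just hands the pointer to a kernel path that
	 * will try get_user_pages()/DMA on it; this is where the
	 * "__user pointer that is really VRAM" problem starts.
	 */
	if (pwrite(file, vram, len, 0) < 0)
		perror("pwrite");

	munmap(vram, len);
	close(file);
	close(gpu);
	return 0;
}
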
Okay, but your list is conflating a whole bunch of problems..

 1) How to go from a __user pointer to a p2p DMA address
    (a rough sketch of today's path follows this list)
    a) How to validate, set up the IOMMU, and maybe in the worst case
       bounce-buffer these p2p DMAs
 2) How to allow drivers (ie the GPU allocator) to dynamically
    remap pages in a VMA to/from p2p DMA addresses
 3) How to expose an uncachable p2p DMA address to user space via mmap
    so that "get_user_pages" works transparently, similar to how it
    is/was done for the "DAX Device" case. Unfortunately, based on my
    understanding, the "DAX Device" implementation deals only with
    permanently "locked" memory (fixed location), unrelated to the
    "get_user_pages"/"put_page" scope, which doesn't satisfy the
    requirement of "eviction"/"moving" memory while keeping the CPU
    address intact.

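To make problem 1 concrete, here is roughly what the existing "__user
pointer -> DMA address" path looks like, written against current
in-tree helpers (the GUP functions and their signatures have changed a
few times across kernel versions, and map_user_buffer() itself is just
an invented example, not an existing API). Every step assumes the pages
are ordinary struct-page-backed system RAM, which is exactly the
assumption a VRAM-backed VMA breaks.

#include <linux/dma-mapping.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

/* Invented example: pin a user buffer and DMA-map it for "dev". */
static int map_user_buffer(struct device *dev, unsigned long uaddr,
			   size_t len, struct sg_table *sgt,
			   enum dma_data_direction dir)
{
	unsigned int npages = DIV_ROUND_UP(offset_in_page(uaddr) + len,
					   PAGE_SIZE);
	struct page **pages;
	int pinned, ret;

	pages = kvmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* 1) Resolve the user VA range to (pinned) struct pages. */
	pinned = pin_user_pages_fast(uaddr, npages, FOLL_WRITE, pages);
	if (pinned != npages) {
		ret = pinned < 0 ? pinned : -EFAULT;
		goto out;
	}

	/* 2) Build a scatterlist and let the DMA API / IOMMU map it. */
	ret = sg_alloc_table_from_pages(sgt, pages, npages,
					offset_in_page(uaddr), len,
					GFP_KERNEL);
	if (ret)
		goto out;

	ret = dma_map_sgtable(dev, sgt, dir, 0);
	if (ret)
		sg_free_table(sgt);
out:
	/* A real implementation would keep "pages" around so the caller
	 * can unpin after dma_unmap_sgtable()/sg_free_table(). */
	if (ret && pinned > 0)
		unpin_user_pages(pages, pinned);
	kvfree(pages);
	return ret;
}
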
Hurm, isn't that issue with DAX only to do with being coherent with
the page cache?

A GPU allocator would not use the page cache; it would have to
construct VMAs some other way.

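A rough sketch of what "construct VMAs some other way" can look like:
a driver mmap handler that installs uncached PFN mappings over a BAR at
fault time, with no page cache involvement. All mydrv_* names are
invented, and details such as the vm_flags handling and fault-handler
signature vary between kernel versions.

#include <linux/fs.h>
#include <linux/mm.h>

/* Invented example object: a buffer currently resident in VRAM. */
struct mydrv_bo {
	phys_addr_t vram_phys;	/* current physical location in the BAR */
	size_t size;
};

static vm_fault_t mydrv_vm_fault(struct vm_fault *vmf)
{
	struct mydrv_bo *bo = vmf->vma->vm_private_data;
	unsigned long pfn = (bo->vram_phys >> PAGE_SHIFT) + vmf->pgoff;

	/* Raw PFN insert: no struct page, no page cache. */
	return vmf_insert_pfn(vmf->vma, vmf->address, pfn);
}

static const struct vm_operations_struct mydrv_vm_ops = {
	.fault = mydrv_vm_fault,
};

static int mydrv_mmap(struct file *filp, struct vm_area_struct *vma)
{
	vma->vm_ops = &mydrv_vm_ops;
	vma->vm_private_data = filp->private_data;
	vma->vm_flags |= VM_PFNMAP | VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
	return 0;
}

/*
 * To migrate the object, the driver can zap the CPU mappings with
 * unmap_mapping_range() and let subsequent faults insert PFNs for the
 * new location, so the user-space address stays the same (problem 2).
 */
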
My understanding is that it will not solve the RDMA MR issue, where the
mapping could stay alive for the whole application lifetime, but (a) it
will not make the RDMA MR case worse and (b) it should be enough for all
other cases where "get_user_pages"/"put_page" is controlled by the kernel.

Right. There is no solution to the RDMA MR issue on old hardware. Apps
that are using GPU+RDMA+old hardware will have to use short-lived MRs
and pay that performance cost, or give up on migration.

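For reference, the short-lived MR workaround looks something like the
sketch below on the verbs side; it assumes a protection domain and a
connected QP already exist, and post_rdma_write_and_wait() is only a
stub standing in for the real ibv_post_send()/CQ-polling code.

#include <stddef.h>
#include <infiniband/verbs.h>

/* Stub: the real version would ibv_post_send() an RDMA write using
 * mr->lkey and poll the CQ for completion. */
static int post_rdma_write_and_wait(struct ibv_qp *qp, struct ibv_mr *mr,
				    void *addr, size_t len)
{
	(void)qp; (void)mr; (void)addr; (void)len;
	return 0;
}

/*
 * Short-lived MR pattern: register right before the transfer and
 * deregister right after, so the buffer is only pinned (and migration
 * only blocked) for the duration of the I/O. The price is an
 * ibv_reg_mr()/ibv_dereg_mr() pair per transfer instead of one
 * long-lived registration at application start-up.
 */
static int do_transfer(struct ibv_pd *pd, struct ibv_qp *qp,
		       void *buf, size_t len)
{
	struct ibv_mr *mr;
	int ret;

	mr = ibv_reg_mr(pd, buf, len,
			IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
	if (!mr)
		return -1;

	ret = post_rdma_write_and_wait(qp, mr, buf, len);

	ibv_dereg_mr(mr);
	return ret;
}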