On Sun, Dec 04, 2016 at 07:23:00AM -0600, Stephen Bates wrote:
> This has been a great thread (thanks to Alex for kicking it off) and I
> wanted to jump in and maybe try and put some summary around the
> discussion. I also wanted to propose we include this as a topic for LSF/MM
> because I think we need more discussion on the best way to add this
> functionality to the kernel.
>
> As far as I can tell the people looking for P2P support in the kernel fall
> into two main camps:
>
> 1. Those who simply want to expose static BARs on PCIe devices that can be
> used as the source/destination for DMAs from another PCIe device. This
> group has no need for memory invalidation and is happy to use
> physical/bus addresses and not virtual addresses.

I didn't think there was much on this topic except for the CMB
thing.. Even that is really a mapped kernel address..

> I think something like the iopmem patches Logan and I submitted
> come close to addressing use case 1. There are some issues around
> routability but based on feedback to date that does not seem to be a
> show-stopper for an initial inclusion.

If it is kernel only with physical addresses we don't need a uAPI for
it, so I'm not sure #1 is at all related to iopmem.
Most people who want #1 probably can just mmap
/sys/../pci/../resourceX to get a user handle to it, or pass around
__iomem pointers in the kernel. This has been asked for before with
I'm still not really clear what iopmem is for, or why DAX should ever
be involved in this..
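
To make the mmap route above concrete, here is a minimal userspace sketch.
The helper names map_bar() and unmap_bar() are hypothetical (they are not
from any posted patch), and the path of the resourceX file is left to the
caller. It assumes the size reported by fstat() on the file is the BAR
length, and it does nothing about routability or access width.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map an entire PCI BAR through its sysfs
 * resource file. 'path' is supplied by the caller. Returns NULL on
 * failure; on success stores the mapping length in *len_out.
 */
volatile uint32_t *map_bar(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd;

    fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    /* Assumption: st_size of the resource file equals the BAR size. */
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
             MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the fd is closed */
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return (volatile uint32_t *)p;
}

/* Hypothetical helper: drop a mapping created by map_bar(). */
void unmap_bar(volatile uint32_t *bar, size_t len)
{
    munmap((void *)bar, len);
}
```

The pointer is volatile so the compiler does not elide or merge accesses to
device memory. A handle like this only covers CPU access from user space;
it does not by itself give another device a bus address to DMA to.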

> For use-case 2 it looks like there are several options and some of
> them (like HMM) have been around for quite some time without gaining
> acceptance. I think there needs to be more discussion on this use case and
> it could be some time before we get something upstreamable.

AFAIK, hmm makes parts easier, but isn't directly addressing this
I think you need to get ZONE_DEVICE accepted for non-cachable PCI BARs
as the first step.
From there it is pretty clear the DMA API needs to be updated to
support that use, and work can be done to solve the various problems
there on the basis of using ZONE_DEVICE pages to figure out routing to
the PCI-E end points.