On Feb 15, 2018, at 8:02 AM, Luse, Paul E
Yeah, awesome job working on these. Some replies from me below, others I'm sure will
have thoughts as well. We're also starting weekly community calls starting the week
after next so can you always put some of these up on the board (details coming soon) and
From: Stephen Bates [mailto:email@example.com]
Sent: Thursday, February 15, 2018 7:33 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Cc: Harris, James R <james.r.harris(a)intel.com>; Verkamp, Daniel
<daniel.verkamp(a)intel.com>; Stojaczyk, DariuszX
<dariuszx.stojaczyk(a)intel.com>; Chang, Cunyin <cunyin.chang(a)intel.com>; Luse,
Paul E <paul.e.luse(a)intel.com>
Subject: SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
Hi SPDK Team
I wanted to start by thanking everyone for the great feedback on the first set of CMB
WDS/RDS enablement patches which went into master over the past few days (e.g. ). There
are already some reported suspected bug sightings so with luck those patches will mature
Thanks for your work on this!
Now I wanted to pick the communities brains' on a the best way to approach a couple
1. Documentation. I would like to update the API documentation (which I believe is
auto-generated) as well as add a new file in docs/ discussing some of the issues setting
up Peer-2-Peer DMAs (which cmb_copy does). Any tips for how best to do this?
All docs are done via patches. There's the API docs which you can easily see
examples of in other code comments, just submit a patch. For http://www.spdk.io/doc/
those are also done via public patches, her's an example of one of those:
You can also do a blog post on the website via a
patch however I don't think we document how to do that anywhere, you set everything up
the same as the regular repo but git info is:
url = https://review.gerrithub.io/spdk/spdk.github.io
fetch = +refs/heads/*:refs/remotes/origin/*
url = https://review.gerrithub.io/spdk/spdk.github.io
push = HEAD:refs/for/master
If you want to do one, let me or anyone know and we can provide more info
2. CI Testing. Upstream QEMU has support in its NVMe model for SSDs with WDS/RDS CMBs 
(I should know as I added that support ;-)). Can we discuss adding this to the CI pool so
we can do some form of emulated P2P testing? In addition is there interest in real HW
testing? If so we could discuss adding some of our HW to the pool (but as a lowly startup
I think donating HW is beyond our budget right now).
The community CI pool isn't really open to adding new HW due to limited resources.
You, however, are welcome to setup your own CI system at your site and tie into the
community GerritHub so all patches would run on your CI as well and reports would be
visible on the main page but not 'count' as voting for merge.
We are already using emulated QEMU NVMe devices in the test pool - so in lieu of adding
new HW, we could get at least some level of CMB WDS/RDS testing there. You could look at
test/lib/nvme/nvme.sh initially for where to plumb something in. Of course, whatever
tests get added need to know the difference between failing due to lack of CMB v. a real
Adding some sort of nightly test with VMs is very doable. There is a lot of
restructuring going on with the tests right now but you can propose a patch at any time.
All of the tests are public and in the same repo. If you're up for doing this, go for
it. Any of us can provide some starting points/guidelines/tips if so. There's some
doc written up but not posted yet (not final) that are part of the test restructuring I
just mentioned. I can share some of those on the dist list early too if you're
3. VFIO Support. Right now I have only tested with UIO. VFIO adds some interesting issues
around BAR address translations, PCI ACS and PCI ATS.
Good point. Testing this will be predicated on getting real HW. For now, we should
probably at least have a warning message emitted when we find a CMB-enabled SSD with vfio
4. Fabrics Support. An obvious extension of this work is to allow other devices (aside
from NVMe SSDs) to initiate DMAs to the NVMe CMBs. The prime candidate for that is a RDMA
capable NIC which ties superbly well into NVMe over Fabrics. I would like to start a
discussion on how best to approach this.
Step 1 would just be testing this I/O path to confirm it works. You’ve already added the
spdk_mem_register() calls which should register the CMB region with each RDMA NIC. So
first make sure that works. rxe might be OK to start but really you’ll want a real RDMA
NIC. Then you could read to a CMB buffer and write to a remote NVMe namespace using the
SPDK NVMe-oF driver. Then read it back into a different CMB buffer, etc.
Step 2 would be a lot more involved. In an ideal world, there’s enough CMB space to
replace all of the existing host memory buffer pools used by the NVMe-oF target. If not -
well, that’s where a lot more work will be needed. :-)
Is Trello the right place to enter and discuss these topics? Or is it OK to hash them out
on the mailing list? Or do the community have a better way of discussing these items?
Trello is good, this list is good. IRC is GREAT, haven't seen you there I don't
think freenode, #spdk and then there's the weekly con calls that I mentioned that will
be starting soon.
Yep - everything Paul said.
We also need a better CMB allocation scheme. What’s there currently was just to get CMB
working at some level but isn’t really functional (i.e. the free routine is a nop). A
full blown allocator is probably overkill at best - these regions are somewhat limited so
fragmentation can be a problem. There’s also no synchronization currently in
nvme_pci_ctrlr_alloc_cmb() to protect concurrent allocations on multiple threads. Until
that is ready, we will need to consider the CMB functionality as experimental and make
sure the docs reflect that.