There has been a lot of activity lately dealing with abstracting SPDK's dependencies to allow porting SPDK to additional platforms and frameworks. I think we'll end up talking about this extensively at the summit this week. The ability to use SPDK in a variety of environments is very important to us, so I look forward to the discussions. I wanted to write a basic summary of the current state of SPDK's dependencies for everyone to have, and so that those who are unable to attend the summit can provide feedback. I'll try to update this thread with the results of the discussions at the summit for everyone's benefit.
We've already eliminated all SPDK dependencies except for two: DPDK and POSIX.
SPDK already has a mechanism for abstracting away the "environment" it is running in. We do this by moving all calls to dependencies to the "env" library, whose interface is defined by include/spdk/env.h. The build system allows users to point to an alternate implementation of the env library if required. SPDK's default implementation of env is based on DPDK and runs in Linux and FreeBSD user space.
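To make the idea concrete, here is a minimal sketch of what a link-time swappable environment layer can look like. The names `demo_env_zmalloc` and `demo_env_free` are hypothetical and are not the functions declared in include/spdk/env.h. The POSIX-backed implementation shown is only an assumption for illustration: memory from `posix_memalign` is not pinned and has no known physical address, so it is not suitable for real DMA.

```c
#define _POSIX_C_SOURCE 200112L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* ---- Interface: the only thing the rest of the code base sees. ----
 * Hypothetical names, for illustration only. */
void *demo_env_zmalloc(size_t size, size_t align);
void demo_env_free(void *buf);

/* ---- One possible implementation, backed by plain POSIX. ----
 * A different environment would supply its own definitions of the
 * two functions above and link them in instead of this file. */
void *demo_env_zmalloc(size_t size, size_t align)
{
    void *buf = NULL;

    /* posix_memalign needs a power-of-two multiple of sizeof(void *). */
    if (align < sizeof(void *)) {
        align = sizeof(void *);
    }
    if (posix_memalign(&buf, align, size) != 0) {
        return NULL;
    }
    memset(buf, 0, size);
    return buf;
}

void demo_env_free(void *buf)
{
    free(buf);
}
```

Callers would use only the two declarations, for example `void *p = demo_env_zmalloc(4096, 4096);`, so replacing the implementation does not touch any calling code.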
DPDK is a very large toolkit, of course, but until just recently SPDK only used DPDK's 'eal' module (Environment Abstraction Layer). The original goal of the SPDK env library was just to abstract all uses of DPDK's eal, but not all uses of POSIX APIs. We made some progress in this area, but there are a number of remaining direct calls to DPDK as of this writing. Fortunately, the wider community has really come through. There are at least two pull requests out from two independent users that mostly finish the job. We'll work through those pull requests and try to get them merged shortly. The only part of the code that will continue to explicitly depend on DPDK will be the newly released vhost-scsi target because that relies on DPDK vhost infrastructure. This infrastructure is outside of DPDK's eal and is quite extensive. We feel this is acceptable because building vhost in SPDK is optional - you can just turn it off to eliminate the dependency.
There are a number of legitimate reasons to want to remove the DPDK dependency from SPDK. For instance, DPDK dictates a particular memory management model, requires a specific threading model, and performs PCI device management that may conflict with other libraries. However, DPDK provides tremendous value to SPDK in a number of areas, especially by implementing memory allocation suitable for DMA in user space. Memory management is a major challenge with all sorts of hidden traps, so we see significant value in using a production-tested library here. Re-implement the env without DPDK at your own risk!
Abstracting away our POSIX dependency is a similar but larger challenge. While we foresaw the need for people to replace DPDK, we didn't see any use case for dropping POSIX. We were definitely wrong though - people are using SPDK in all sorts of environments that we didn't predict, from embedded firmware to operating system kernels and beyond. So we'll need to address this as time goes on. I think this will result in further expansion of the 'env' library. Fortunately, we don't require too much from POSIX - mostly pthreads, the C standard library, and maybe a couple of other miscellaneous headers.
There are other less obvious dependencies in SPDK. For instance, it also depends on a relatively recent version of gcc or clang. This is mostly a practical concern and not so much by intention. Our continuous integration system (which I'll talk about at the summit - exciting developments here) currently consists of 4 machines, 3 of which are running Fedora Linux 25 while the other is running FreeBSD 11.0. These are very much up to date systems, so we tend not to catch bugs on older systems today. We also use cutting edge C features and tools, so we'll probably over time need to select a particular minimum version of the C standard that we're going to develop to.
I'm also concerned with the scope of the env library and I'd like to work as hard as possible to limit the size of the API. This is for two reasons - first and foremost to minimize the number of calls to these wrapper functions so that it doesn't impact performance. Performance really is everything and we can't sacrifice that. This isn't shaping up to be much of a problem so far because very few calls to our dependencies happen within the I/O path. Secondarily, I'd like to make it as straightforward as possible for users to re-implement the env. The best way to make it easy to do that is to keep the API small.
I look forward to the discussions at the summit this week. We'd love to hear about all of the different environments, platforms, and frameworks where SPDK is being used and how we can make it easier to integrate SPDK wherever it is valuable.
I tried db_bench with RocksDB on blobfs, but didn't get the performance improvement I expected.
Compared with NO_SPDK, insert increased by 70% and readwrite by 28%, but overwrite decreased by 38% and randread by 48%.
(The attached picture shows the detailed results.)
Is this a normal result, or is there a way to get better results by adjusting some parameters?
PS, here is the environment:
Storage: P3700, OS: CentOS 7.3, gcc: 6.3, Keys: 16B, Values: 1000B, Entries: 500M
When I started using spdk, the program failed to start.
The output is as follows:
EAL: Detected 24 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: Can only reserve 424 pages from 448 requested
Current CONFIG_RTE_MAX_MEMSEG=256 is not enough
Please either increase it or request less amount of memory.
PANIC in rte_eal_init():
Cannot init memory
Restarting the server solves the problem.
Are there any other solutions?
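One thing that can be checked before starting the application is how many hugepages the kernel reports as allocated and free. The log above shows EAL reserving only 424 of the 448 requested pages, and the fact that a reboot helps would be consistent with fragmented hugepage memory, although that is an assumption and not something the log states. Below is a minimal sketch of a helper (the name `meminfo_value` is hypothetical) that extracts a counter such as `HugePages_Free` from text in the format of /proc/meminfo.

```c
#include <stdio.h>
#include <string.h>

/* Return the number following "key:" in /proc/meminfo-style text,
 * or -1 if the key is not present or has no numeric value.
 * The key must match a whole field name at the start of a line. */
long meminfo_value(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long value;

            if (sscanf(line + klen + 1, "%ld", &value) == 1) {
                return value;
            }
            return -1;
        }
        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line != NULL) {
            line++;
        }
    }
    return -1;
}
```

In practice the text would come from reading /proc/meminfo into a buffer. Comparing `HugePages_Total` and `HugePages_Free` against the number of pages the application requests shows whether enough pages are available at all, but it does not show whether they are contiguous.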
I don't think so, but I want to ask: can connected target devices be treated as local NVMe devices, so that SPDK can work as an initiator? Is there any near-term plan to support an SPDK host initiator?
I appreciate your response.
On a side note: when can we get the SPDK conference presentations?
Vishal and I put together some benchmarks using the new Intel Optane SSD DC
P4800X with SPDK. Take a look:
Feedback and questions are welcome on this thread or in IRC (#spdk on FreeNode).
Is it possible to make the transfer size of the iSCSI initiator and target configurable? Right now MAXRECVDATASEGMENTLENGTH is 65536. If I change this to 4096, I still see the initiator requesting more than 4k. Are there any other parameters that I should change at the target level so that read and write requests are exactly the block size? The block size that I use is 4096.
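As background for this question: in the iSCSI specification, MaxRecvDataSegmentLength caps the data segment of a single PDU. It does not by itself cap the transfer length of a SCSI command, so a larger request is simply carried in several PDUs. A minimal sketch of that arithmetic follows; the helper name `data_pdus_needed` is hypothetical and is not part of SPDK.

```c
#include <stdint.h>

/* Number of data PDUs needed to carry transfer_len bytes when each
 * PDU's data segment is limited to max_seg_len bytes.
 * This is a ceiling division; returns 0 if max_seg_len is 0. */
uint32_t data_pdus_needed(uint32_t transfer_len, uint32_t max_seg_len)
{
    if (max_seg_len == 0) {
        return 0;
    }
    /* Widen to 64 bits so the addition cannot overflow. */
    return (uint32_t)(((uint64_t)transfer_len + max_seg_len - 1) / max_seg_len);
}
```

For example, a 65536-byte read with a 4096-byte segment limit would be split into `data_pdus_needed(65536, 4096)`, which is 16 PDUs, while the SCSI command itself still requests 64 KiB.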