I tested the BlobFS asynchronous API by using the SPDK event framework to run multiple tasks, each of which writes one file.
But it doesn't work: spdk_file_write_async() reported an error when resizing the file.
The call stack looks like this:
spdk_file_write_async() -> __readwrite() -> spdk_file_truncate_async() -> spdk_blob_resize()
The resize operation must be done on the metadata thread, i.e. the thread that called spdk_fs_load(), so only the task dispatched to the metadata CPU core works.
That is to say, only one thread can be used to write files. That makes the API hard to use and may cause performance problems.
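To make the constraint concrete, here is a minimal single-process sketch in C of the pattern I think is implied: workers do not resize a file directly but queue a request that only the metadata thread executes when it polls. Everything prefixed demo_ is a hypothetical stand-in and not SPDK code. In SPDK the queueing step would presumably be spdk_thread_send_msg(), and I have not confirmed that this resolves the BlobFS error.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 16

typedef void (*demo_msg_fn)(void *ctx);

/* Hypothetical stand-in for a thread: a fixed-size ring of pending messages. */
struct demo_thread {
	demo_msg_fn fn[DEMO_RING_SIZE];
	void *ctx[DEMO_RING_SIZE];
	size_t head; /* next message to run */
	size_t tail; /* next free slot */
};

/* Queue fn(ctx) to run later on thread t. Returns 0, or -1 if the ring is full. */
int
demo_send_msg(struct demo_thread *t, demo_msg_fn fn, void *ctx)
{
	if (t->tail - t->head == DEMO_RING_SIZE) {
		return -1;
	}
	t->fn[t->tail % DEMO_RING_SIZE] = fn;
	t->ctx[t->tail % DEMO_RING_SIZE] = ctx;
	t->tail++;
	return 0;
}

/* Run all queued messages. Meant to be called only by the owner of t.
 * Returns the number of messages executed. */
size_t
demo_poll(struct demo_thread *t)
{
	size_t n = 0;

	while (t->head != t->tail) {
		size_t slot = t->head % DEMO_RING_SIZE;

		t->head++;
		t->fn[slot](t->ctx[slot]);
		n++;
	}
	return n;
}

/* Hypothetical file and resize request, standing in for the blob metadata. */
struct demo_file {
	uint64_t length;
};

struct demo_resize_req {
	struct demo_file *file;
	uint64_t new_length;
};

/* Grows the file. In this model it only ever runs inside demo_poll(),
 * i.e. on the metadata thread. */
void
demo_do_resize(void *ctx)
{
	struct demo_resize_req *req = ctx;

	if (req->new_length > req->file->length) {
		req->file->length = req->new_length;
	}
}
```

A worker would call demo_send_msg(&md_thread, demo_do_resize, &req) and continue; the file length changes only once the metadata thread calls demo_poll(&md_thread). The metadata update stays serialized on one thread while the workers keep issuing requests.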
Does anyone know more about this?
Thanks very much.
I’ve seen a lot of cases recently where -1 votes from the test pool have been removed from a patch due to a failure unrelated to the patch, but then nothing was filed in GitHub for that failure. The filing in GitHub could be a new issue, or a comment on an existing issue.
Please make those GitHub updates a priority. It’s the only way the project can understand the frequency of those intermittent failures and work together to get them fixed. If you’re not sure if a failure has been seen before, search GitHub issues with the “Intermittent Failure” label, or ask on Slack if anyone else has seen the issue. There is no harm in filing a new issue that may be a duplicate – we can always clean these up later during the next bug scrub meeting. The important thing is that we get the failure tracked.
With this message I wanted to update the SPDK community on the state of the VPP socket abstraction as of the SPDK 19.07 release.
At this time there does not seem to be a clear efficiency improvement with VPP. There is no further work planned on SPDK and VPP integration.
As some of you may remember, the SPDK 18.04 release introduced support for alternative socket types. Along with that release, Vector Packet Processing (VPP)<https://wiki.fd.io/view/VPP> 18.01 was integrated with SPDK by expanding the socket abstraction to use the VPP Communications Library (VCL). The TCP/IP stack in VPP<https://wiki.fd.io/view/VPP/HostStack> was in its early stages back then and has seen improvements throughout the last year.
To make better use of VPP capabilities, and following fruitful collaboration with the VPP team, the implementation in SPDK 19.07 was changed from VCL to the VPP Session API from VPP 19.04.2.
The VPP socket abstraction has met some challenges due to the inherent design of both projects, in particular related to running separate processes and memory copies.
Seeing improvements over the original implementation was encouraging, yet when measured against the posix socket abstraction (taking the entire system into consideration, i.e. both processes), the results are comparable. In other words, at this time neither socket abstraction shows a clear benefit in terms of CPU efficiency or IOPS.
With this message I just wanted to update the SPDK community on the state of the socket abstraction layers as of the SPDK 19.07 release. Each SPDK release brings improvements to the abstraction and its implementations, with exciting work on more efficient use of the kernel TCP stack in SPDK 19.10 and SPDK 20.01.
However, there is no active involvement at this point around the VPP implementation of the socket abstraction in SPDK. Contributions in this area are always welcome. If you're interested in implementing further enhancements to the VPP and SPDK integration, feel free to reply or to use one of the many SPDK community communication channels<https://spdk.io/community/>.
Using SPDK 19.07, NVMe target over TCP, in a multi-process environment.
The zone name that is reserved for nvme target is not unique (i.e.
1. Does it mean that two nvme target processes running as secondary cannot
run simultaneously?
2. In the case of only one secondary as nvme target, if the nvme target
process exits unexpectedly, it looks like it will not be able to create the
memory zone, because it already exists from a previous run.
Am I right? And if yes, is there any simple workaround?
I'm trying to create mock spdk_nvme_ctrlr structs for use in my unit tests, and can see this is done in spdk/test/unit/lib/nvme/nvme.c/nvme_ut.c by including nvme/nvme.c, which indirectly gives access to the underlying definition of spdk_nvme_ctrlr in lib/nvme/nvme_internal.h. My unit test source is outside of the SPDK unit test scaffolding, so I'm getting the expected "incomplete type" messages when I try to allocate memory with the size of the type. I haven't been able to figure out a way of including nvme_internal.h to get access to the underlying type. Any suggestions?
Does it make sense to ./configure --with-rdma --enable-lto --enable-debug? I'm trying to get debug logging along with some of my testing, but
when I combine these two options, I get the following make errors:
/tmp/cczVeafv.ltrans0.ltrans.o: In function `spdk_bdev_io_get_scsi_status':
<artificial>:(.text+0x9bbb): undefined reference to `spdk_scsi_nvme_translate'
collect2: error: ld returned 1 exit status
spdk/mk/spdk.unittest.mk:71: recipe for target 'bdev_ut' failed
make: *** [bdev_ut] Error 1
spdk/mk/spdk.subdirs.mk:44: recipe for target 'bdev.c' failed
make: *** [bdev.c] Error 2
spdk/mk/spdk.subdirs.mk:44: recipe for target 'mt' failed
make: *** [mt] Error 2
spdk/mk/spdk.subdirs.mk:44: recipe for target 'bdev' failed
make: *** [bdev] Error 2
spdk/mk/spdk.subdirs.mk:44: recipe for target 'lib' failed
make: *** [lib] Error 2
spdk/mk/spdk.subdirs.mk:44: recipe for target 'unit' failed
make: *** [unit] Error 2
spdk/mk/spdk.subdirs.mk:44: recipe for target 'test' failed
make: *** [test] Error 2
If I leave out --enable-debug, the make succeeds.
SPDK version: v19.04.1
I tried to use spdk_pci_device_map_bar to get the mapped virtual address of a BAR, but the obtained value is 0. The physical address and size are correct.
If I use uio_pci_generic instead, the same code can run without problem. Is this a bug?
So, the engineers in our group think SPDK/DPDK/NVMe is the right solution for the performance storage component on the system we are designing. That's not the same thing as evidence, or a convincing argument with some simple numbers that management (and the prospect) can understand. I've now discovered the FIO tool and the "plugin" for NVMe in SPDK and it looks like the right objective performance measurement tool for the job. (I wish I'd found FIO for a post-mortem of a rather disastrous storage solution we took way too long to analyze. IOmeter finally showed up the problem, which was never solved.)
So we would like to be able to walk management and the prospect through the world before and the world after SPDK and be able to demonstrate features and benefits using FIO.
Here's the walk:
1 [X] FIO => NVMe native block device
2 [X] FIO => "plugin" SPDK NVMe
3 [ ] FIO => "block device emulation" SPDK NVMe
4 [ ] FIO => Host NVMe-oF <=network=> Target NVMe-oF SPDK NVMe
1 - basic performance of any given NVMe device
2 - performance advantage of SPDK for NVMe device(s)
3 - non-network overhead of SPDK as a means to NVMe device(s)
4 - network-connected storage accelerated by NVMe
So I think I understand cases 1 and 2. The 'X' means I think I've actually done it. I think I understand case 4 (waiting for the rest of the networking hardware to show up).
I suspect that there is some way to present the SPDK NVMe device(s) as a local block device so that I might be able to do case 3, but I just have not found anything that states plainly whether this is supported or unsupported.
5 - Soft-RoCE-based NVMe-oF implementation to work through software issues when hardware is not available
In my shop the RDMA stuff is rare as hen's teeth, and getting a system with NVMe and RDMA in the same chassis is rarer still. I have a bucketload of software issues to understand that don't involve the hardware. Has anybody danced Soft-RoCE around with SPDK and NVMe-oF? (It's a wonderful way to get your head wrapped around RDMA programming when you just can't get the right hardware to play with.)
I have a need to be able to run SPDK in a performance mode and also check for "health" especially package temperatures as the device is running.
I was very frustrated that I could not run smartctl concurrently with SPDK to get device information on NVMe devices. Building a performance system without being able to determine health is a non-starter.
I got my mind right this morning and I thought I'd share:
* In "examples/nvme", there exists "identify" which provides much of the "health" information one would usually get from "smartctl".
* If you naively attempt to run "identify" concurrently with "perf", you will discover that you can run one or the other, but not both, complaining about "claiming" a device.
* If you look at the command options, you will find "shared memory ID", typically "-i ID", which indicates a shared memory ID that multiple processes can access concurrently. You can now run "perf -i ID ..." and then run "identify -i ID ..." and, for instance, watch the temperature on the packages rise over time.
* If you look at the code for "nvme/hello_world", you will find that spdk_env_opts has a field "shm_id". This is apparently what gets populated from the above "-i ID" options on the command line of these other examples. If you fix up "hello_world" to set shm_id = -1 (default - no shared memory), then capture an option and update this field to the ID value, you will be able to get "hello_world" to work along with "perf" and/or "identify".
* hello_world could be a place to make a simpler temperature sensor (using _HEALTH_ message as the data source), or to include health sensing in a larger application.
* This process still gets blivits in the involved processes. I haven't figured this out [yet].
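The "hello_world" fix-up described above might look roughly like the following sketch. It is self-contained C that does not link against SPDK: struct demo_env_opts and demo_parse_shm_id() are hypothetical names of my own, standing in for the shm_id field of spdk_env_opts and for whatever option parsing the example would really use.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the one spdk_env_opts field discussed above. */
struct demo_env_opts {
	int shm_id; /* -1 means no shared memory ID */
};

/* Scan argv for "-i ID" and store ID in opts->shm_id.
 * Leaves shm_id at -1 when no -i option is present.
 * Returns 0 on success, -1 if -i has a missing or malformed value. */
int
demo_parse_shm_id(int argc, char **argv, struct demo_env_opts *opts)
{
	opts->shm_id = -1;

	for (int i = 1; i < argc; i++) {
		char *end = NULL;
		long id;

		if (strcmp(argv[i], "-i") != 0) {
			continue;
		}
		if (i + 1 >= argc) {
			return -1; /* -i given without a value */
		}
		errno = 0;
		id = strtol(argv[i + 1], &end, 10);
		if (errno != 0 || end == argv[i + 1] || *end != '\0' ||
		    id < 0 || id > INT_MAX) {
			return -1; /* not a non-negative decimal integer */
		}
		opts->shm_id = (int)id;
		i++; /* skip the value just consumed */
	}
	return 0;
}
```

In the real example the parsed value would be copied into the shm_id field after spdk_env_opts_init() and before the environment is initialized, using the same ID that was passed to "perf" or "identify".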
I am encountering some issues with multi process and nvme. Below is the
Has anyone experienced this kind of issue? What am I doing wrong?
* 4 nvme pci devices (addresses - 0000:00:0b.0, 0000:00:0c.0, 0000:00:0d.0,
* two perf processes, one as primary and one as secondary
first process to run (as primary), probe and access all 4 available nvme
./perf -q 1 -o 4096 -w randread -c 0x1 -t 360 -i 1
Second process to run (secondary), probe only two of the 4 devices:
./perf -r 'trtype:PCIe traddr:0000:00:0b.0' -r 'trtype:PCIe traddr:0000:00:0c.0' -q 8 -o 131072 -w write -c 0x10 -t 60 -i 1
The secondary crashes with a segmentation fault
Scenarios that do work:
* When running the secondary to probe all devices, it works fine
i.e. ./perf -r -q 8 -o 131072 -w write -c 0x10 -t 60 -i 1
* When running only one process with the PCI device list, it works fine as well
(i.e. ./perf -r 'trtype:PCIe traddr:0000:00:0b.0' -r 'trtype:PCIe traddr:0000:00:0c.0' -q 8 -o 131072 -w write -c 0x10 -t 60 -i 1)