We are trying to use NBD with SPDK on the client side. The data path looks like this:
File System ----> NBD client ----> SPDK ----> NVMe-oF
Currently we are seeing high latency, on the order of 50 us, on this path. There appears to be a data buffer copy for write commands from kernel to user space when SPDK's NBD backend reads data from the NBD socket.
I think there are two possible ways to prevent the data copy:
1. Memory-map the kernel buffers into SPDK's virtual address space. I am not sure whether it is possible to mmap such a buffer, or what the cost of calling mmap for each I/O would be.
2. Have the NBD kernel driver provide the physical address of the buffer, and have SPDK use that address to DMA the data to the NVMe-oF target. SPDK must already be translating virtual addresses to physical addresses before sending data over NVMe-oF.
Option 2 makes more sense to me. Please let me know whether option 2 is feasible in SPDK.
Looking to support native NVMe multipathing on Linux, I am reviewing the specification's requirements for controller IDs.
Our system is a distributed system exposing the same logical devices through multiple physical hosts, each running its own SPDK instance.
Looking at the code (v19.04), I see that controller IDs are generated serially in the 0-0xFFF0 range and verified to be unique within the subsystem before the value is returned to the controller. Generating controller IDs serially this way means that different nodes will probably produce the same controller ID, so the host may identify a new controller as one that already exists.
This means I need to either limit the controller ID range per SPDK instance and remain spec-aligned, or expose a different subsystem per physical host, which solves the controller ID issue but does not conform to the spec...
I looked at the namespace ID section in NVMe 1.4, and there doesn't seem to be any mention of world-wide uniqueness, so it seems the correct implementation is to limit the controller ID range. Would an API to limit the controller ID range in SPDK be acceptable?
Do you know of any work being done on namespace sharing between subsystems, or on world-wide unique namespace IDs?
I am comparing the Discovery Log Page implementation against the spec, and I noticed there is a Trello task on this issue ("Audit and make fully spec compliant the implementation of the Discovery Log Page"). I have some questions:
1. TREQ (byte 03): we don't need a secure channel, and in our system this field is set to "not specified" (0).
Do we need to change it to "not required" (2)?
What is the difference between these two options?
2. I compared the log page entry struct with the log page entry in the spec, and they match.
In addition, I checked that the values are valid on my machine.
What is required to close this task?
I was looking into the bdev I/O code path and found the in_submit_request flag, which seems to be set every time we do an I/O submission; that makes sense given the name of the flag. But in the function _spdk_bdev_io_submit(), in_submit_request is set before the conditional "if" statement, and depending on the conditions, that statement can also call *_io_complete(). So, should in_submit_request still be true while the *_io_complete() function is called here?
Also, can someone shed some more light on the significance of this flag?
Md Haris Iqbal,
Contact: +91 8861996962
I've just made another round of cleaning up old and outdated Trello cards. I might have asked you to update some items, so check your Trello notifications. :)
I also started moving some cards around. Eventually we would like to get rid of the "Things to do" board, since it's far too packed. Cards related to blobstore should go to the blobstore board, iSCSI cards to the iSCSI board, and so on. Not every SPDK component has its own Trello board, though; I think cards for such components should go into Miscellaneous Backlog for now, and if it later turns out there are too many of them, we'll create a separate board. This helps us ensure we don't have stale, empty boards that just make the active boards harder to find.
Following the instructions at:
(Which are not completely correct; there are some problems with interdependencies between the FC library and the DPDK libraries.)
I'm seeing the following compile-time errors. Is this caused by the compiler I'm using?
src/spdk_nvmf_xport.c: In function ‘nvmf_fc_fill_sgl’:
src/spdk_nvmf_xport.c:2410:8: error: taking address of packed member of ‘struct spdk_nvmf_fc_rq_buf_nvme_cmd’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
2410 | sge = &req_buf->sge;
cc1: all warnings being treated as errors
make: *** [Makefile:122: src/spdk_nvmf_xport.o] Error 1
ssan-rx2560-01:fc(master) > gcc -v
Using built-in specs.
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.1.1 20190503 (Red Hat 9.1.1-1) (GCC)
Does the bdev fio plugin support replaying fio trace with the --read_iolog flag?
I encountered the following runtime error when using this feature:
Starting 1 thread
fio: pid=9537, err=12/file:memory.c:333, func=iomem allocation, error=Cannot allocate memory
The trace file I tried to replay was generated by running the fio plugin with the --write_iolog flag and with <spdk dir>/examples/bdev/fio_plugin/example_config.fio.
The target bdev is an NVMe drive, specified in bdev.conf.in as follows:
TransportID "trtype:PCIe traddr:0000:0b:00.0" Nvme0
Trace replay works if I directly use fio v3.3 without the plugin.
I wonder if this is a limitation of the plugin. If so, how can I modify it to enable this feature?
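For reference, the replay job I am attempting has roughly this shape (the iolog path and filename are illustrative; ioengine and spdk_conf follow the plugin's usage):

```ini
[global]
ioengine=spdk_bdev
spdk_conf=./bdev.conf.in
thread=1

[replay]
filename=Nvme0n1
read_iolog=./trace.log
```

The err=12 (ENOMEM) from "iomem allocation" suggests the replay path is trying to allocate I/O memory in a way the plugin's allocator does not satisfy, but I have not confirmed that.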