I have set up an SPDK lvol as a data disk and run QEMU with it, as the
vhost documentation instructs.
Then I exposed an empty lvol to the host through nbd and used `dd` to copy
an existing raw disk image, which contains a guest OS, into the lvol. But
the guest was not able
Was there any step that I missed?

I'm using SPDK for local NVMe devices through the nvme interface. I find
that physical disk removals are not handled properly for my use case, and I
wonder whether others see it that way as well and whether there is any
intention to fix this.
Our system uses long-running processes that each control one or more disks
at a time. If a disk fails, it may drop off the PCIe bus completely, and it
looks the same if the disk is physically removed (say, a technician pulls
the wrong disk when replacing one).
The problem I see is that SPDK doesn't account for a device disappearing
from the bus completely. It will try to release the I/O qpair by sending
the Delete I/O SQ and Delete I/O CQ commands. Neither of these will ever
get an answer (the device is no longer on the PCIe bus), and there is no
timeout logic in that code path. This means two things: the process hangs
forever, and there is an effective memory leak, which currently means that
we need to restart the process. Our system is resilient enough that
restarting the process is not a big deal, but it is a very messy way to go
about handling a physical drive removal.
Have others seen this behavior? Does it bother others?
For my own use I put a timeout in there of a few seconds and that solves it
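
To illustrate the idea, here is a minimal sketch of a bounded wait. It is
not the actual change and it does not use any SPDK API: `poll_fn`,
`wait_with_timeout`, `never_completes` and `completes_after` are
hypothetical names made up for this example. The caller polls for the
completion of an admin command (such as Delete I/O SQ) and gives up once a
deadline passes, instead of spinning forever on a device that is gone.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback: process completions once and return true
 * when the awaited command has completed. */
typedef bool (*poll_fn)(void *ctx);

/* Monotonic clock in milliseconds, unaffected by wall-clock changes. */
static uint64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/* Poll until the command completes or timeout_ms elapses.
 * Returns 0 on completion, -1 on timeout. This busy-polls; real code
 * might pause briefly between polls. */
static int wait_with_timeout(poll_fn poll, void *ctx, uint64_t timeout_ms)
{
    uint64_t deadline = now_ms() + timeout_ms;

    for (;;) {
        if (poll(ctx)) {
            return 0;
        }
        if (now_ms() >= deadline) {
            return -1;
        }
    }
}

/* Stand-in for a device that has dropped off the bus: never answers. */
static bool never_completes(void *ctx)
{
    (void)ctx;
    return false;
}

/* Stand-in for a healthy device: completes after *ctx polls. */
static bool completes_after(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        --*remaining;
    }
    return *remaining == 0;
}
```

With something like this, a timeout return on the delete path can be
treated as "device is gone", so the caller can free the qpair resources
locally rather than wait for an answer that will never arrive.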
Baruch Even, Software Developer E baruch(a)weka.io <liran(a)weka.io>