Hi Hugh,
Delete the TransportID lines from the [Nvme] section and try again.
The entries in that section tell the blockdev layer to claim the devices, but your
subsystems are also configured in Direct mode, which claims the devices directly and
bypasses the blockdev layer. Both paths are trying to claim the same controllers. I realize
that's probably clear as mud, but we're thinking through some major changes to the nvmf
library right now that will hopefully make it a lot clearer.
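
To make the overlap concrete, here is a minimal sketch of a standalone check. It is not
part of SPDK. The function name find_double_claims and its parsing rules are assumptions
made for illustration, and it only understands the few keys that appear in the nvmf.conf
excerpt quoted in the original message. It reports every PCI address that is listed on a
TransportID line in [Nvme] and is also used by a subsystem in Direct mode.

```python
import re


def find_double_claims(conf_text):
    """Return (subsystem, pci_address) pairs claimed by both [Nvme] and a Direct subsystem.

    Hypothetical helper for illustration only; it is not an SPDK API.
    """
    section = None
    blockdev_claims = set()   # addresses from TransportID lines in [Nvme]
    subsystems = {}           # section name -> {"mode": str, "addrs": [str]}

    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue

        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            section = header.group(1)
            if section.startswith("Subsystem"):
                subsystems[section] = {"mode": "", "addrs": []}
            continue

        if section == "Nvme" and line.startswith("TransportID"):
            addr = re.search(r"traddr:([0-9A-Fa-f:.]+)", line)
            if addr:
                blockdev_claims.add(addr.group(1).lower())
        elif section in subsystems:
            parts = line.split()
            if parts[0] == "Mode" and len(parts) > 1:
                subsystems[section]["mode"] = parts[1].lower()
            elif parts[0] == "NVMe" and len(parts) > 1:
                subsystems[section]["addrs"].append(parts[1].lower())

    conflicts = []
    for name, sub in subsystems.items():
        if sub["mode"] == "direct":
            conflicts += [(name, a) for a in sub["addrs"] if a in blockdev_claims]
    return conflicts


# Shortened sample modelled on the config in the original message.
sample = '''
[Nvme]
TransportID "trtype:PCIe traddr:0000:05:00.0" Nvme0
TransportID "trtype:PCIe traddr:0000:06:00.0" Nvme1

[Subsystem0]
NQN nqn.2016-06.io.spdk:cnode0
Mode Direct
NVMe 0000:06:00.0
'''

print(find_double_claims(sample))
```

With the TransportID lines removed from [Nvme], the same check returns an empty list,
which is the state the suggested fix aims for.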
Thanks,
Ben
-------- Original message --------
From: Hugh Daschbach <hugh.daschbach(a)enmotus.com>
Date: 5/18/17 7:52 PM (GMT-07:00)
To: spdk(a)lists.01.org
Subject: [SPDK] Could not find NVMe controller at PCI address 0000:06:00.0
I'm having issues getting app/nvmf_tgt/nvmf_tgt running. I've
configured four of five local NVMe devices in nvmf.conf. The devices
are discovered during early initialization, but spdk_nvme_probe()
fails to find a previously probed device. It looks like
nvme_pci_ctrlr_scan() expects a hotplug event that it never sees.
What is supposed to trigger the hotplug event?
Here's the output of nvmf_tgt:
Starting DPDK 17.08.0-rc0 initialization...
[ DPDK EAL parameters: nvmf -c 0x5555 --file-prefix=spdk_pid184703 ]
EAL: Detected 72 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Debug dataplane logs available - lower performance
EAL: Probing VFIO support...
EAL: VFIO support initialized
Occupied cpu socket mask is 0x1
Ioat Copy Engine Offload Enabled
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: using IOMMU type 1 (Type 1)
EAL: PCI device 0000:07:00.0 on NUMA socket 0
EAL: PCI device 0000:08:00.0 on NUMA socket 0
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: Releasing pci mapped resource for 0000:84:00.0
EAL: Calling pci_unmap_resource for 0000:84:00.0 at 0x7f194ae10000
Reactor started on core 2 on socket 0
Reactor started on core 4 on socket 0
Reactor started on core 6 on socket 0
Reactor started on core 8 on socket 0
Reactor started on core 10 on socket 0
Reactor started on core 12 on socket 0
Reactor started on core 14 on socket 0
Reactor started on core 0 on socket 0
*** RDMA Transport Init ***
allocated subsystem nqn.2014-08.org.nvmexpress.discovery on lcore 0 on socket 0
allocated subsystem nqn.2016-06.io.spdk:cnode0 on lcore 2 on socket 0
Total cores available: 8
EAL: PCI device 0000:06:00.0 on NUMA socket 0
conf.c: 578:spdk_nvmf_construct_subsystem: ***ERROR*** Could not find NVMe controller at PCI address 0000:06:00.0
nvmf_tgt.c: 279:spdk_nvmf_startup: ***ERROR*** spdk_nvmf_parse_conf() failed
EAL: Releasing pci mapped resource for 0000:06:00.0
EAL: Calling pci_unmap_resource for 0000:06:00.0 at 0x7f194ae00000
EAL: Releasing pci mapped resource for 0000:07:00.0
EAL: Calling pci_unmap_resource for 0000:07:00.0 at 0x7f194ae04000
EAL: Releasing pci mapped resource for 0000:08:00.0
EAL: Calling pci_unmap_resource for 0000:08:00.0 at 0x7f194ae08000
EAL: Releasing pci mapped resource for 0000:09:00.0
EAL: Calling pci_unmap_resource for 0000:09:00.0 at 0x7f194ae0c000
Select bits from nvmf.conf:
[Nvme]
TransportID "trtype:PCIe traddr:0000:05:00.0" Nvme0
TransportID "trtype:PCIe traddr:0000:06:00.0" Nvme1
TransportID "trtype:PCIe traddr:0000:07:00.0" Nvme2
TransportID "trtype:PCIe traddr:0000:08:00.0" Nvme3
TransportID "trtype:PCIe traddr:0000:09:00.0" Nvme4
[Subsystem0]
NQN nqn.2016-06.io.spdk:cnode0
Core 2
Mode Direct
Listen RDMA 192.168.10.1:4420
SN SPDK00000000000001
NVMe 0000:06:00.0
[Subsystem1]
NQN nqn.2016-06.io.spdk:cnode1
Core 4
Mode Direct
Listen RDMA 192.168.10.1:4420
SN SPDK00000000000002
NVMe 0000:07:00.0
[Subsystem2]
NQN nqn.2016-06.io.spdk:cnode2
Core 6
Mode Direct
Listen RDMA 192.168.10.1:4420
SN SPDK00000000000003
NVMe 0000:08:00.0
[Subsystem3]
NQN nqn.2016-06.io.spdk:cnode3
Core 8
Mode Direct
Listen RDMA 192.168.10.1:4420
SN SPDK00000000000004
NVMe 0000:09:00.0
and lspci output following the error:
# lspci -v -s 0:6:0.0
06:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 171X (rev 03) (prog-if 02 [NVM Express])
Subsystem: Dell Express Flash NVMe XS1715 SSD 800GB
Physical Slot: 180
Flags: fast devsel, IRQ 31, NUMA node 0
Memory at 9a400000 (64-bit, non-prefetchable) [disabled] [size=16K]
Capabilities: [c0] Power Management version 3
Capabilities: [c8] MSI: Enable- Count=1/32 Maskable+ 64bit+
Capabilities: [e0] MSI-X: Enable- Count=129 Masked-
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [40] Vendor Specific Information: Len=24 <?>
Capabilities: [100] Advanced Error Reporting
Capabilities: [180] #19
Capabilities: [150] Vendor Specific Information: ID=0001 Rev=1 Len=02c <?>
Kernel driver in use: vfio-pci
Kernel modules: nvme
The above is from branch master (both SPDK and DPDK). I've tested v17.03 with similar
results. What am I missing?
Thanks,
Hugh