That did it.  Many thanks.  I saw the earlier probes but did not associate them with the configuration file.

Again, thanks,
Hugh

On Thu, May 18, 2017 at 9:21 PM, Walker, Benjamin <benjamin.walker@intel.com> wrote:
Hi Hugh,

Delete the TransportID lines from the [Nvme] section and try again.

The entries in that section instruct the blockdev layer to claim the devices, but your subsystems are also configured in Direct mode, which claims the devices directly and bypasses the blockdev layer. I realize that's probably clear as mud, but we're thinking through some major changes to the nvmf library right now that will hopefully make it a lot clearer.
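Roughly, the fix looks like this (a sketch based on the config you posted, showing only the first subsystem; nothing changes except the [Nvme] section):

```
[Nvme]
  # TransportID lines removed: in Direct mode the [Subsystem*] entries
  # below claim the PCIe devices themselves, so the blockdev layer
  # must not claim them first.

[Subsystem0]
  NQN nqn.2016-06.io.spdk:cnode0
  Core 2
  Mode Direct
  Listen RDMA 192.168.10.1:4420
  SN SPDK00000000000001
  NVMe 0000:06:00.0
```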

Thanks, 
Ben



Sent from my Sprint Samsung Galaxy S7.


-------- Original message --------
From: Hugh Daschbach <hugh.daschbach@enmotus.com>
Date: 5/18/17 7:52 PM (GMT-07:00)
Subject: [SPDK] Could not find NVMe controller at PCI address 0000:06:00.0

I'm having issues getting app/nvmf_tgt/nvmf_tgt running.  I've
configured four of five local NVMe devices in nvmf.conf.  The devices
are discovered during early initialization, but spdk_nvme_probe()
fails to find a previously probed device.  It looks like
nvme_pci_ctrlr_scan() expects a hotplug event that it never sees.
What is supposed to trigger the hotplug event?

Here's the output of nvmf_tgt:

Starting DPDK 17.08.0-rc0 initialization...
[ DPDK EAL parameters: nvmf -c 0x5555 --file-prefix=spdk_pid184703 ]
EAL: Detected 72 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Debug dataplane logs available - lower performance
EAL: Probing VFIO support...
EAL: VFIO support initialized
Occupied cpu socket mask is 0x1
Ioat Copy Engine Offload Enabled
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL:   using IOMMU type 1 (Type 1)
EAL: PCI device 0000:07:00.0 on NUMA socket 0
EAL: PCI device 0000:08:00.0 on NUMA socket 0
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL: PCI device 0000:84:00.0 on NUMA socket 1
EAL: Releasing pci mapped resource for 0000:84:00.0
EAL: Calling pci_unmap_resource for 0000:84:00.0 at 0x7f194ae10000
Reactor started on core 2 on socket 0
Reactor started on core 4 on socket 0
Reactor started on core 6 on socket 0
Reactor started on core 8 on socket 0
Reactor started on core 10 on socket 0
Reactor started on core 12 on socket 0
Reactor started on core 14 on socket 0
Reactor started on core 0 on socket 0
*** RDMA Transport Init ***
allocated subsystem nqn.2014-08.org.nvmexpress.discovery on lcore 0 on socket 0
allocated subsystem nqn.2016-06.io.spdk:cnode0 on lcore 2 on socket 0
Total cores available: 8
EAL: PCI device 0000:06:00.0 on NUMA socket 0
conf.c: 578:spdk_nvmf_construct_subsystem: ***ERROR*** Could not find NVMe controller at PCI address 0000:06:00.0
nvmf_tgt.c: 279:spdk_nvmf_startup: ***ERROR*** spdk_nvmf_parse_conf() failed
EAL: Releasing pci mapped resource for 0000:06:00.0
EAL: Calling pci_unmap_resource for 0000:06:00.0 at 0x7f194ae00000
EAL: Releasing pci mapped resource for 0000:07:00.0
EAL: Calling pci_unmap_resource for 0000:07:00.0 at 0x7f194ae04000
EAL: Releasing pci mapped resource for 0000:08:00.0
EAL: Calling pci_unmap_resource for 0000:08:00.0 at 0x7f194ae08000
EAL: Releasing pci mapped resource for 0000:09:00.0
EAL: Calling pci_unmap_resource for 0000:09:00.0 at 0x7f194ae0c000


Select bits from nvmf.conf:
[Nvme]
TransportID "trtype:PCIe traddr:0000:05:00.0" Nvme0
TransportID "trtype:PCIe traddr:0000:06:00.0" Nvme1
TransportID "trtype:PCIe traddr:0000:07:00.0" Nvme2
TransportID "trtype:PCIe traddr:0000:08:00.0" Nvme3
TransportID "trtype:PCIe traddr:0000:09:00.0" Nvme4

[Subsystem0]
  NQN nqn.2016-06.io.spdk:cnode0
  Core 2
  Mode Direct
  Listen RDMA 192.168.10.1:4420
  SN SPDK00000000000001
  NVMe 0000:06:00.0
[Subsystem1]
  NQN nqn.2016-06.io.spdk:cnode1
  Core 4
  Mode Direct
  Listen RDMA 192.168.10.1:4420
  SN SPDK00000000000002
  NVMe 0000:07:00.0
[Subsystem2]
  NQN nqn.2016-06.io.spdk:cnode2
  Core 6
  Mode Direct
  Listen RDMA 192.168.10.1:4420
  SN SPDK00000000000003
  NVMe 0000:08:00.0
[Subsystem3]
  NQN nqn.2016-06.io.spdk:cnode3
  Core 8
  Mode Direct
  Listen RDMA 192.168.10.1:4420
  SN SPDK00000000000004
  NVMe 0000:09:00.0

and lspci output following the error:

[root@localhost spdk]# lspci -v -s 0:6:0.0
06:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 171X (rev 03) (prog-if 02 [NVM Express])
        Subsystem: Dell Express Flash NVMe XS1715 SSD 800GB
        Physical Slot: 180
        Flags: fast devsel, IRQ 31, NUMA node 0
        Memory at 9a400000 (64-bit, non-prefetchable) [disabled] [size=16K]
        Capabilities: [c0] Power Management version 3
        Capabilities: [c8] MSI: Enable- Count=1/32 Maskable+ 64bit+
        Capabilities: [e0] MSI-X: Enable- Count=129 Masked-
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [40] Vendor Specific Information: Len=24 <?>
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [180] #19
        Capabilities: [150] Vendor Specific Information: ID=0001 Rev=1 Len=02c <?>
        Kernel driver in use: vfio-pci
        Kernel modules: nvme

The above is from the master branch (of both SPDK and DPDK).  I've tested v17.03 with similar results.  What am I missing?

Thanks,
Hugh

_______________________________________________
SPDK mailing list
SPDK@lists.01.org
https://lists.01.org/mailman/listinfo/spdk