SPDK errors
by Santhebachalli Ganesh
Folks,
My name is Ganesh, and I am working on NVMe-oF performance metrics using
SPDK (and the kernel).
I would appreciate your expert insights.
Most of the time, I observe errors when the queue depth (QD) in perf is
increased to 64 or higher; sometimes they occur even at QD <= 16.
The errors are not consistent.
Attached are some details.
Please let me know if you have any additional questions.
Thanks.
-Ganesh
4 years, 7 months
Shared library build
by Jonas Pfefferle1
Hi,
Is there a way to build SPDK as a shared library? If not, are there any
plans to support this?
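For context, what I am after is something along the lines of folding the
static archives into a single shared object by hand, e.g. (a rough sketch
only, assuming the objects are compiled position-independent; the archive
names and paths are illustrative):
# fold selected SPDK static archives into one shared object (sketch)
gcc -shared -o libspdk.so \
    -Wl,--whole-archive build/lib/libspdk_nvme.a build/lib/libspdk_util.a \
    -Wl,--no-whole-archive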
Regards,
Jonas
4 years, 10 months
PCI hotplug and SPDK
by Oza Oza
Hi All,
PCI hotplug support requires creating the root bus and probing it before
any PCIe configuration can proceed.
This means the following APIs are not called:
pci_stop_root_bus(bus);
pci_remove_root_bus(bus);
If I then run SPDK, it crashes the system with the info below.
Note: if the disk is connected, SPDK is fine.
Otherwise it stalls the system with the following crash:
[email protected]:~# echo 2048 > /proc/sys/vm/nr_hugepages;
/usr/share/spdk/scripts/setup.sh
grep: /usr/share/spdk/scripts/../include/spdk/pci_ids.h: No such file or directory
[ 34.621325] pci 0008:00:00.0: PCI bridge to [bus 01]
[ 34.640586] pci 0000:00:00.0: bridge configuration invalid ([bus
00-00]), reconfiguring
[ 50.267056] pci 0000:00:00.0: PCI bridge to [bus 01]
[ 50.272337] pci 0001:00:00.0: bridge configuration invalid ([bus
00-00]), reconfiguring
[ 65.898762] pci 0001:00:00.0: PCI bridge to [bus 01]
[ 65.904015] pci 0006:00:00.0: bridge configuration invalid ([bus
00-00]), reconfiguring
[ 81.530437] pci 0006:00:00.0: PCI bridge to [bus 01]
[ 81.535680] pci 0007:00:00.0: bridge configuration invalid ([bus
00-00]), reconfiguring
[ 97.162103] pci 0007:00:00.0: PCI bridge to [bus 01]
[ 97.167255] Bad mode in Error handler detected on CPU6, code 0xbf000002
-- SError
[ 97.174974] Internal error: Oops - bad mode: 0 [#1] SMP
[ 97.180364] Modules linked in:
[ 97.183515] CPU: 6 PID: 2104 Comm: bash Not tainted
4.12.0-01560-gc83093d-dirty #89
[ 97.191413] Hardware name: Stingray Combo SVK w/PCIe IOMMU (BCM958742K)
(DT)
[ 97.198683] task: ffff80a163a40000 task.stack: ffff80a1612b4000
[ 97.204790] PC is at 0xffff7cbdfba8
[ 97.208387] LR is at 0xffff7cb8f288
[ 97.211983] pc : [<0000ffff7cbdfba8>] lr : [<0000ffff7cb8f288>] pstate:
20000000
[ 97.219612] sp : 0000fffffe564040
[ 97.223029] x29: 0000fffffe564040 x28: 000000001054ce60
[ 97.228509] x27: 0000000000000000 x26: 00000000004e2000
[ 97.233989] x25: 00000000004e5000 x24: 0000000000000002
[ 97.239468] x23: 0000ffff7cc63638 x22: 0000000000000002
[ 97.244947] x21: 0000ffff7cc67480 x20: 000000001054db10
[ 97.250427] x19: 0000000000000002 x18: 0000000000000000
[ 97.255906] x17: 00000000004daac8 x16: 0000000000000000
[ 97.261386] x15: 0000000000000096 x14: 0000000000000000
[ 97.266865] x13: 0000000000000000 x12: 0000000000000000
[ 97.272344] x11: 0000000000000020 x10: 0101010101010101
[ 97.277824] x9 : ffffff80ffffffc8 x8 : 0000000000000040
[ 97.283303] x7 : 0000000000000001 x6 : 0000ffff7cc669f0
[ 97.288782] x5 : 0000000000015551 x4 : 0000000000000888
[ 97.294261] x3 : 0000000000000000 x2 : 0000000000000002
[ 97.299741] x1 : 000000001054db10 x0 : 0000000000000002
[ 97.305220] Process bash (pid: 2104, stack limit = 0xffff80a1612b4000)
[ 97.311960] ---[ end trace a1f48abe30820241 ]---
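For reference, the PCI-level work the setup script does on each NVMe
controller amounts roughly to the sysfs sequence below (a sketch only; the
BDF and the userspace driver are illustrative, not taken from this board):
# unbind the controller from its kernel driver and hand it to a userspace driver (sketch)
echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
echo uio_pci_generic > /sys/bus/pci/devices/0000:01:00.0/driver_override
echo 0000:01:00.0 > /sys/bus/pci/drivers_probe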
Regards,
Oza.
4 years, 10 months
Configuring multiple NVMeoF targets using SPDK.
by Naveen Shankar
Hi All,
I have a working NVMeoF setup using SPDK, with the following configuration saved in my config file:
[Nvmf]
MaxQueuesPerSession 4
AcceptorPollRate 100
[Nvme]
TransportId "trtype:PCIe traddr:0000:06:00.0" Nvme0
[Subsystem1]
NQN nqn.2016-06.io.spdk:cnode1
Core 1
Listen RDMA 192.168.10.20:4420
#Host nqn.2016-06.io.spdk:init
SN SPDK00000000000001
Namespace Nvme0n1
Nvme 0000:06:00.0
Can anyone please explain the significance of the "Host" parameter in this file?
I also plan to configure two NVMe subsystems in a single config file, where one subsystem should be accessible only to one host and the other only to a second host.
Could anyone please guide me on how to use this "Host" parameter in the configuration file to achieve that setup?
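For concreteness, the kind of layout I have in mind is sketched below (the host NQNs, the second controller/namespace, and the per-subsystem details are placeholders; I am assuming that "Host" acts as a per-subsystem allowed-host whitelist, which is exactly what I would like to have confirmed):
[Subsystem1]
NQN nqn.2016-06.io.spdk:cnode1
Core 1
Listen RDMA 192.168.10.20:4420
Host nqn.2016-06.io.spdk:host1
SN SPDK00000000000001
Namespace Nvme0n1
[Subsystem2]
NQN nqn.2016-06.io.spdk:cnode2
Core 2
Listen RDMA 192.168.10.20:4420
Host nqn.2016-06.io.spdk:host2
SN SPDK00000000000002
Namespace Nvme1n1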
Thanks in advance.
Regards,
Naveen Shankar,
Senior Engineer @ SanDisk | a Western Digital brand,
No.143 /1, Prestige Tech Park, Prestige Excelsior Building, Marathahalli, Bengaluru, Karnataka-560103
Ext : 080 42721554
Mob: +91 9742574527
Naveen.Shankar(a)sandisk.com
4 years, 10 months
NVMeOF and multipath
by Ankit Jain
Hi
Is multipath supported with the SPDK NVMe-oF target?
If yes, how can we configure it?
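To make the question concrete: what I have in mind is exposing the same subsystem on two portals and connecting from the host over both paths, roughly as sketched below (the addresses and NQN are placeholders, and I am not sure whether multiple Listen lines per subsystem are even supported):
[Subsystem1]
NQN nqn.2016-06.io.spdk:cnode1
Listen RDMA 192.168.10.20:4420
Listen RDMA 192.168.11.20:4420
# on the initiator, one connect per path (nvme-cli)
nvme connect -t rdma -a 192.168.10.20 -s 4420 -n nqn.2016-06.io.spdk:cnode1
nvme connect -t rdma -a 192.168.11.20 -s 4420 -n nqn.2016-06.io.spdk:cnode1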
Thanks
Ankit Jain
4 years, 10 months
FW: Not able to start nvmf_tgt with NVMe device
by Ankit Jain
Hi
We are trying to set up an NVMe-oF target using SPDK, but are unable to start the "nvmf_tgt" application with an NVMe device.
Please find the steps we followed below.
[[email protected] nvmf_tgt]# ./nvmf_tgt -c ../../etc/spdk/mynvmf.conf.in >>>>>>>>>>>>>>>>> Trying to load the config file
Starting DPDK 17.05.0 initialization...
[ DPDK EAL parameters: nvmf -c 0x1 --file-prefix=spdk_pid19063 ]
EAL: Detected 6 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Total cores available: 1
Occupied cpu socket mask is 0x1
reactor.c: 314:_spdk_reactor_run: *NOTICE*: Reactor started on core 0 on socket 0
copy_engine_ioat.c: 306:copy_engine_ioat_init: *NOTICE*: Ioat Copy Engine Offload Enabled
EAL: PCI device 0000:07:00.0 on NUMA socket 0
EAL: probe driver: 15b7:2001 spdk_nvme
nvmf_tgt.c: 215:nvmf_tgt_create_subsystem: *NOTICE*: allocated subsystem nqn.2014-08.org.nvmexpress.discovery on lcore 0 on socket 0
nvmf_tgt.c: 215:nvmf_tgt_create_subsystem: *NOTICE*: allocated subsystem nqn.2016-06.io.spdk:cnode1 on lcore 0 on socket 0
rdma.c: 955:spdk_nvmf_rdma_create: *NOTICE*: *** RDMA Transport Init ***
rdma.c:1120:spdk_nvmf_rdma_listen: *NOTICE*: *** NVMf Target Listening on 192.168.10.20 port 4420 ***
conf.c: 492:spdk_nvmf_construct_subsystem: *ERROR*: Could not find namespace bdev 'Nvme0n1'
nvmf_tgt.c: 276:spdk_nvmf_startup: *ERROR*: spdk_nvmf_parse_conf() failed
[[email protected] nvmf_tgt]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 931.5G 0 disk
├─sda2 8:2 0 931G 0 part
│ ├─centos-swap 253:1 0 15.7G 0 lvm [SWAP]
│ ├─centos-home 253:2 0 865.3G 0 lvm /home
│ └─centos-root 253:0 0 50G 0 lvm /
└─sda1 8:1 0 500M 0 part /boot
nvme0n1 259:0 0 1.8T 0 disk
The Configuration file used is:
[Nvmf]
MaxQueuesPerSession 4
AcceptorPollRate 10000
[Nvme]
TransportId "trtype:PCIe traddr:0000:07:00.0" Nvme0
#[Malloc]
# NumberOfLuns 1
# LunSizeInMB 512
[Subsystem1]
NQN nqn.2016-06.io.spdk:cnode1
Core 0
#Mode Direct
Listen RDMA 192.168.10.20:4420
#Host nqn.2016-06.io.spdk:init
SN SPDK00000000000001
Namespace Nvme0n1
#NVMe 0000:07:00.0
Can anyone please let me know what went wrong with this configuration when an NVMe device is used?
Please note that it works fine when Malloc devices are configured instead of NVMe devices.
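For reference, my (possibly incorrect) understanding of how the sections are meant to line up is sketched below: the controller name at the end of the TransportId line ("Nvme0") is what should give rise to per-namespace bdevs such as "Nvme0n1", which the subsystem then references by name.
# my assumption: "Nvme0" in [Nvme] yields bdevs Nvme0n1, Nvme0n2, ...
# which [Subsystem1] refers to by name
[Nvme]
TransportId "trtype:PCIe traddr:0000:07:00.0" Nvme0
[Subsystem1]
Namespace Nvme0n1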
Thanks
Ankit Jain
4 years, 10 months
SPDK errors
by Santhebachalli Ganesh
PS: Setup info is provided further below.
Most of the time I get errors when the queue depth (QD) in perf is
increased to 64 or higher; sometimes even at QD <= 16.
The errors are not consistent; a sample is provided here.
I hope someone can provide some insights.
Thanks,
Ganesh
--- initiator cmd line
sudo ./perf -q 32 -s 512 -w randread -t 30 -r 'trtype:RDMA adrfam:IPv4
traddr:1.1.1.80 trsvcid:4420 subnqn:nqn.2016-06.io.spdk:cnode1' -c 0x2
--- errors on stdout on target
Aug 24 17:14:09 dell730-80 nvmf[38006]:
nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not
map to outstanding cmd
Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c:
284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:201
cdw0:0 sqhd:0094 p:0 m:0 dnr:0
Aug 24 17:14:09 dell730-80 nvmf[38006]:
nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not
map to outstanding cmd
Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c:
284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:201
cdw0:0 sqhd:0094 p:0 m:0 dnr:0
Aug 24 17:14:09 dell730-80 nvmf[38006]:
nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not
map to outstanding cmd
Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c:
284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:198
cdw0:0 sqhd:0094 p:0 m:0 dnr:0
Aug 24 17:14:09 dell730-80 nvmf[38006]:
nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not
map to outstanding cmd
Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c:
284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:222
cdw0:0 sqhd:0094 p:0 m:0 dnr:0
Aug 24 17:14:09 dell730-80 nvmf[38006]:
nvme_pcie.c:1910:nvme_pcie_qpair_process_completions: *ERROR*: cpl does not
map to outstanding cmd
Aug 24 17:14:09 dell730-80 nvmf[38006]: nvme_qpair.c:
284:nvme_qpair_print_completion: *NOTICE*: SUCCESS (00/00) sqid:1 cid:222
cdw0:0 sqhd:0094 p:0 m:0 dnr:0
Aug 24 17:14:09 dell730-80 nvmf[38006]:
bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
Aug 24 17:14:09 dell730-80 nvmf[38006]:
bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
Aug 24 17:14:09 dell730-80 nvmf[38006]:
bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
Aug 24 17:14:09 dell730-80 nvmf[38006]:
bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
Aug 24 17:14:09 dell730-80 nvmf[38006]:
bdev_nvme.c:1248:bdev_nvme_queue_cmd: *ERROR*: readv failed: rc = -12
Aug 24 17:14:13 dell730-80 nvmf[38006]: rdma.c:1622:spdk_nvmf_rdma_poll:
*ERROR*: CQ error on CQ 0x7f8a3803cae0, Request 0x140231622050400 (12):
transport retry counter exceeded
--- errors seen on client
nvme_rdma.c:1470:nvme_rdma_qpair_process_completions: *ERROR*: CQ error on
Queue Pair 0x1fdb580, Response Index 33408520 (13): RNR retry counter
exceeded
nvme_rdma.c:1470:nvme_rdma_qpair_process_completions: *ERROR*: CQ error on
Queue Pair 0x1fdb580, Response Index 33408016 (5): Work Request Flushed
Error
nvme_rdma.c:1470:nvme_rdma_qpair_process_completions: *ERROR*: CQ error on
Queue Pair 0x1fdb580, Response Index 14 (5): Work Request Flushed Error
nvme_rdma.c:1470:nvme_rdma_qpair_process_completions: *ERROR*: CQ error on
Queue Pair 0x1fdb580, Response Index 15 (5): Work Request Flushed Error
--- Some info on setup
Same HW/SW on target and initiator.
[email protected]:~> hostnamectl
Static hostname: dell730-80
Icon name: computer-server
Chassis: server
Machine ID: b5abb0fe67afd04c59521c40599b3115
Boot ID: f825aa6338194338a6f80125caa836c7
Operating System: openSUSE Leap 42.3
CPE OS Name: cpe:/o:opensuse:leap:42.3
Kernel: Linux 4.12.8-1.g4d7933a-default
Architecture: x86-64
[email protected]:~> lscpu | grep -i socket
Core(s) per socket: 12
Socket(s): 2
2 MB and/or 1 GB hugepages set.
Latest SPDK/DPDK from their respective Git repositories, compiled with the
RDMA flag.
nvmf.conf file (I have played around with the values):
reactor mask 0x5555
AcceptorCore 2
1-3 subsystems on cores 4, 8, 10
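Roughly, the conf looks like the sketch below (reconstructed from the notes
above; the queue-depth options, SN, and namespace name are illustrative
rather than verbatim from my file, and option names may differ between SPDK
versions):
[Global]
ReactorMask 0x5555
[Nvmf]
AcceptorCore 2
MaxQueuesPerSession 4
MaxQueueDepth 128
[Subsystem1]
NQN nqn.2016-06.io.spdk:cnode1
Core 4
Listen RDMA 1.1.1.80:4420
SN SPDK00000000000001
Namespace Nvme0n1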
[email protected]:~> sudo gits/spdk/app/nvmf_tgt/nvmf_tgt -c
gits/spdk/etc/spdk/nvmf.conf -p 6
PCIe NVMe cards (16GB):
[email protected]:~> sudo lspci | grep -i pmc
04:00.0 Non-Volatile memory controller: PMC-Sierra Inc. Device f117 (rev 06)
06:00.0 Non-Volatile memory controller: PMC-Sierra Inc. Device f117 (rev 06)
85:00.0 Non-Volatile memory controller: PMC-Sierra Inc. Device f117 (rev 06)
Network cards: (latest associated FW from vendor)
[email protected]:~> sudo lspci | grep -i connect
05:00.0 Ethernet controller: Mellanox Technologies MT27520 Family
[ConnectX-3 Pro]
4 years, 10 months