On Wed, 2018-07-18 at 19:15 +0000, Philipp Skadorov wrote:
> -----Original Message-----
> From: Walker, Benjamin [mailto:firstname.lastname@example.org]
> Sent: Tuesday, July 17, 2018 7:10 PM
> To: spdk(a)lists.01.org; Philipp Skadorov <Philipp.Skadorov(a)wdc.com>
> Subject: Re: #416879 qp recovery: outstanding requests
> On Tue, 2018-07-17 at 22:41 +0000, Philipp Skadorov wrote:
> > Hi Benjamin,
> > I have played with SoftRoCE and run through the real SNIC IB driver
> > sources to see how it is possible to continue with the outstanding
> > requests after the QP is recovered.
> > When the QP goes into an error state (async event: IB_EVENT_QP_FATAL)
> > and drains the CQ, it sends responses back with the error code
> > IB_WC_WR_FLUSH_ERR (5), which effectively invalidates the outstanding
> > requests in SPDK.
> > It looks to me that dropping those outstanding SPDK requests and
> > freeing resources is the best way to go.
> Ok - I assume there is some handling we need to implement on the initiator
> side to deal with the IB_WC_WR_FLUSH_ERR responses. Maybe we should
> retry those on the initiator side once the RDMA queue pair recovers?
Right. Looking at assert(0), I was thinking of the request retry counter.
I could probably pick up some related tasks from the Trello board if you wish.
Thinking about this a bit more, there are a number of different cases that may
need to be handled, depending on where the command is in the state machine when
the link goes down.
1) If the link goes down while the command is being sent, presumably it never
arrives at the target and the initiator receives an IB_WC_WR_FLUSH_ERR. In this
case, the initiator should queue it and wait for the link to recover before
resending (today, it just disconnects).
2) If the link goes down during an RDMA read or write operation, the target will
presumably see an IB_WC_WR_FLUSH_ERR, recover the connection, and could continue
on (today, it will drop the request).
3) If the link goes down while the request is being processed by the storage
device, no errors are seen at the network layer related to this request. The
connection manager will notify the RDMA transport of the problem, your code will
recover the connection, and the request could complete as normal (today, it will
drop the request).
All of these depend on the initiator not immediately killing the connection when
it sees that the connection has gone down. I'll throw something up on Trello for
that task so at least the SPDK initiator can take advantage of this recovery
process. We can't do anything about other initiators, of course.
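
To make the three cases above concrete, here is a rough sketch in C of how the
decision could be expressed. Nothing in it is SPDK or libibverbs API: the enum
names, struct pending_cmd, choose_recovery(), and the retry limit of 3 are all
hypothetical. The only value taken from this thread is the flush error code 5.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for verbs completion codes. The value 5 matches the
 * flush error code quoted earlier in this thread. */
enum wc_status { WC_SUCCESS = 0, WC_WR_FLUSH_ERR = 5 };

/* Where the command was in its state machine when the link went down. */
enum cmd_phase {
    PHASE_SENDING,       /* case 1: command capsule still being sent */
    PHASE_DATA_TRANSFER, /* case 2: RDMA read or write in flight */
    PHASE_EXECUTING      /* case 3: storage device is processing it */
};

enum recovery_action {
    ACTION_NONE,              /* completion was fine, nothing to recover */
    ACTION_QUEUE_AND_RESEND,  /* case 1: hold until the QP recovers, resend */
    ACTION_RESUME_TRANSFER,   /* case 2: redo the RDMA op after recovery */
    ACTION_COMPLETE_NORMALLY, /* case 3: finish once the connection is back */
    ACTION_FAIL               /* give up and drop the request */
};

/* Assumed retry limit; not a value from SPDK. */
#define MAX_RETRIES 3u

struct pending_cmd {
    enum cmd_phase phase;
    unsigned retries; /* request retry counter */
};

/* Pick a recovery action for one outstanding command. */
enum recovery_action
choose_recovery(struct pending_cmd *cmd, enum wc_status status)
{
    assert(cmd != NULL);

    /* Case 3: no network-level error is tied to this request, so it can
     * complete as normal after the connection is recovered. */
    if (cmd->phase == PHASE_EXECUTING) {
        return ACTION_COMPLETE_NORMALLY;
    }
    if (status == WC_SUCCESS) {
        return ACTION_NONE;
    }
    /* Only flush errors are treated as recoverable in this sketch. */
    if (status != WC_WR_FLUSH_ERR) {
        return ACTION_FAIL;
    }
    /* Bound the number of attempts so a dead link cannot retry forever. */
    if (cmd->retries >= MAX_RETRIES) {
        return ACTION_FAIL;
    }
    cmd->retries++;

    return cmd->phase == PHASE_SENDING ? ACTION_QUEUE_AND_RESEND
                                       : ACTION_RESUME_TRANSFER;
}
```

With something along these lines, a command flushed during the send phase would
come back as ACTION_QUEUE_AND_RESEND until its retry counter hits the limit, and
only then be failed.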
> I just
> looked at the specification and it doesn't have much to say on the issue
> beyond errors may cause the RDMA QP to be terminated, and that the
> details were up to the specific transport specification (InfiniBand Verbs in
> I wonder if the other NVMe-oF initiator implementations attempt any sort of
> error handling, or if they just terminate the QP.
I see the Linux kernel disconnects immediately when receiving