From 87ebcb08c19dccb690c3366f5a67999fb8ec42e2 Mon Sep 17 00:00:00 2001
From: Evgeniy Kochetov
Date: Thu, 15 Aug 2019 06:06:39 +0000
Subject: [PATCH] nvmf/rdma: Handle completions for destroyed QP associated with SRQ

IB Architecture Specification vol.1 rel.13. in ch.10.3.1 "QUEUE PAIR
AND EE CONTEXT STATES" suggests the following destroy procedure for
QPs associated with SRQ:
 - Put the QP in the Error State;
 - wait for the Affiliated Asynchronous Last WQE Reached Event;
 - either:
   * drain the CQ by invoking the Poll CQ verb and either wait for CQ
     to be empty or the number of Poll CQ operations has exceeded CQ
     capacity size; or
   * post another WR that completes on the same CQ and wait for this
     WR to return as a WC;
 - and then invoke a Destroy QP or Reset QP.

Without the drain step it is possible that LAST_WQE_REACHED event is
received and QP is destroyed before the last receive WR completion is
polled from the CQ.

In SPDK there is no risk of resource leakage in this case. So, instead
of draining we can destroy QP and then just ignore receive completions
without QP and post receive WRs back to SRQ.

Fixes #903

Signed-off-by: Evgeniy Kochetov
Signed-off-by: Sasha Kotchubievsky
Signed-off-by: Alexey Marchuk
Change-Id: Ice6d3d5afc205c489f768e3b51c6cda8809bee9a
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465747
Reviewed-by: Seth Howell
Reviewed-by: Ben Walker
Reviewed-by: Shuhei Matsumoto
Tested-by: SPDK CI Jenkins
---
 lib/nvmf/rdma.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/nvmf/rdma.c b/lib/nvmf/rdma.c
index ecb5c2207..f72b64e87 100644
--- a/lib/nvmf/rdma.c
+++ b/lib/nvmf/rdma.c
@@ -3406,6 +3406,23 @@ spdk_nvmf_rdma_poller_poll(struct spdk_nvmf_rdma_transport *rtransport,
 			rdma_recv = SPDK_CONTAINEROF(rdma_wr, struct spdk_nvmf_rdma_recv, rdma_wr);
 			if (rpoller->srq != NULL) {
 				rdma_recv->qpair = get_rdma_qpair_from_wc(rpoller, &wc[i]);
+				/* It is possible that there are still some completions for destroyed QP
+				 * associated with SRQ. We just ignore these late completions and re-post
+				 * receive WRs back to SRQ.
+				 */
+				if (spdk_unlikely(NULL == rdma_recv->qpair)) {
+					struct ibv_recv_wr *bad_wr;
+					int rc;
+
+					rdma_recv->wr.next = NULL;
+					rc = ibv_post_srq_recv(rpoller->srq,
+							       &rdma_recv->wr,
+							       &bad_wr);
+					if (rc) {
+						SPDK_ERRLOG("Failed to re-post recv WR to SRQ, err %d\n", rc);
+					}
+					continue;
+				}
 			}
 			rqpair = rdma_recv->qpair;
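
Note for readers: the hunk above can be hard to follow without the rest of the poller, so here is a minimal, self-contained sketch of the same idea that does not depend on libibverbs or SPDK. Every `toy_*` name is hypothetical and exists only for illustration: `toy_lookup_qp()` plays the role of `get_rdma_qpair_from_wc()`, and `toy_post_srq_recv()` stands in for `ibv_post_srq_recv()`. It only models the control flow (look up the QP, re-post and skip when it is gone) and is not the actual SPDK implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_QPS 4

/* Hypothetical stand-ins for the verbs/SPDK objects; not the real types. */
struct toy_qp { unsigned qp_num; int handled; };
struct toy_recv { int id; struct toy_qp *qpair; };
struct toy_wc { unsigned qp_num; struct toy_recv *recv; };
struct toy_srq { int posted; };
struct toy_poller {
	struct toy_srq srq;
	struct toy_qp *qps[TOY_MAX_QPS]; /* live QPs; a slot is NULL once destroyed */
};

/* Plays the role of get_rdma_qpair_from_wc(): NULL when the QP is gone. */
static struct toy_qp *toy_lookup_qp(struct toy_poller *p, unsigned qp_num)
{
	for (size_t i = 0; i < TOY_MAX_QPS; i++) {
		if (p->qps[i] != NULL && p->qps[i]->qp_num == qp_num) {
			return p->qps[i];
		}
	}
	return NULL;
}

/* Stand-in for ibv_post_srq_recv(): counts the re-post, returns 0 on success. */
static int toy_post_srq_recv(struct toy_srq *srq, struct toy_recv *recv)
{
	(void)recv;
	srq->posted++;
	return 0;
}

/* Processes n receive completions; returns how many reached a live QP. */
static int toy_poll(struct toy_poller *p, struct toy_wc *wc, int n)
{
	int dispatched = 0;

	assert(p != NULL && wc != NULL);
	for (int i = 0; i < n; i++) {
		struct toy_recv *recv = wc[i].recv;

		recv->qpair = toy_lookup_qp(p, wc[i].qp_num);
		if (recv->qpair == NULL) {
			/* Late completion for a destroyed QP: hand the buffer back
			 * to the shared receive queue and skip further processing. */
			if (toy_post_srq_recv(&p->srq, recv) != 0) {
				fprintf(stderr, "failed to re-post recv %d\n", recv->id);
			}
			continue;
		}
		recv->qpair->handled++;
		dispatched++;
	}
	return dispatched;
}
```

Calling `toy_poll()` with one completion for a live QP and one for an unknown `qp_num` dispatches the first and re-posts the second, so the receive buffer is recycled rather than lost. This mirrors the commit message's reasoning that draining the CQ before destroy is unnecessary when late completions are harmless.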