nvmf/rdma: Fix data_wr_pool corruption

When there are not enough transport buffers for a
multi-SGL request in the NEED_BUFFER state, the WRs
taken from the data_wr_pool are returned to the pool.
However, the rdma_req->data.wr.next pointer still
points to the first WR from the pool. Usually that
causes no problem, since rdma_req will try to fill
buffers again, but when a qpair is destroyed, all of
its requests are completed forcefully. If a request
is completed while data.wr.next is not NULL, the
already released WRs are put back into the pool a
second time. That corrupts the pool and leads to
undefined behavior.
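
To make the failure sequence concrete, here is a
minimal, self-contained C sketch. It is NOT SPDK code:
a toy free list stands in for spdk_mempool, and a bare
next pointer stands in for the ibv_send_wr chain.

/* Toy illustration of the bug described above. */
#include <stdio.h>

struct wr {
	struct wr *next;
};

static struct wr pool_storage[4];
static struct wr *free_list;

static void pool_put(struct wr *w)	/* cf. spdk_mempool_put() */
{
	w->next = free_list;
	free_list = w;
}

static struct wr *pool_get(void)	/* cf. spdk_mempool_get() */
{
	struct wr *w = free_list;
	if (w) {
		free_list = w->next;
		w->next = NULL;
	}
	return w;
}

int main(void)
{
	/* Embedded in the request, like rdma_req->data.wr. */
	struct wr req_head = { .next = NULL };
	int i;

	for (i = 0; i < 4; i++) {
		pool_put(&pool_storage[i]);
	}

	/* Multi-SGL request: chain a WR taken from the pool. */
	struct wr *extra = pool_get();
	req_head.next = extra;

	/* Not enough buffers: the pooled WR is returned, but the
	 * request's next pointer is left dangling (the bug). */
	pool_put(extra);

	/* Qpair destruction force-completes the request; cleanup sees
	 * next != NULL and puts the already released WR again. */
	if (req_head.next != NULL) {
		pool_put(req_head.next);
	}

	/* The corrupted pool now hands out the same element twice. */
	printf("%p %p\n", (void *)pool_get(), (void *)pool_get());
	return 0;
}

The fix below clears each wr.next as the WRs are
returned, and resets the request's own data.wr.next
and rsp.wr.next in nvmf_rdma_request_free_data itself,
so a later forced completion finds nothing left to put
back.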

Fixes #2541

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I238b92eec132d8d845330362af6f335421177454
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13760
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
@@ -613,11 +613,14 @@ nvmf_rdma_request_free_data(struct spdk_nvmf_rdma_request *rdma_req,
 		data_wr->wr.num_sge = 0;
 		next_send_wr = data_wr->wr.next;
 		if (data_wr != &rdma_req->data) {
+			data_wr->wr.next = NULL;
 			spdk_mempool_put(rtransport->data_wr_pool, data_wr);
 		}
 		data_wr = (!next_send_wr || next_send_wr == &rdma_req->rsp.wr) ? NULL :
 			  SPDK_CONTAINEROF(next_send_wr, struct spdk_nvmf_rdma_request_data, wr);
 	}
+	rdma_req->data.wr.next = NULL;
+	rdma_req->rsp.wr.next = NULL;
 }
 
 static void
@@ -1914,8 +1917,6 @@ _nvmf_rdma_request_free(struct spdk_nvmf_rdma_request *rdma_req,
 	rdma_req->req.length = 0;
 	rdma_req->req.iovcnt = 0;
 	rdma_req->req.data = NULL;
-	rdma_req->rsp.wr.next = NULL;
-	rdma_req->data.wr.next = NULL;
 	rdma_req->offset = 0;
 	rdma_req->req.dif_enabled = false;
 	rdma_req->fused_failed = false;