nvmf/rdma: Fix data_wr_pool corruption

When there are not enough transport buffers for a
multi-SGL request in the NEED_BUFFER state, the WRs
taken from the data_wr_pool are returned to the pool.
However, the rdma_req->data.wr.next pointer still
points to the first WR from the pool. Usually that
causes no problem, since rdma_req will try to fill
buffers again, but when a qpair is destroyed, all of
its requests are completed forcefully. If a request
is completed while data.wr.next is not NULL, the
already released WRs are put back into the pool a
second time. That corrupts the pool and leads to
undefined behavior.
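
To make the failure sequence concrete, here is a
minimal, self-contained C sketch. It is NOT SPDK code:
a toy free list stands in for spdk_mempool, and a bare
next pointer stands in for the ibv_send_wr chain.

/* Toy illustration of the bug described above. */
#include <stdio.h>

struct wr {
	struct wr *next;
};

static struct wr pool_storage[4];
static struct wr *free_list;

static void pool_put(struct wr *w)	/* cf. spdk_mempool_put() */
{
	w->next = free_list;
	free_list = w;
}

static struct wr *pool_get(void)	/* cf. spdk_mempool_get() */
{
	struct wr *w = free_list;
	if (w) {
		free_list = w->next;
		w->next = NULL;
	}
	return w;
}

int main(void)
{
	/* Embedded in the request, like rdma_req->data.wr. */
	struct wr req_head = { .next = NULL };
	int i;

	for (i = 0; i < 4; i++) {
		pool_put(&pool_storage[i]);
	}

	/* Multi-SGL request: chain a WR taken from the pool. */
	struct wr *extra = pool_get();
	req_head.next = extra;

	/* Not enough buffers: the pooled WR is returned, but the
	 * request's next pointer is left dangling (the bug). */
	pool_put(extra);

	/* Qpair destruction force-completes the request; cleanup sees
	 * next != NULL and puts the already released WR again. */
	if (req_head.next != NULL) {
		pool_put(req_head.next);
	}

	/* The corrupted pool now hands out the same element twice. */
	printf("%p %p\n", (void *)pool_get(), (void *)pool_get());
	return 0;
}

The fix below clears each wr.next as the WRs are
returned, and resets the request's own data.wr.next
and rsp.wr.next in nvmf_rdma_request_free_data itself,
so a later forced completion finds nothing left to put
back.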

Fixes #2541

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I238b92eec132d8d845330362af6f335421177454
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13760
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
@@ -613,11 +613,14 @@ nvmf_rdma_request_free_data(struct spdk_nvmf_rdma_request *rdma_req,
 		data_wr->wr.num_sge = 0;
 		next_send_wr = data_wr->wr.next;
 		if (data_wr != &rdma_req->data) {
+			data_wr->wr.next = NULL;
 			spdk_mempool_put(rtransport->data_wr_pool, data_wr);
 		}
 		data_wr = (!next_send_wr || next_send_wr == &rdma_req->rsp.wr) ? NULL :
 			  SPDK_CONTAINEROF(next_send_wr, struct spdk_nvmf_rdma_request_data, wr);
 	}
+	rdma_req->data.wr.next = NULL;
+	rdma_req->rsp.wr.next = NULL;
 }
 
 static void
@@ -1914,8 +1917,6 @@ _nvmf_rdma_request_free(struct spdk_nvmf_rdma_request *rdma_req,
 	rdma_req->req.length = 0;
 	rdma_req->req.iovcnt = 0;
 	rdma_req->req.data = NULL;
-	rdma_req->rsp.wr.next = NULL;
-	rdma_req->data.wr.next = NULL;
 	rdma_req->offset = 0;
 	rdma_req->req.dif_enabled = false;
 	rdma_req->fused_failed = false;