From 792807a444528e8c0a0bee7988de615f2358be79 Mon Sep 17 00:00:00 2001 From: Jim Harris Date: Wed, 16 Sep 2020 23:19:34 +0000 Subject: [PATCH] nvme: fix infinite loop when aborting queued reqs When we disconnect a qpair, part of the code path is calling _nvme_qpair_abort_queued_reqs. This takes care of aborting any requests that were queued waiting for slots to open on the submission queue. It walks the STAILQ one by one and manually completes them with ABORT status back to the caller. But if the callback path submits another request, this request may also get queued to the end of the queued_req TAILQ. This can result in an infinite loop. The solution is to use an STAILQ_SWAP to a local, empty STAILQ. Then we ensure we only abort the requests that were queued when _nvme_qpair_abort_queued_reqs() started executing. Fixes issue #1588. I used the multipath.sh test to reproduce this on my local system. If it ever dropped into the STAILQ loop in this function, we would hit the infinite loop. With this patch, I confirmed locally that now we safely avoid the infinite loop and the test passes. Signed-off-by: Jim Harris Change-Id: I657db23efe5983bd8613c870ad62695a7fc7f689 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4284 Tested-by: SPDK CI Jenkins Reviewed-by: Changpeng Liu Reviewed-by: Shuhei Matsumoto Reviewed-by: Ziye Yang Reviewed-by: Reviewed-by: Ben Walker --- lib/nvme/nvme_qpair.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/lib/nvme/nvme_qpair.c b/lib/nvme/nvme_qpair.c index 2303b916b..401ff017d 100644 --- a/lib/nvme/nvme_qpair.c +++ b/lib/nvme/nvme_qpair.c @@ -545,10 +545,14 @@ static void _nvme_qpair_abort_queued_reqs(struct spdk_nvme_qpair *qpair, uint32_t dnr) { struct nvme_request *req; + STAILQ_HEAD(, nvme_request) tmp; - while (!STAILQ_EMPTY(&qpair->queued_req)) { - req = STAILQ_FIRST(&qpair->queued_req); - STAILQ_REMOVE_HEAD(&qpair->queued_req, stailq); + STAILQ_INIT(&tmp); + STAILQ_SWAP(&tmp, &qpair->queued_req, nvme_request); + + while (!STAILQ_EMPTY(&tmp)) { + req = STAILQ_FIRST(&tmp); + STAILQ_REMOVE_HEAD(&tmp, stailq); if (!qpair->ctrlr->opts.disable_error_logging) { SPDK_ERRLOG("aborting queued i/o\n"); }