nvme_qpair: fix check_enabled.

check_enabled had a couple bugs in it that made it unfriendly for enabling I/O qpairs after a reset. 1. It was calling nvme_qpair_abort_queued_requests before setting the enabled flag to true. For applications that submit new I/O in the completion callback for old I/O, this means you enter an infinite loop of submitting requests, and then immediately completing them. SO instead, wait for the qpair to reset, then just submit those requests to the lower layer. 2. It didn't check whether we were already in the middle of calling it, so we could reenter function calls like nvme_qpair_abort_queued_requests. Also, now that we have a coherent state machine for qpairs, we can limit the enabling to a specific state in that state machine. Change-Id: Ie0b74819a6b16839965bced47c33dec967f725a8 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470256 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-10-02 14:51:10 -07:00 · 2019-10-02 14:51:10 -07:00 · 4c1a18c41d
commit 4c1a18c41d
parent a1ce725c0a
2 changed files with 18 additions and 5 deletions
--- a/lib/nvme/nvme_internal.h
+++ b/lib/nvme/nvme_internal.h
@ -335,6 +335,7 @@ enum nvme_qpair_state {
 	NVME_QPAIR_DISABLED,
 	NVME_QPAIR_CONNECTING,
 	NVME_QPAIR_CONNECTED,
+	NVME_QPAIR_ENABLING,
 	NVME_QPAIR_ENABLED,
 };

--- a/lib/nvme/nvme_qpair.c
+++ b/lib/nvme/nvme_qpair.c
@ -407,15 +407,27 @@ nvme_qpair_abort_queued_reqs(struct spdk_nvme_qpair *qpair, uint32_t dnr)
 static inline bool
 nvme_qpair_check_enabled(struct spdk_nvme_qpair *qpair)
 {
+	struct nvme_request *req;
+
 	/*
-	 * This is the point at which we re-enable the qpair after a reset. If the qpair has been
-	 * disabled for some reason, we need to flush it and restart submitting and completing I/O.
+	 * Either during initial connect or reset, the qpair should follow the given state machine.
+	 * QPAIR_DISABLED->QPAIR_CONNECTING->QPAIR_CONNECTED->QPAIR_ENABLING->QPAIR_ENABLED. In the
+	 * reset case, once the qpair is properly connected, we need to abort any outstanding requests
+	 * from the old transport connection and encourage the application to retry them. We also need
+	 * to submit any queued requests that built up while we were in the connected or enabling state.
 	 */
-	if (!nvme_qpair_state_equals(qpair, NVME_QPAIR_ENABLED) && !qpair->ctrlr->is_resetting) {
+	if (nvme_qpair_state_equals(qpair, NVME_QPAIR_CONNECTED) && !qpair->ctrlr->is_resetting) {
+		nvme_qpair_set_state(qpair, NVME_QPAIR_ENABLING);
 		nvme_qpair_complete_error_reqs(qpair);
-		nvme_qpair_abort_queued_reqs(qpair, 0 /* retry */);
-		nvme_qpair_set_state(qpair, NVME_QPAIR_ENABLED);
 		nvme_transport_qpair_abort_reqs(qpair, 0 /* retry */);
+		nvme_qpair_set_state(qpair, NVME_QPAIR_ENABLED);
+		while (!STAILQ_EMPTY(&qpair->queued_req)) {
+			req = STAILQ_FIRST(&qpair->queued_req);
+			STAILQ_REMOVE_HEAD(&qpair->queued_req, stailq);
+			if (nvme_qpair_resubmit_request(qpair, req)) {
+				break;
+			}
+		}
 	}

 	return nvme_qpair_state_equals(qpair, NVME_QPAIR_ENABLED);