nvme_tcp: fix qpair stuck in CONNECTING state
When running a perf test, the qpair sometimes failed to move from CONNECTING to CONNECTED even after the CONNECT request's response had been received and processed. When nvme_fabric_qpair_connect_poll -> nvme_wait_for_completion_robust_lock_timeout_poll runs to process the CONNECT response, the request may not yet have been completed in sock_check_zcopy, even though its response has already been received and processed; in that case tcp_req->ordering.bits.send_ack is still 0 and status->done is still false. After the request is finally completed in sock_check_zcopy, the qpair must be polled again so its state can reach CONNECTED.

Similarly, if the icreq's response is received and processed before nvme_tcp_send_icreq_complete is called by _sock_check_zcopy, the qpair gets stuck in CONNECTING and never proceeds to send the CONNECT request. In both cases the qpair needs to be added to pgroup->needs_poll so that it is polled again.

This bug can be reproduced with the following configuration. Target: 16 NVMe SSDs, running on 20 cores. Initiator: randread test using nvme perf on 32 CPU cores with zero-copy enabled. The error does not always occur: the CONNECT failure happens roughly once in ten runs with the log below, and the icreq failure is less frequent, showing only the target side's "keep alive timeout" log.

Error reported on the initiator side:

Initialization complete. Launching workers.
[2022-05-23 14:51:07.286794] nvme_qpair.c: 760:spdk_nvme_qpair_process_completions: *ERROR*: CQ transport error -6 (No such device or address) on qpair id 2
ERROR: unable to connect I/O qpair.
ERROR: init_ns_worker_ctx() failed

And the target side shows:

Disconnecting host from subsystem nqn.2016-06.io.spdk:cnode2 due to keep alive timeout

Change-Id: Id72c2ffd615ab73c5fc67d36c3ff8b730cebcef7
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12975
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
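The interaction can be sketched with a small, self-contained model. This is an illustrative assumption-laden sketch, not SPDK code: every type and function below is a simplified stand-in, and only the two state names mirror the real NVME_TCP_QPAIR_STATE_* constants. The point it demonstrates is that the connect poller cannot complete the request until the zero-copy send is acknowledged, so when the response is processed first, the qpair has to be queued on the poll group's needs_poll list to guarantee one more poll.

#include <stdbool.h>
#include <stdio.h>
#include <sys/queue.h>

/* Simplified stand-ins for the real SPDK structures (hypothetical). */
enum qpair_state {
	QPAIR_STATE_INITIALIZING,        /* icreq sent, waiting for ic_resp   */
	QPAIR_STATE_FABRIC_CONNECT_POLL, /* CONNECT sent, polling for its cpl */
	QPAIR_STATE_CONNECTED,
};

struct tqpair {
	enum qpair_state state;
	bool needs_poll;       /* already on the poll group's needs_poll list */
	bool has_queued_reqs;  /* models !STAILQ_EMPTY(&qpair.queued_req)     */
	bool send_ack;         /* zero-copy send acknowledged                 */
	bool resp_processed;   /* response PDU already received and processed */
	TAILQ_ENTRY(tqpair) link;
};

TAILQ_HEAD(needs_poll_list, tqpair);

/* Models the patched condition in _pdu_write_done(): besides qpairs with
 * queued requests, a qpair that is still connecting is queued for an extra
 * poll, because its icreq/CONNECT response may already have been processed
 * before the zero-copy send-ack arrived. */
static void
pdu_write_done(struct tqpair *tq, struct needs_poll_list *list)
{
	tq->send_ack = true;

	if (!tq->needs_poll &&
	    (tq->has_queued_reqs ||
	     tq->state == QPAIR_STATE_FABRIC_CONNECT_POLL ||
	     tq->state == QPAIR_STATE_INITIALIZING)) {
		TAILQ_INSERT_TAIL(list, tq, link);
		tq->needs_poll = true;
	}
}

/* Models the connect poller: the request only completes once both the
 * response has been processed and the send has been acknowledged. */
static void
connect_poll(struct tqpair *tq)
{
	if (tq->resp_processed && tq->send_ack) {
		tq->state = QPAIR_STATE_CONNECTED;
	}
}

int
main(void)
{
	struct needs_poll_list needs_poll = TAILQ_HEAD_INITIALIZER(needs_poll);
	struct tqpair tq = { .state = QPAIR_STATE_FABRIC_CONNECT_POLL };

	/* 1. The CONNECT response arrives and is processed first ...         */
	tq.resp_processed = true;
	connect_poll(&tq);              /* still CONNECTING: no send-ack yet  */

	/* 2. ... then the zero-copy send-ack completes the PDU write.        */
	pdu_write_done(&tq, &needs_poll);

	/* 3. The poll group drains needs_poll, so the qpair is polled once
	 *    more without waiting for another network event.                 */
	while (!TAILQ_EMPTY(&needs_poll)) {
		struct tqpair *t = TAILQ_FIRST(&needs_poll);

		TAILQ_REMOVE(&needs_poll, t, link);
		t->needs_poll = false;
		connect_poll(t);
	}

	printf("qpair state: %s\n",
	       tq.state == QPAIR_STATE_CONNECTED ? "CONNECTED" : "CONNECTING");
	return 0;
}

Without step 3 the qpair would only be polled again on the next network event, which never comes for a connect that is already fully answered; that is exactly the hang the patch below removes.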
parent 0fb87395b0
commit 4295661eb8
@@ -381,9 +381,14 @@ _pdu_write_done(void *cb_arg, int err)
 	 * response to a PDU completing here. However, to attempt to make forward progress
 	 * the qpair needs to be polled and we can't rely on another network event to make
 	 * that happen. Add it to a list of qpairs to poll regardless of network activity
-	 * here. */
-	if (tqpair->qpair.poll_group && !STAILQ_EMPTY(&tqpair->qpair.queued_req) &&
-	    !tqpair->needs_poll) {
+	 * here.
+	 * Besides, when tqpair state is NVME_TCP_QPAIR_STATE_FABRIC_CONNECT_POLL or
+	 * NVME_TCP_QPAIR_STATE_INITIALIZING, need to add it to needs_poll list too to make
+	 * forward progress in case that the resources are released after icreq's or CONNECT's
+	 * resp is processed. */
+	if (tqpair->qpair.poll_group && !tqpair->needs_poll && (!STAILQ_EMPTY(&tqpair->qpair.queued_req) ||
+	    tqpair->state == NVME_TCP_QPAIR_STATE_FABRIC_CONNECT_POLL ||
+	    tqpair->state == NVME_TCP_QPAIR_STATE_INITIALIZING)) {
 		pgroup = nvme_tcp_poll_group(tqpair->qpair.poll_group);
 
 		TAILQ_INSERT_TAIL(&pgroup->needs_poll, tqpair, link);