nvme_tcp: fix qpair stuck in CONNECTING state

When running a perf test, the qpair sometimes failed to transition from
CONNECTING to CONNECTED even after the CONNECT req's resp had been
received and processed. The reason is that when
nvme_fabric_qpair_connect_poll -> nvme_wait_for_completion_robust_lock_timeout_poll
runs to process the CONNECT req's resp, the req may not yet have been
finished in sock_check_zcopy, even though its resp has already been
received and processed. In that case tcp_req->ordering.bits.send_ack is
still 0 and status->done is still false. Once the req is finally
completed in sock_check_zcopy, the qpair must be polled again so that
its state can reach CONNECTED.
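
For illustration, here is a minimal standalone sketch (not SPDK code;
the struct and function names are invented) of the ordering problem:
the waiter's done flag can only be set after both the resp has arrived
and the zero-copy send has been acked, so a resp arriving first is not
enough by itself.

#include <stdbool.h>
#include <stdio.h>

/* Simplified model of a request: it completes (and sets the waiter's
 * "done" flag) only after both the response has been processed and the
 * zero-copy send has been acknowledged. */
struct req_model {
	bool send_ack;       /* set by the zcopy completion path */
	bool resp_received;  /* set when the resp PDU is processed */
	bool completed;
};

static void try_complete(struct req_model *req, bool *status_done)
{
	if (req->send_ack && req->resp_received && !req->completed) {
		req->completed = true;
		*status_done = true;  /* what the connect-poll loop waits on */
	}
}

int main(void)
{
	struct req_model req = {0};
	bool status_done = false;

	req.resp_received = true;           /* resp arrives first ...      */
	try_complete(&req, &status_done);
	printf("after resp:      done=%d\n", status_done);  /* prints 0 */

	req.send_ack = true;                /* ... zcopy ack arrives later */
	try_complete(&req, &status_done);
	printf("after zcopy ack: done=%d\n", status_done);  /* prints 1 */
	return 0;
}

Without an extra poll of the qpair after that second step, nobody
observes the flag flipping, which is exactly the stuck-in-CONNECTING
symptom.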

Similarly, if the icreq's resp is received and processed before
nvme_tcp_send_icreq_complete is called from _sock_check_zcopy, the
qpair gets stuck in CONNECTING and never proceeds to send the CONNECT
req. Such a qpair also needs to be put on pgroup->needs_poll to fix
this; see the sketch below for how that list is consumed.
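
The consumer side of that list could look roughly like the following
sketch (a simplified standalone model, not SPDK's actual poll-group
code): qpairs parked on needs_poll get one extra poll on the next
poll-group iteration even if no new network event arrives.

#include <sys/queue.h>
#include <stdio.h>

struct qpair_model {
	const char *name;
	TAILQ_ENTRY(qpair_model) link;
};

TAILQ_HEAD(needs_poll_head, qpair_model);

/* Drain the needs_poll list: each parked qpair is polled once more
 * regardless of socket readiness. */
static void poll_group_iteration(struct needs_poll_head *needs_poll)
{
	struct qpair_model *q;

	while ((q = TAILQ_FIRST(needs_poll)) != NULL) {
		TAILQ_REMOVE(needs_poll, q, link);
		printf("polling %s without waiting for network activity\n",
		       q->name);
	}
}

int main(void)
{
	struct needs_poll_head head = TAILQ_HEAD_INITIALIZER(head);
	struct qpair_model q = { .name = "qpair 2" };

	TAILQ_INSERT_TAIL(&head, &q, link);  /* what the patched _pdu_write_done does */
	poll_group_iteration(&head);
	return 0;
}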

I can reproduce this bug with the following configuration.
target: 16 NVMe SSDs, running on 20 cores;
initiator: randread test using nvme perf with 32 CPU cores and
zerocopy enabled.
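
For reference, an initiator invocation along these lines would exercise
this path (the transport address and exact option values here are
illustrative, and zero-copy send is assumed to be enabled in the posix
sock layer's options):

./build/examples/perf -q 32 -o 4096 -w randread -t 300 -c 0xffffffff \
    -r 'trtype:TCP adrfam:IPv4 traddr:192.0.2.10 trsvcid:4420'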

The error doesn't always occur. The CONNECT failure happens roughly
once in ten runs, producing the log below. The icreq failure is less
frequent and shows up only as the target side's "keep alive timeout"
log.

Error reported on the initiator side:
Initialization complete. Launching workers.
[2022-05-23 14:51:07.286794] nvme_qpair.c: 760:spdk_nvme_qpair_process_completions:
*ERROR*: CQ transport error -6 (No such device or address) on qpair id 2
ERROR: unable to connect I/O qpair.
ERROR: init_ns_worker_ctx() failed

And the target side shows:
Disconnecting host  from subsystem nqn.2016-06.io.spdk:cnode2 due to keep alive timeout

Change-Id: Id72c2ffd615ab73c5fc67d36c3ff8b730cebcef7
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12975
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>

@@ -381,9 +381,14 @@ _pdu_write_done(void *cb_arg, int err)
 	 * response to a PDU completing here. However, to attempt to make forward progress
 	 * the qpair needs to be polled and we can't rely on another network event to make
 	 * that happen. Add it to a list of qpairs to poll regardless of network activity
-	 * here. */
-	if (tqpair->qpair.poll_group && !STAILQ_EMPTY(&tqpair->qpair.queued_req) &&
-	    !tqpair->needs_poll) {
+	 * here.
+	 * Besides, when tqpair state is NVME_TCP_QPAIR_STATE_FABRIC_CONNECT_POLL or
+	 * NVME_TCP_QPAIR_STATE_INITIALIZING, need to add it to needs_poll list too to make
+	 * forward progress in case that the resources are released after icreq's or CONNECT's
+	 * resp is processed. */
+	if (tqpair->qpair.poll_group && !tqpair->needs_poll && (!STAILQ_EMPTY(&tqpair->qpair.queued_req) ||
+	    tqpair->state == NVME_TCP_QPAIR_STATE_FABRIC_CONNECT_POLL ||
+	    tqpair->state == NVME_TCP_QPAIR_STATE_INITIALIZING)) {
 		pgroup = nvme_tcp_poll_group(tqpair->qpair.poll_group);
 		TAILQ_INSERT_TAIL(&pgroup->needs_poll, tqpair, link);