A subsystem RPC is not transitioned to a paused state when there are ios outstanding (tracked by subsystem poll group). In general AERs, are not tracked as outstanding IOs. However, there are 3 paths in nvmf_ctrlr_async_event_request which do not adjust the outstanding io count. If we get into any of these 3 paths, the subsystem pause can hang forever. The issue was reproduced with hot plug stress testing under load. We can get into the second path (SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE) under these circumstances: - An AER completion is sent to the initiator due to a namespace change (e.g. hot remove/add) - In this case, type is set to SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE - The initiator sends a new AER admin command, hitting the second path where we return without adjusting the outstanding ios. Fixes: 1552 Change-Id: I45f781966cc1e9a601b2305c7985a21154d802e8 Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3854 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Seth Howell <seth.howell@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: JinYu <jin.yu@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
||
---|---|---|
.. | ||
ctrlr_bdev.c | ||
ctrlr_discovery.c | ||
ctrlr.c | ||
fc_ls.c | ||
fc.c | ||
Makefile | ||
nvmf_fc.h | ||
nvmf_internal.h | ||
nvmf_rpc.c | ||
nvmf.c | ||
rdma.c | ||
spdk_nvmf.map | ||
subsystem.c | ||
tcp.c | ||
transport.c | ||
transport.h |