Spdk/lib
Seth Howell ceb32abbd8 nvmf: don't set qpair->group to NULL.
The typical rdma qpair disconnect function goes through the function
_nvmf_rdma_disconnect_retry. When this function was introduced, it was
discovered that we could receive a qpair disconnect event for a given
qpair before that qpair had been assigned to a poll group. In order to
ensure that the disconnect procedure completed properly, we waited on
the current thread in _nvmf_rdma_disconnect_retry for the qpair to be
assigned a poll group before we finally disconnected. See rdma.c:2250.
Since _nvmf_rdma_disconnect_retry was not necessarily called from the
poll group's thread, we relied upon the assumption that the group
variable would never be set back to NULL. See the comment on rdma.c:
2243.

However, in _spdk_nvmf_qpair_destroy we were setting the group back to
NULL. This can result in the following sequence of operations across
multiple threads, which prevents a qpair from ever being fully
destroyed.
1. thread 1: receive a disconnect event - call nvmf_rdma_disconnect
2. thread 1: from nvmf_rdma_disconnect call
spdk_nvmf_rdma_qpair_inc_refcnt - setting rqpair->refcnt to 1.
3. thread 2: call spdk_nvmf_rdma_poller_poll.
4. thread 2: in spdk_nvmf_rdma_poller_poll reap a completion with an
error status which causes us to call spdk_nvmf_qpair_disconnect -
rdma.c:2846
5. thread 2: spdk_nvmf_qpair_disconnect calls _spdk_nvmf_qpair_destroy which sets
qpair->group = NULL
6. thread 1: from nvmf_rdma_disconnect we call
_nvmf_rdma_disconnect_retry which checks if qpair->group == NULL. If
that is the case, we assume that the qpair has not been assigned a group
yet and send ourselves a message to call _nvmf_rdma_disconnect_retry
again. See rdma.c:2253.
7. thread 2: from _spdk_nvmf_qpair_destroy we call
spdk_nvmf_transport_qpair_fini which results in a call to
spdk_nvmf_rdma_close_qpair, which sends dummy send and recv requests to
the qpair.
8. thread 2: we call poller_poll and get completions for both the send
and recv dummy requests. This results in a call to
spdk_nvmf_rdma_qpair_destroy.
9. thread 2: spdk_nvmf_rdma_qpair_destroy checks rqpair->refcnt and when
it sees that it is not 0 (see step 2 above) it returns without
freeing the resources. See rdma.c:629.
10. thread 1: we keep churning in _nvmf_rdma_disconnect_retry, sending
ourselves messages, because rqpair->group stays NULL. Thread 1
never reaches line 2257 where it sends a message to call
_nvmf_rdma_qpair_disconnect. _nvmf_rdma_qpair_disconnect is the function
that decreases the rqpair->refcnt and allows us to make forward progress
on destroying the qpair.
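
The interleaving above can be replayed on a single thread to make the
stall easier to see. The following is only a minimal sketch under
stated assumptions: struct qpair, struct poll_group, qpair_destroy,
disconnect_retry_once and simulate are hypothetical stand-ins, not the
SPDK definitions, and the retry loop is bounded so that the sketch
terminates, whereas the real retry keeps re-sending itself a message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SPDK structures; not the real definitions. */
struct poll_group { int id; };

struct qpair {
	struct poll_group *group; /* NULL until assigned to a poll group */
	int refcnt;               /* models rqpair->refcnt */
	bool freed;               /* set once resources are released */
};

/* Step 9: destroy returns early while a reference is still held. */
static void qpair_destroy(struct qpair *q)
{
	if (q->refcnt != 0) {
		return;
	}
	q->freed = true;
}

/*
 * Steps 6 and 10: one pass of the retry. Returns false when the group is
 * NULL (the real code would send itself another message), true once it
 * has dropped the reference. Dropping the reference and destroying are
 * collapsed into one call here for brevity (an assumption of the sketch).
 */
static bool disconnect_retry_once(struct qpair *q)
{
	if (q->group == NULL) {
		return false;
	}
	q->refcnt--;
	qpair_destroy(q);
	return true;
}

/* Replays steps 1-10; returns true if the qpair ends up freed. */
bool simulate(bool clear_group_on_destroy, int max_retries)
{
	struct poll_group pg = { .id = 0 };
	struct qpair q = { .group = &pg, .refcnt = 0, .freed = false };

	/* Steps 1-2 (thread 1): disconnect event takes a reference. */
	q.refcnt++;

	/* Steps 3-5 (thread 2): error completion leads to the destroy path. */
	if (clear_group_on_destroy) {
		q.group = NULL; /* the behavior this patch removes */
	}

	/* Steps 7-9 (thread 2): dummy completions arrive, destroy is tried. */
	qpair_destroy(&q);
	assert(!q.freed); /* the reference from step 2 is still held */

	/* Steps 6 and 10 (thread 1): bounded here so the sketch terminates. */
	for (int i = 0; i < max_retries; i++) {
		if (disconnect_retry_once(&q)) {
			break;
		}
	}
	return q.freed;
}
```

In this sketch, simulate(true, 1000) returns false because every retry
sees a NULL group and the reference is never dropped, while
simulate(false, 1000) returns true on the first retry.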

I encountered this issue while trying to disconnect from our target
using the kernel initiator with an x722 NIC. I think the timing of this
bug shows up with that specific configuration because some of the calls
in the disconnect path on thread 1 fail, causing it to take longer and
giving the second thread a chance to delete the qpair.

There are really two issues at play here. We don't have a single point
of entry for disconnecting RDMA qpairs, and we rely on the qpair->group
variable never being set back to NULL. This patch addresses the second
issue, and the next patch in the series addresses the first.

Change-Id: I65395d0bbb67edfa7bad2ddc70906606c3d83781
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-11 19:25:51 +00:00
..
bdev lib/bdev: Expose enabled DIF check types of bdev. 2019-02-08 23:37:13 +00:00
blob lvol: add option to change default data erase method 2019-01-23 22:25:37 +00:00
blobfs blobfs: fix the length value of file. 2019-01-17 05:04:13 +00:00
conf string: spdk_strtol to delegate additional error checking 2019-01-29 00:10:57 +00:00
copy lib/copy: unregister copy engine on finish 2018-10-15 17:42:20 +00:00
env_dpdk bdev/compress: Add configure option and build dependencies 2019-02-11 19:23:17 +00:00
event event: Use spdk_json_write_named_* APIs throughout 2019-02-04 07:08:04 +00:00
ftl lib/ftl: Remove NULL pointer checks in external APIs 2019-02-08 16:35:34 +00:00
ioat memory: replace all hardcoded 0x200000 with a define 2019-01-13 00:47:26 +00:00
iscsi iscsi: remove unused mobj fields 2019-02-06 20:21:48 +00:00
json json_util: add debug logs to spdk_json_decode_object function 2019-01-10 14:31:37 +00:00
jsonrpc jsonrpc: Use spdk_json_write_named_* APIs throughout 2019-02-04 07:08:04 +00:00
log app: rename traceflag cmdline option to logflag 2018-12-03 19:50:15 +00:00
lvol lvol: add option to change default data erase method 2019-01-23 22:25:37 +00:00
nbd nbd: Use spdk_json_write_named_* APIs throughout 2019-02-04 07:08:04 +00:00
net net: Use spdk_json_write_named_* APIs throughout 2019-02-04 07:08:04 +00:00
nvme nvme/pcie: mark infrequent cases as unlikely in submission path 2019-02-06 18:37:40 +00:00
nvmf nvmf: don't set qpair->group to NULL. 2019-02-11 19:25:51 +00:00
reduce reduce: fix ordering bug 2019-02-04 19:23:35 +00:00
rocksdb thread: Rename spdk_allocate_thread to spdk_thread_create 2019-01-17 11:24:38 +00:00
rpc rpc: add spdk_rpc_is_method_allowed 2018-12-05 00:35:35 +00:00
scsi scsi: Use spdk_json_write_named_* APIs throughout 2019-02-04 07:08:04 +00:00
sock UT: fix the sock_ut failure because of the port conflict 2019-01-22 17:28:24 +00:00
thread thread: Keep caches of message objects on the thread object. 2019-02-05 06:49:30 +00:00
trace lib/trace: add trace_record tool 2019-01-30 06:36:25 +00:00
ut_mock thread: Eliminate use of pthread_self and thread_ids 2019-01-15 16:53:12 +00:00
util dif: Rename bitmask macros from SPDK_DIF_*_CHECK to SPDK_DIF_FLAGS_*_CHECK 2019-02-08 23:37:13 +00:00
vhost vhost/rpc: remove unnecessary if in the add_vhost_scsi_lun RPC 2019-02-06 19:04:21 +00:00
virtio virtio: Use spdk_json_write_named_* APIs throughout 2019-02-04 07:08:04 +00:00
Makefile ftl: Initial implementation 2019-01-11 09:15:39 +00:00