Spdk/lib
Alexey Marchuk 813869d823 nvmf: Fix possible race condition when adding IO qpair
The admin qpair may be in the process of being destroyed, e.g.
because the keep alive timer expired, at the moment an IO qpair
is added to the controller. Qpair destruction changes the qpair's
state to DEACTIVATING and removes it from its poll group. Check
the admin qpair's state and poll group pointer before sending a
message to the poll group's thread, and fail the connect command
if the admin qpair is no longer usable.
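
The check described above can be illustrated with a minimal, self-contained sketch. It is not the actual patch: every type and function name below (the `sketch_` prefix) is a hypothetical stand-in for the real SPDK structures, and the ACTIVE state name is assumed. Only the DEACTIVATING state and the poll group pointer come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real SPDK types; only the fields
 * relevant to the check are modelled here. */
enum sketch_qpair_state {
	SKETCH_QPAIR_ACTIVE,       /* assumed name for the normal state */
	SKETCH_QPAIR_DEACTIVATING, /* set when qpair destruction starts */
};

struct sketch_poll_group {
	int id;
};

struct sketch_qpair {
	enum sketch_qpair_state state;
	struct sketch_poll_group *group; /* NULL once removed from its poll group */
};

/* True only if the admin qpair is still active and still attached to a
 * poll group, i.e. it is safe to dereference admin_qpair->group. */
static bool
sketch_admin_qpair_usable(const struct sketch_qpair *admin_qpair)
{
	assert(admin_qpair != NULL);
	return admin_qpair->state == SKETCH_QPAIR_ACTIVE &&
	       admin_qpair->group != NULL;
}

/* Returns 0 if the add-IO-qpair message could be sent to the admin
 * qpair's poll group thread, -1 if the connect command should fail. */
static int
sketch_add_io_qpair(const struct sketch_qpair *admin_qpair)
{
	if (!sketch_admin_qpair_usable(admin_qpair)) {
		/* Admin qpair is being destroyed: fail the connect command
		 * instead of touching a possibly NULL poll group. */
		return -1;
	}

	/* Real code would send a message to the poll group's thread here. */
	return 0;
}
```

Checking both conditions matters in this sketch because the state change and the removal from the poll group are described as separate parts of destruction, so either one alone can indicate a qpair that is going away.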

Logs and backtrace from one CI build that hit this problem:
00:10:53.192  [2021-01-22 15:29:46.671869] ctrlr.c: 185:nvmf_ctrlr_keep_alive_poll: *NOTICE*: Disconnecting host from subsystem nqn.2016-06.io.spdk:cnode1 due to keep alive timeout.
00:10:53.374  [2021-01-22 15:29:46.854223] ctrlr.c: 185:nvmf_ctrlr_keep_alive_poll: *NOTICE*: Disconnecting host from subsystem nqn.2016-06.io.spdk:cnode2 due to keep alive timeout.
00:10:53.374  ctrlr.c:587:41: runtime error: member access within null pointer of type 'struct spdk_nvmf_poll_group'
00:10:53.486      #0 0x7f9307d3d3d8 in _nvmf_ctrlr_add_io_qpair /home/vagrant/spdk_repo/spdk/lib/nvmf/ctrlr.c:587
00:10:53.486      #1 0x7f93077ea3cd in msg_queue_run_batch /home/vagrant/spdk_repo/spdk/lib/thread/thread.c:553
00:10:53.486      #2 0x7f93077eb66f in thread_poll /home/vagrant/spdk_repo/spdk/lib/thread/thread.c:631
00:10:53.486      #3 0x7f93077ede54 in spdk_thread_poll /home/vagrant/spdk_repo/spdk/lib/thread/thread.c:740
00:10:53.486      #4 0x7f93078366c3 in _reactor_run /home/vagrant/spdk_repo/spdk/lib/event/reactor.c:677
00:10:53.486      #5 0x7f9307836ec8 in reactor_run /home/vagrant/spdk_repo/spdk/lib/event/reactor.c:721
00:10:53.486      #6 0x7f9307837dfb in spdk_reactors_start /home/vagrant/spdk_repo/spdk/lib/event/reactor.c:838
00:10:53.486      #7 0x7f930782f1c4 in spdk_app_start /home/vagrant/spdk_repo/spdk/lib/event/app.c:580
00:10:53.486      #8 0x4024fa in main /home/vagrant/spdk_repo/spdk/app/nvmf_tgt/nvmf_main.c:75
00:10:53.486      #9 0x7f930716d1a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
00:10:53.486      #10 0x40228d in _start (/home/vagrant/spdk_repo/spdk/build/bin/nvmf_tgt+0x40228d)

Change-Id: I0968eabd1bcd532b8d69434ad5503204c0a2d92b
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6071
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
2021-01-26 08:32:39 +00:00
accel lib/accel: change max batch size to match idxd batch max 2020-11-18 11:27:23 +00:00
bdev bdev: add function to return aio's errno 2021-01-20 00:13:49 +00:00
blob blob: Make the ABI compatibility of spdk_blob_open_opts structure. 2020-12-29 07:55:22 +00:00
blobfs blob: Make the ABI compatibility for spdk_bs_opts 2020-12-29 07:55:22 +00:00
conf lib/conf: check pointer return value when use calloc 2020-11-11 01:02:31 +00:00
env_dpdk env/dpdk: Use the DPDK device count for IOMMU mapping 2021-01-22 18:32:53 +00:00
env_ocf lib/thead: print error log when create mempool or ring failed 2020-11-05 09:41:06 +00:00
event scheduler: Move busy thread if its mask do not match current lcore 2021-01-25 20:37:50 +00:00
ftl lib/ftl: add assert check for ftl_wptr_from_band 2020-11-17 08:25:31 +00:00
idxd lib/idxd: small code cleanup 2020-10-22 22:43:28 +00:00
ioat ioat: hide 2MiB boundary memory check in spdk_vtophys() 2020-11-25 17:15:13 +00:00
iscsi lib/iscsi: Support the Datain pdu sending in out of order case. 2021-01-07 13:36:39 +00:00
json json: add spdk_json_free_object() 2020-10-19 10:02:10 +00:00
jsonrpc lib/jsonrpc: Add a new API to send response for writing bool result. 2020-11-16 15:08:47 +00:00
log log: remove internal log.h header 2020-10-15 08:23:39 +00:00
lvol blob: Make the ABI compatibility of spdk_blob_open_opts structure. 2020-12-29 07:55:22 +00:00
nbd lib/nbd: Add the abort support 2021-01-25 08:14:49 +00:00
net lib/jsonrpc: Add a new API to send response for writing bool result. 2020-11-16 15:08:47 +00:00
notify log: remove internal log.h header 2020-10-15 08:23:39 +00:00
nvme nvme: add NO_SGL_FOR_DSM quirk for Intel P55XX SSDs 2021-01-22 08:16:53 +00:00
nvmf nvmf: Fix possible race condition when adding IO qpair 2021-01-26 08:32:39 +00:00
rdma rdma: Remove check for translation length 2021-01-18 13:02:20 +00:00
reduce log: remove internal log.h header 2020-10-15 08:23:39 +00:00
rocksdb build: use DEPDIRS variables to build SPDK_LIB_LIST 2020-12-18 09:40:01 +00:00
rpc RPC: update the error message for current RPC state 2020-07-31 08:21:37 +00:00
scsi lib: Use PRId64 for portability 2020-11-20 11:01:37 +00:00
sock lib/sock: Make spdk_sock_flush do real work if sock does not belong to a group. 2020-12-18 09:39:51 +00:00
thread lib/thread: Defer exiting thread if thread is unregistering io_device 2021-01-13 10:07:51 +00:00
trace trace: disable trace by set num-trace-entries=0 2020-11-26 10:16:26 +00:00
ut_mock mk/lib: add a check that major and minor version is set for libs. 2020-05-21 09:19:00 +00:00
util intr: allow operations on fd=0 2020-12-21 17:49:12 +00:00
vfio_user NVMe/vfio-user: add initial version vfio-user transport to NVMe driver 2021-01-21 05:00:18 +00:00
vhost vhost-blk: recover ring base when reconnect 2021-01-15 08:30:18 +00:00
virtio virtio: add transitional virtio device support 2020-11-20 11:00:53 +00:00
vmd lib: Use PRId64 for portability 2020-11-20 11:01:37 +00:00
Makefile NVMe/vfio-user: add initial version vfio-user transport to NVMe driver 2021-01-21 05:00:18 +00:00