Spdk/lib
Philipp Skadorov 4bfb557d80 nvmf/rdma: recover qp from fatal errors
Attempt to recover an RDMA QP after an IBV_EVENT_QP_FATAL event
is received from the IBV asynchronous event API.

The RDMA QP is put into the ERROR state and does not process any
inbound requests. Outstanding requests may only move to the COMPLETED
and FREE states; no outbound transfers are performed.

An IBV_EVENT_QP_LAST_WQE_REACHED or IBV_EVENT_SQ_DRAINED event is
expected to follow IBV_EVENT_QP_FATAL, giving the go-ahead to drain
all outstanding requests and free the associated resources.

Requests being executed by the block layer are allowed to complete
gracefully, but no outbound transfers are made.

Note that outstanding requests cannot be reliably completed by
polling the CQ, because WCs with a failure status might not have all
fields valid. The failed WCs are dropped and the outstanding
requests are fetched from the linked list of the appropriate state.
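
The note above can be illustrated with a minimal sketch, which is not the
code from this commit. It assumes each request sits on a per-state linked
list, so draining walks those lists instead of trusting fields from failed
WCs. All names here (req_state, qp_reqs, req_set_state, qp_drain) and the
EXECUTING/TRANSFERRING states are hypothetical; the commit message only
names the COMPLETED and FREE states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request states; only COMPLETED and FREE are named in the source. */
enum req_state { REQ_FREE, REQ_EXECUTING, REQ_TRANSFERRING, REQ_COMPLETED, REQ_NUM_STATES };

struct req {
    enum req_state state;
    struct req *next;
};

/* One singly linked list (and counter) per request state. */
struct qp_reqs {
    struct req *head[REQ_NUM_STATES];
    size_t count[REQ_NUM_STATES];
};

/* Unlink r from the list of its current state (if present) and push it
 * onto the list of the new state. */
void req_set_state(struct qp_reqs *q, struct req *r, enum req_state to)
{
    struct req **pp = &q->head[r->state];

    while (*pp != NULL && *pp != r) {
        pp = &(*pp)->next;
    }
    if (*pp == r) {
        *pp = r->next;
        q->count[r->state]--;
    }
    r->state = to;
    r->next = q->head[to];
    q->head[to] = r;
    q->count[to]++;
}

/* Drain a QP in ERROR state: every request that is not free and not owned
 * by the block layer goes to COMPLETED and then FREE, with no outbound
 * transfer. Requests are found via the state lists, never via failed WCs.
 * Returns the number of requests freed. */
size_t qp_drain(struct qp_reqs *q)
{
    size_t drained = 0;

    for (int s = 0; s < REQ_NUM_STATES; s++) {
        if (s == REQ_FREE || s == REQ_EXECUTING) {
            continue; /* executing requests complete on their own later */
        }
        while (q->head[s] != NULL) {
            struct req *r = q->head[s];

            if (s != REQ_COMPLETED) {
                req_set_state(q, r, REQ_COMPLETED);
            }
            req_set_state(q, r, REQ_FREE);
            drained++;
        }
    }
    return drained;
}
```

In this sketch a request still in the hypothetical EXECUTING state is left
alone by qp_drain, mirroring the statement that requests executed by the
block layer are allowed to complete gracefully.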

QP recovery is triggered when there are no more outstanding requests.
If QP recovery completes successfully, the RDMA QP is put back into
the ACTIVE state; otherwise, a QP disconnect is triggered.
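
The recovery flow described in this message can be sketched as a small
state machine. This is an illustration under assumptions, not the code in
lib/nvmf: the type and function names (qp_state, qp_event, qp_handle_event,
qp_request_freed, recover_ok, recover_fail) are hypothetical, EV_DRAINED
stands in for either IBV_EVENT_QP_LAST_WQE_REACHED or IBV_EVENT_SQ_DRAINED,
and gating recovery on a drain_signalled flag is one possible reading of
"giving the go-ahead".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical QP states; the source names ERROR and ACTIVE. */
enum qp_state { QP_ACTIVE, QP_ERROR, QP_DISCONNECTING };

/* EV_DRAINED models LAST_WQE_REACHED or SQ_DRAINED following a fatal event. */
enum qp_event { EV_QP_FATAL, EV_DRAINED };

struct qp {
    enum qp_state state;
    size_t outstanding;    /* requests not yet back in the FREE state */
    bool drain_signalled;  /* assumption: set once the drain event arrives */
};

typedef bool (*recover_fn)(struct qp *qp);

/* Stub recovery routines for illustration only. */
bool recover_ok(struct qp *qp)   { (void)qp; return true; }
bool recover_fail(struct qp *qp) { (void)qp; return false; }

/* Inbound requests are processed only while the QP is ACTIVE. */
bool qp_accepts_inbound(const struct qp *qp)
{
    return qp->state == QP_ACTIVE;
}

/* Recovery runs only when the QP is in ERROR, the drain event has been
 * seen, and no requests remain outstanding. */
void qp_try_recover(struct qp *qp, recover_fn recover)
{
    if (qp->state != QP_ERROR || !qp->drain_signalled || qp->outstanding != 0) {
        return;
    }
    qp->drain_signalled = false;
    qp->state = recover(qp) ? QP_ACTIVE : QP_DISCONNECTING;
}

void qp_handle_event(struct qp *qp, enum qp_event ev, recover_fn recover)
{
    switch (ev) {
    case EV_QP_FATAL:
        if (qp->state == QP_ACTIVE) {
            qp->state = QP_ERROR;
        }
        break;
    case EV_DRAINED:
        if (qp->state == QP_ERROR) {
            qp->drain_signalled = true;
            qp_try_recover(qp, recover);
        }
        break;
    }
}

/* Called whenever an outstanding request reaches the FREE state. */
void qp_request_freed(struct qp *qp, recover_fn recover)
{
    assert(qp->outstanding > 0);
    qp->outstanding--;
    qp_try_recover(qp, recover);
}
```

With two requests outstanding, a fatal event followed by the drain event
leaves the QP in ERROR until both requests are freed; only then does the
sketch attempt recovery and move to ACTIVE, or to DISCONNECTING on failure.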

Change-Id: I45ee7feea067f80ccc6402518990014d691fbda3
Signed-off-by: Philipp Skadorov <philipp.skadorov@wdc.com>
Reviewed-on: https://review.gerrithub.io/416879
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2018-07-18 20:58:50 +00:00
bdev bdev: change return type of _spdk_bdev_enable_qos to void 2018-07-18 03:46:12 +00:00
blob blobstore: no copy write in thin-provisioning 2018-07-17 17:44:13 +00:00
blobfs thread: Replace #include of io_channel.h with thread.h 2018-06-12 15:24:07 +00:00
conf conf: don't strdup name if section already exist 2018-05-17 17:13:19 +00:00
copy copy/ioat: Add scan_ioat_copy_engine RPC 2018-06-14 03:54:42 +00:00
env_dpdk env/app: add unlink hugepages option to app 2018-07-17 07:06:53 +00:00
event env/app: add unlink hugepages option to app 2018-07-17 07:06:53 +00:00
ioat ioat: fix typo on IOAT_DEFAULT_ORDER comment 2018-07-05 16:24:56 +00:00
iscsi iscsi: Support hot removal of LUN based on LUN open/close 2018-07-17 17:43:28 +00:00
json json: Add spdk_json_decode_uint16 2018-06-05 21:30:02 +00:00
jsonrpc jsonrpc: fix closed connection hadling 2018-06-08 18:11:18 +00:00
log util: Remove usage of abort from library code 2018-07-17 17:40:11 +00:00
lvol blobstore: add decouple parent function 2018-06-21 22:50:03 +00:00
nbd thread: Replace #include of io_channel.h with thread.h 2018-06-12 15:24:07 +00:00
net net: split sock abstraction into lib/sock 2018-06-22 17:09:57 +00:00
nvme nvme: show command manual completion 2018-07-16 08:23:19 +00:00
nvmf nvmf/rdma: recover qp from fatal errors 2018-07-18 20:58:50 +00:00
rocksdb thread: Replace #include of io_channel.h with thread.h 2018-06-12 15:24:07 +00:00
rpc rpc: Add option to get_rpc_methods RPC to output only currently usable RPCs 2018-05-04 17:45:48 +00:00
scsi iscsi: Support hot removal of LUN based on LUN open/close 2018-07-17 17:43:28 +00:00
sock net: split sock abstraction into lib/sock 2018-06-22 17:09:57 +00:00
thread util: Remove usage of abort from library code 2018-07-17 17:40:11 +00:00
trace app,lib: fix checking mmap return value 2018-03-30 16:18:34 -04:00
ut_mock test/mock: add pthread_self 2017-09-19 17:15:15 -04:00
util thread: Move threading abstraction code out of util 2018-06-12 15:24:07 +00:00
vhost vhost: add socket path in info dump 2018-07-12 23:54:34 +00:00
virtio virtio: fix vq init error handling 2018-07-11 21:02:06 +00:00
Makefile test: remove spdk_cunit library 2018-07-06 18:35:03 +00:00