nvme: Handle CQ polling failures by marking the controller as failed.
nvme_transport_qpair_process_completions calls nvme_rdma_qpair_process_completions, which in some cases returns -1 because of CQ errors.

Handle CQ polling failures by marking the controller as failed, so that a completion with an error is treated as a controller failure. Requests will be aborted once the retry counter is exceeded. Otherwise, the code keeps reporting errors without ever recovering.

This fixes issue #850.

Change-Id: I0b324232310e107bf7fd5722aca54d402a19b14d
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460569
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
parent 16fdf46600
commit ff0a7dfc42
@@ -449,6 +449,10 @@ spdk_nvme_qpair_process_completions(struct spdk_nvme_qpair *qpair, uint32_t max_
 
 	qpair->in_completion_context = 1;
 	ret = nvme_transport_qpair_process_completions(qpair, max_completions);
+	if (ret < 0) {
+		SPDK_ERRLOG("CQ error, abort requests after transport retry counter exceeded\n");
+		qpair->ctrlr->is_failed = true;
+	}
 	qpair->in_completion_context = 0;
 	if (qpair->delete_after_completion_context) {
 		/*
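To illustrate the control flow this change introduces, here is a minimal, self-contained sketch that does not depend on SPDK. The `mock_ctrlr` and `mock_qpair` types, the `transport_rc` field, and both `mock_*` functions are hypothetical stand-ins invented for this example; they are not SPDK APIs. Returning `ret` to the caller is also an assumption, since the diff does not show the end of the function.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the SPDK structures; only the fields used here. */
struct mock_ctrlr {
	bool is_failed;
};

struct mock_qpair {
	struct mock_ctrlr *ctrlr;
	int in_completion_context;
	int32_t transport_rc;	/* value the fake transport poll should return */
};

/* Fake transport poll: returns a completion count, or -1 to model a CQ error. */
static int32_t
mock_transport_process_completions(struct mock_qpair *qpair, uint32_t max_completions)
{
	(void)max_completions;
	return qpair->transport_rc;
}

/* Mirrors the patched flow: a negative result marks the controller as failed. */
static int32_t
mock_qpair_process_completions(struct mock_qpair *qpair, uint32_t max_completions)
{
	int32_t ret;

	qpair->in_completion_context = 1;
	ret = mock_transport_process_completions(qpair, max_completions);
	if (ret < 0) {
		fprintf(stderr, "CQ error, marking controller as failed\n");
		qpair->ctrlr->is_failed = true;
	}
	qpair->in_completion_context = 0;

	/* Assumption: the result is passed back to the caller unchanged. */
	return ret;
}
```

In this sketch, a non-negative `transport_rc` leaves `is_failed` untouched, while a value of -1 sets it. The flag is sticky: nothing in this model clears it, so recovery would have to happen elsewhere.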