Commit Graph

1727 Commits

Author SHA1 Message Date
Jim Harris
d1179a5801 nvme: put req in nvme_tcp_req_complete
All callers of nvme_tcp_req_complete call
nvme_tcp_req_put immediately afterwards, so move
this call into nvme_tcp_req_complete.

This will help enable some improvements in later
patches.

Note that nvme_tcp_req_complete_safe has this same
functionality open coded right now, but that will
get changed in the next patch.  It calls
nvme_tcp_req_put immediately after the TAILQ_REMOVE,
so do that in nvme_tcp_req_complete as well.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I368122bc49a7f0772e3011e5427e3c43618380eb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13520
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
2022-07-04 07:23:13 +00:00
Shuhei Matsumoto
4be6d30438 nvme: Add ctrlr_abort_queued_aborts() into qpair_abort_all_queued_reqs()
nvme_qpair_abort_all_queued_reqs() aborts error injections, queued
requests, aborting queued requests, and outstanding requests. (Aborting
outstanding requests depends on transports.) However, it did not abort
queued aborts.

Include nvme_ctrlr_abort_queued_aborts() into
nvme_qpair_abort_all_queued_reqs() to do really the name of the
function indicates.

nvme_ctrlr_abort_queued_aborts() has been called in a few cases, but
we do not care duplication.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I19102cc6603a72ce5c398a7947cb4d606b692991
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12849
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
2022-06-30 07:51:23 +00:00
Ben Walker
8dd1cd2104 check_format: For C files only, fix return type breaks
In SPDK, declarations have the return type on the same line. Definitions
have the return type on a separate line. Astyle has an option for
enforcing this. Unfortunately, it seems to have two bugs:

1) It doesn't work correctly at all on C++ files.
2) It often fails on functions that return enums, or long type names

Deal with 1) by adjusting the check_format.sh script to only tell astyle
to fix return type line breaks for C files and not C++. Deal with 2) by
adding a few typedefs to work around the problem.

Change-Id: Idf28281466cab8411ce252d5f02ab384166790c6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13437
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
2022-06-27 09:33:48 +00:00
Shuhei Matsumoto
ceaa4ee0f7 nvme: Increment ctrlr->outstanding_aborts when aborting req in ctrlr->queued_aborts
We had not incremented ctrlr->outstanding_aborts when aborting a
request in the ctrlr->queued_aborts, and ctrlr->outstanding_aborts
became negative. Fix the bug in this patch. Additionally add assert
to check if ctrlr->outstanding_aborts is not negative.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I58090286f070ba854bdea87f0f8ecb7810890338
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13452
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-06-24 07:22:36 +00:00
Sebastian Brzezinka
14ecc7787d nvme: Complete pending register operations first
Fully asynchronous ctrlr detach (b6ecc3729) introduce a register
operation state machine that waits for operation to complete. When
controller failed to initialize, `nvme_ctrlr_fail` set qpair state to
`DISCONNECTED` immediately, causing qpair process completions to
never complete register operations therefore prevent async detach exit.

Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I205c5157b8ea7b4535f98ff4052414310e421446
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12858
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-06-20 10:00:17 +00:00
Richael Zhuang
4295661eb8 nvme_tcp: fix bug about qpair stuck in CONNECTING state
When running perf test, sometimes after CONNECT req's resp was
received and processed, the qpair still failed to change from state
CONNECTING to CONNECTED. For when it goes to nvme_fabric_qpair_connect_poll
-> nvme_wait_for_completion_robust_lock_timeout_poll to process the
CONNECT req's resp, the req may have not been finished in sock_check_zcopy,
although its resp has been received and processed, which means the
tcp_req->ordering.bits.send_ack is still 0 and the status->done still
is false. And after the req is completed in sock_check_zcopy, we need
to poll this qpair again to make the state enter CONNECTED.

And if icreq's resp received and processed before nvme_tcp_send_icreq_complete
is called by _sock_check_zcopy, the qpair will be stuck in CONNECTING
and it never proceed to send the CONNECT req. We also need to put it
in pgroup->needs_poll to fix it.

I can reproduce this bug with the following configuration.
target: 16NVMe SSD, running on 20 cores;
initiator: randread test using nvme perf with 32 cpu cores and
zerocopy enabled.

The error doesn't always occur. CONNECT failure is about 1 failure in
ten with the following log. And icreq failure is less frequent with
only target side's "keep alive timeout" log.

Error reported in initiator side:
Initialization complete. Launching workers.
[2022-05-23 14:51:07.286794] nvme_qpair.c: 760:spdk_nvme_qpair_process_completions:
*ERROR*: CQ transport error -6 (No such device or address) on qpair id 2
ERROR: unable to connect I/O qpair.
ERROR: init_ns_worker_ctx() failed

And target side shows:
Disconnecting host  from subsystem nqn.2016-06.io.spdk:cnode2 due to keep alive timeout

Change-Id: Id72c2ffd615ab73c5fc67d36c3ff8b730cebcef7
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12975
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-06-14 09:18:04 +00:00
Jim Harris
488570ebd4 Replace most BSD 3-clause license text with SPDX identifier.
Many open source projects have moved to using SPDX identifiers
to specify license information, reducing the amount of
boilerplate code in every source file.  This patch replaces
the bulk of SPDK .c, .cpp and Makefiles with the BSD-3-Clause
identifier.

Almost all of these files share the exact same license text,
and this patch only modifies the files that contain the
most common license text.  There can be slight variations
because the third clause contains company names - most say
"Intel Corporation", but there are instances for Nvidia,
Samsung, Eideticom and even "the copyright holder".

Used a bash script to automate replacement of the license text
with SPDX identifier which is checked into scripts/spdx.sh.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaa88ab5e92ea471691dc298cfe41ebfb5d169780
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12904
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: <qun.wan@intel.com>
2022-06-09 07:35:12 +00:00
Heinrich Schuchardt
72b5626d33 nvme/pcie: memory barrier for RISC-V
Play it safe and add the same memory barrier in
nvme_pcie_qpair_process_completions() as for ppc64.

Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Change-Id: I7079b4769d30106387ef4549495a72b7fea6a77a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12879
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-06-06 07:34:27 +00:00
MengjinWu
bb33310aa0 nvmf: remove XOR in nvme_tcp_pdu_calc_data_digest
Prepare for the later patch, and make the later patch code clean

Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I12b175c86a5245f38dc76fe2d3918ec4b30a475a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12830
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
2022-06-02 08:16:38 +00:00
Konrad Sztyber
1f3bd08fa0 nvme/tcp: check tcp_req for NULL in pdu_payload_handle
For a C2HTermReq PDU, there's no associated tcp_req, so we need to check
it for NULL before dereferencing it.

Also, while here, moved some of the assignments to the declarations to
reduce the number of boilerplate lines.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iac05ef0ba605e2f40d0026ad1b131c28d29f7314
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12845
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-06-01 08:56:58 +00:00
Jim Harris
64df311eba nvme: add KEYED_DATA_BLOCK to sgl_types
This SGL type was missed in the original commit
that added the pretty printing.

Fixes: 4d9ab1e9a1 ("nvme: pretty print dptr")

Reported-by: Ramanjaneya Burugula <burugula@gmail.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibc655db4e65009071f39f55f691c94a094cea0bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12705
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-05-25 07:43:03 +00:00
Or Gerlitz
9b5dabff7f nvme/rdma: Always use spdk allocation scheme
Use the conventional huge-pages based spdk allocation scheme for the initiator
data-structures unconditionally.

Change-Id: I5baee7614e3ac9b5497b3d771dfddfbaa7fdf65b
Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12687
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-05-25 07:42:47 +00:00
Shuhei Matsumoto
51e897c42e nvme: Abort queued requests even if they are children of a large I/O
A iterator function nvme_request_add_abort() covers not only a small
I/O request but also children of a large I/O.

However nvme_qpair_abort_queued_reqs_with_cbarg() did not check the
latter. check if cmd_cb_arg matches not only req->cb_arg but also
req->parent_cb_arg.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I015e29b0a8f58920b9a13081330a94f9dd976a45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12557
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-05-20 09:19:07 +00:00
Shuhei Matsumoto
09c7c76876 nvme: Set I/O qpairs to failed only if reset is synchronous
For PCIe transport, we need to stop any activity of the controller
before deleting I/O qpair resource in a controller reset sequence.

However, we set I/O qpairs to failed before disabling a controller.
In the NVMe bdev module, this caused disconnected qpair callback to
delete I/O qpairs before disabling the controller.

Hence, change the code slightly to set I/O qpairs to failed only if
reset is synchronous to keep backward compatibility.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ica71aad0a1dabce45616dfdfff5f11b07131bbd1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12736
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-05-20 09:17:28 +00:00
Shuhei Matsumoto
64454afb7c nvme: disconnect() sets and reconnect_async() clears prepare_for_reset
The following patches swaps the ordering of destrloying I/O qpairs
and disconnecting a controller for PCIe transport.

prepare_for_reset is a flag for PCIe transport.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3009de9fea089fc93ecf87adba42e85c9a77e715
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12582
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-05-19 08:23:57 +00:00
Shuhei Matsumoto
736b9da034 nvme: Do Controller Level Reset when disconnecting adminq for PCIe
As described in the previous patches, we need to delete all I/O
SQ/CQs before aborting trackers when disconnecting a controller.

The following patches reorder the operations. This patch changes
adminq disconnection to initiate a Controller Level Reset and
adminq completion processes it if ctrlr->is_disconnecting is true.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I64f06bae2ce8a9127124029fd042db0028198e3c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12560
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-05-19 08:23:57 +00:00
Ben Walker
813756e75e nvme: Do not abort transport commands when disconnecting a qpair
Make this a transport-level decision instead. TCP and RDMA do want to
abort, but PCIe cannot because these commands may still be receiving DMA
operations from the device.

Change-Id: I305acddc3819c903eb3217e8f710d4216d0b3931
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11509
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
2022-05-19 08:23:57 +00:00
Shuhei Matsumoto
bdc9fa832d nvme: Add helper functions to do a Controller Level Reset (Set CC.EN to 0)
Previously, we did not do any Controller Level Reset when disconnecting
the admin qpair.

However, for PCIe transport, we need to stop any activity of the
controller, i.e., delete all I/O SQ and CQs before
nvme_transport_ctrlr_disconnect_qpair_done() calls
nvme_transport_qpair_abort_reqs() (i.e., nvme_pcie_qpair_abort_trackers()).
Otherwise, some corruption may occur because completed I/Os may still be
in progress on the NVMe device.

Not to change any public API, nvme_pcie_ctrlr_disconnect_qpair() is a
convenient place to initiate a Controller Level Reset because it is
called from spdk_nvme_ctrlr_disconnect(). Then
nvme_pcie_qpair_process_completions() can process it until completion.

However, necessary functions are not accessible from PCIe transport.

This patch adds two helper functions and guards us from some undesirable
behaviors because it was not assumed that nvme_ctrlr_process_init() is
called from the completion context and ends in the middle of transition.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3d986e94ba71b83beeff7e75cf92033b5fa6f075
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12559
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-05-19 08:23:57 +00:00
Alexey Marchuk
622ceb7f07 nvme/rdma: Use rdma qpair as cm_id context
It simplifies code and removes cast of nvme_qpair
to rdma_qpair

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I363246cf9d8c9cbafd48b26facdb5cc37fdd8e67
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12701
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-05-18 00:34:29 +00:00
Alexey Marchuk
1003e28623 nvme/rdma: Fix qpair destroy/disconnect race
When qpair is attached to a poll group, disconnect
process is async - we are waiting for the DISCONNECTED
event from rdmacm to destroy rdma resources. However
the user (nvme_perf) can destroy qpair immediatelly,
so memory allocated for qpair is freed but rdma
resouces are still allocated. That means that we may
receive rdmacm event (DISCONNECTED) for the destroyed qpair,
that leads to use-after-free.
To fix this problem, add a check for internal qpair state
when qpair is destroyed, if disconnect is not finished, then
we forcefully destroy rdma resources.

Fixes issue #2515

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reported-by: Or Gerlitz <ogerlitz@nvidia.com>
Change-Id: I7bfa53c9f6fe6ed787323a8941f1f2db17ea0c20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12700
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-05-18 00:34:29 +00:00
Alexey Marchuk
007fb1d3cb nvme: Fix keyed/unkeyd SGL nvme cmd dump
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0a08518b5c30455a17158aa440715515d0c066fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12133
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-05-17 20:11:43 +00:00
Shuhei Matsumoto
5e5423de93 nvme: Add DISABLED to ctrlr's state to show completion of Controller Level Reset
In the following patches, nvme_ctrlr_process_init() will be used to
disable the controller when disconnecting the admin qpair for PCIe
transport. In this case, we will have to exit nvme_ctrlr_process_init()
after CSTS.RDY is 0. However, spdk_nvme_ctrlr_reset() and
spdk_nvme_ctrlr_reconnect_poll_async() have to continue
nvme_ctrlr_process_init() until the controller becomes ready.

To differentiate stop and continue clearly, add a new state
NVME_CTRLR_STATE_DISABLED to enum nvme_ctrlr_state.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic0a5fb7114d4eeb1cefec28bc404184768fb0a96
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12613
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-05-12 07:28:02 +00:00
Changpeng Liu
4e241cba01 nvme/quirks: don't use SGL for Huawei SSDs
We see reports that Huawei SSDs can't handle hardware
SGL properly, it requires additional alignment, so add
a quirk here to force Huawei SSDs use PRP instead.

Fix #2489.

Change-Id: I20a57e754bc6ff8666d681191994818f2192decc
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12405
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: wanghailiang <hailiangx.e.wang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-05-02 20:00:35 +00:00
Alex Michon
f89cf818c0 nvme/pcie: Fix doorbell delay with fuse operations
When sending the first part of a fuse command, we set the
first_fused_submitted flag so that we don't ring the doorbell
immediately. When the second part is sent, we ring the doorbell for
both commands.
However, this doesn't work well when we use the option to delay ringing
the doorbell. We send both parts, then later when we try to ring the
doorbell, we don't because of the first_fused_submitted flag from the
first command.
Replace this mechanism by keeping track of the last submitted fuse.

Change-Id: Ia4ac9b3ce9c319ee4c7e42f86eadda93dac85fca
Signed-off-by: Alex Michon <amichon@kalrayinc.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12182
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-04-27 07:36:20 +00:00
Alexey Marchuk
b0f4249c59 nvme/rdma: Add async set/get registers
Now controller initialization with RDMA
transport is fully async

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I26e857740d3137d0b0e987facc81fc5f6ef81f2b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10756
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-04-22 09:44:57 +00:00
Shuhei Matsumoto
dbe7e74cee nvme: Change nvme_qpair_abort_queued_reqs() to set SC_ABORTED_SQ_DELETION
Transport specific qpair_abort_reqs() set SC to SC_ABORTED_SQ_DELETION.
However, nvme_qpair_abort_queued_reqs() set SC to SC_ABORTED_BY_REQUEST
even if its call is not requested by the upper layer.

Change nvme_qpair_abort_queued_reqs() to set SC to SC_ABORTED_SQ_DELETION
for consistency.

nvme_qpair_abort_queued_reqs() is used to abort queued requests that
were sent while adminq was connecting. SC_ABORTED_SQ_DELETION will not
be so bad even for the case.

This change is required for the NVMe bdev module to be resilient for I/O
error. The NVMe bdev module does not retry I/O if SC is
SC_ABORTED_BY_REQUEST.

SC is set to SC_INTERNAL_DEVICE_ERROR if a request is failed to submit
to qpair by a generic qpair layer. We can change it to
SC_ABORTED_SQ_DELETION as well but we keep this for now.
SC_INTERNAL_DEVICE_ERROR is also retriable for the NVMe bdev module.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I7d8d5e97b222fe9275afc4fed024c1654c9579a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12121
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-04-22 09:44:57 +00:00
zhangduan
31db7b139b nvme_tcp: set transport_ack_timeout to ack_timeout
The value of ack_timeout is calculated according to
the formula 2^(transport_ack_timeout) msec.

Signed-off-by: zhangduan <zhangd28@chinatelecom.cn>
Change-Id: I5a938635d70693ddd405fa5907555bb745b4df0f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12215
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-04-20 08:21:42 +00:00
Konrad Sztyber
aa21240574 nvme/pcie: increase min admin queue size to 256
Now that IO qpairs can be created asynchronously, we need to make sure
that all the create IO CQ/SQ commands can be executed simultaneously.
It is pretty common to create multiple IO qpairs at the same time, e.g.
adding an NVMe bdev to an nvmf subsystem will create an IO qpair on each
poll group.  In that case, if the number of cores exceed the size of the
admin queue (actually it can be even lower due to outstanding AERs), we
might run out nvme_requests on the admin queue.

The chosen minimum value for the admin queue size, 256, should be enough
to cover most cases.

Fixes #2465

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I55c59aef64f3fdb33f7b4824d3e9beb403602633
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12270
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-04-19 08:18:34 +00:00
Shuhei Matsumoto
2c13441ba8 nvme_rdma: Destroy qpair after qpair is actually disconnected
The RDMA transport can disconnect qpair asynchronously now.

Previously, we tried to release the resource of the qpair after disconnected.
However it did not work because it was done when deleting the qpair.
The admin qpair was not deleted in a ctrlr reset sequence.

This patch tries to satisfy the same aim again but by a different way.

Previously, we released the resource of the qpair before starting
actual disconnection process. This patch release the resource of the
qpair after the qpair is actually disconnected.

The related patches are:
b9518a5540
eb09178a59

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id6a814895a35b1589b781a91744ef872b42aaa69
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11783
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-18 18:35:29 +00:00
Shuhei Matsumoto
4b73223542 nvme_rdma: Wait until lingering qpair becomes quiet before completing disconnection
The code to handle the lingering qpair when deleting it was really
complicated.

The RDMA transport can connect or disconnect qpair asynchronously.

Then we can include the code to handle the lingering qpair into the
code to disconnect qpair now.

If the disconnected qpair is still busy, defer completion of the
disconnection until qpair becomes idle.

If poll group is not used, we can complete disconnection immediately
because cq is already destroyed.

The related data and unit test cases are not necessary anymore.
So delete them in this patch.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic8f81143fcad0714ac9b7db862313aa8094eeefb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-18 18:35:29 +00:00
Shuhei Matsumoto
20cf90801e nvme_rdma: Handle stale connection asynchronously
Include delayed disconnect/connect retries with finite times into
the state machine of asynchronous qpair connnection.

We do not need to call back to the common transport layer but
we need to do the following, clear rqpair->cq before starting disconnection
if qpair uses poll group, and clear qpair->transport_failure_reason after
disconnected.

Additionally locate the new state STALE_CONN before INITIALIZING
because cq is not ready to use for admin qpair when the state is
STALE_CONN.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibc779a2b772be9506ffd8226d5f64d6d12102ff2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11690
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-04-18 18:35:29 +00:00
Shuhei Matsumoto
77c4657140 nvme_rdma: Factor out destroying rdma qpair operation
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I18e166a726cca69f13e7c5818eba57f478726286
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11689
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
2022-04-18 18:35:29 +00:00
Shuhei Matsumoto
aa36c18196 nvme_rdma: Pass callback to ctrlr_disconnect_qpair() via a parameter
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I06cbb9739286d1928ad9fc07de3715a449914d75
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11688
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-18 18:35:29 +00:00
Shuhei Matsumoto
75d38a301d nvme: poll_group_process_completions() returns -ENXIO if any qpair failed
TCP transport already does it but was not documented clearly.

RDMA and PCIe transports follow it and document it clearly.

Then we can check each qpair's state if
spdk_nvme_poll_group_process_completions() returns -ENXIO before
disconnected_qpair_cb() is called.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I2afe920cfd06c374251fccc1c205948fb498dd33
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11328
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-18 18:35:29 +00:00
Shuhei Matsumoto
9717b0c3df nvme_rdma: Connect and disconnect qpair asynchronously
Add three states, INITIALIZING, EXITING, and EXITED to the rqpair
state.

Add async parameter to nvme_rdma_ctrlr_create_qpair() and set it
to opts->async_mode for I/O qpair and true for admin qpair.

Replace all nvme_rdma_process_event() calls by
nvme_rdma_process_event_start() calls.

nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to INITIALIZING
when starting to process CM events.

nvme_rdma_ctrlr_connect_qpair_poll() calls
nvme_rdma_process_event_poll() with ctrlr->ctrlr_lock if qpair is
not admin qpair.

nvme_rdma_ctrlr_disconnect_qpair() returns if qpair->async is true
or qpair->poll_group is not NULL before polling CM events, or polls
CM events until completion otherwise. Add comments to clarify why
we do like this.

nvme_rdma_poll_group_process_completions() does not process submission
for any qpair which is still connecting.

Change-Id: Ie04c3408785124f2919eaaba7b2bd68f8da452c9
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11442
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-18 18:35:29 +00:00
Changpeng Liu
c47b7b0276 nvme/vfio-user: use API to setup BAR0 doorbells
We can use lib/vfio-user API to setup BAR0 doorbells,
existing implementation is redundant.

Change-Id: Ib880d167c84c6b8482bf1a35559a34c939f6a02d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12211
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-04-12 07:24:22 +00:00
Tomasz Zawadzki
6301f8915d lib/sock: provide a hint to picking optimal poll group
The process of matching qpair to poll group is split into
two distinct parts that occur on different threads.
See spdk_nvmf_tgt_new_qpair().

This results in a race condition for TCP between spdk_sock_map_lookup()
and spdk_sock_map_insert(), which are called in spdk_nvmf_get_optimal_poll_group()
and spdk_nvmf_poll_group_add() respectively.

Fixes #2113

This patch picks a hint from nvmf_tcp for next poll group,
which is then passed down to spdk_sock_map_lookup().

When matching placement_id exists, but does not have
a poll group assigned - the hint will be used.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I4abde2bc9c39225c9f5dd7c3654fa2639bb0a27f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10271
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2022-04-01 12:41:26 +00:00
Shuhei Matsumoto
0a61427ecc nvme_rdma: Start qpair after resolving address and route when poll group is used
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0b0f314c98368247582f2dfcaf69f78e24d715f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11366
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-04-01 08:28:45 +00:00
Shuhei Matsumoto
531c1b0f04 nvme_rdma: Make nvme_rdma_process_event() asynchronous
Separate nvme_rdma_process_event() into nvme_rdma_process_event_start()
and nvme_rdma_process_event_poll().

Use nvme_rdma_process_event_start() and nvme_rdma_process_event_poll()
in nvme_rdma_process_event() to ensure compatibility.

Change-Id: Idc960fab2540efec612dcf22f156acabd2e2874e
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10594
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-04-01 08:28:45 +00:00
Shuhei Matsumoto
791ee7deb4 nvme_rdma: nvme_rdma_process_events() returns negated errno
It will be convenient for the following patches to return
negated errno directly.

Change-Id: Ic80181b2ee449946dd60ad0c97a325fd48b92231
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10990
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-01 08:28:45 +00:00
Shuhei Matsumoto
cf7f253302 nvme_rdma: Add callback to nvme_rdma_process_event()
Change-Id: I66aa89dc54d5aaedbe2f06239cbf04aeeb2c739e
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11359
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-04-01 08:28:45 +00:00
Shuhei Matsumoto
bcf0845727 nvme_rdma: Make CM event operations callback functions
Change-Id: I9f2551a07187400dd9ef624348cd465e64557e1b
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11138
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-04-01 08:28:45 +00:00
Shuhei Matsumoto
e5927c02e9 nvme_rdma: Remove cm_channel param from process_event()
nvme_rdma_poll_events() gets the cm_channel pointer itself.
Before calling nvme_rdma_process_event(), we checks the
rctrlr is valid.

Hence we do not have to pass the cm_channel pointer to
nvme_rdma_process_event() via a parameter.

This simplifies the code and makes the following patches a little easier.

Change-Id: I03f095833469c5b64592264d63a592106d49e13b
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11167
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-04-01 08:28:45 +00:00
Shuhei Matsumoto
29974dc882 nvme_rdma: Make fabric_qpair_connect() asynchronous
Replace nvme_fabric_qpair_connect() by nvme_fabric_qpair_connect_async()
and nvme_fabric_qpair_connect_poll().

The following is a detail.

Define state of the nvme_rdma_qpair and each rqpair holds it.

Initialize rqpair->state by INVALID at nvme_rdma_ctrlr_create_qpair().

_nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to
FABRIC_CONNECT_SEND instead of calling nvme_fabric_qpair_connect().

Then the new function nvme_rdma_ctrlr_connect_qpair_poll() calls
nvme_fabric_qpair_connect_async() at FABRIC_CONNECT_SEND and
nvme_fabric_qpair_connect_poll() until it returns 0 at FABRIC_CONNECT_POLL.

nvme_rdma_qpair_process_completions() or
nvme_rdma_poll_group_process_completions() calls
nvme_rdma_ctrlr_connect_qpair_poll() if qpair->state is CONECTING.

This patter follows the TCP transport.

Change-Id: I411f4fa8071cb5ea27581f3820eba9b02c731e4c
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11334
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-01 08:28:45 +00:00
Evgeniy Kochetov
a2d4ddb3b1 nvme: Prioritize user provided trstring for transport lookup
This patch fixes the issue with custom nvme transport. It is possible
to register custom nvme transport with arbitrary name but it is not
usable because 'spdk_nvme_trid_populate_transport' call in probe
function will always set trstring to 'CUSTOM' and transport lookup
will fail.

Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I83fd24dd8732ac0a21e22435e0acff20ab0e7521
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9557
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-03-31 10:31:20 +00:00
Jim Harris
7039639063 nvme: read full discovery page after reading header
Some targets report they support log page offset,
but then fail GET_LOG_PAGE commands that specify
a non-zero offset, or report the wrong number
of discovery entries when reading more than the
discovery log page header but not the entire log
page.

So just revert to reading the entire discovery log
page, after we've read the header to know how big the
log page will be. This means that when we read the
log page initially (without the individual entries),
we need to save off the genctr, since it will get
overwritten when we read the log page again.  We can
just store this in the discovery context, and compare
it to the genctr that we read with the whole log page.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I34929253312fed9924db58904a051f3979283730
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11478
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-03-28 17:10:04 +00:00
Alexey Marchuk
94494579ce nvme_rdma: Update reportring of RDMA responder resources
responder_resources parameter of rdma cm tells remote
side how many outstaing RDMA_READ of atomic operations
local side can handle.

Previously it was adjusted on queue depth but that was
not correct since these parameters do not depend on
each other. Even with qdepth=1 remote side may send
several RDMA_READ operations per 1 IO request.

With this change we report responder_resources
equal to the maximum supported by RDMA device.
Linux kernel nvme rdma driver reports this value
in the same way.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I77e5c2ead6269da44c32a75a9188429f50d32ae4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11698
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-03-25 08:18:37 +00:00
Krzysztof Karas
a9a55513e5 nvme_ctrlr.c: Add error logs
Add NVME_CTRLR_ERRLOGs to nvme_ctrlr_process_init().
The main goal is to help with debugging #2201 issue.

Change-Id: I1ae6a9b30d6124dfe25eb7912402c37d476b0d4c
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10627
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-03-24 09:57:17 +00:00
Shuhei Matsumoto
6a89f75ec7 nvme_rdma: Remove handling stale connect
The feature will be redesigned and restored in the following patches.
For the NVMe bdev module, it can reconnect by itself without relying
on the feature.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I2d9c0437f7ad8412ad8cf40d11e574723b735bee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11440
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-03-21 10:49:11 +00:00
Shuhei Matsumoto
0c77cf90bf nvme_rdma: Consolidate fail_qpair() calls into a single place
For nvme_rdma_qpair_process_completions(), consolidate the operations to
call nvme_rdma_fail_qpair() and return -ENXIO into a single place.

Besides, shorten pointer references for
nvme_rdma_qpair_process_completions() and
nvme_rdma_poll_group_process_completions().

These will make the following patches a little easier.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iaf72cfca0b5b3ba223d86e267da8069d43a15292
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11439
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-03-21 10:49:11 +00:00
Shuhei Matsumoto
df7c2a2253 nvme: Call ctrlr_disconnect_done() after qpair_process_comletions() returns -ENXIO
Add a new flag is_disconnecting to struct spdk_nvme_ctrlr.

Separate calling nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done()
by using the flag is_disconnecting.

Additionally, change nvme_ctrlr_fail() to skip setting ctrlr->is_failed
to true if ctrlr->is_disconnecting is true.

Change-Id: Ie2c74ba41f120662a30f6198751d07005d23abcf
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11000
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-03-21 10:49:11 +00:00
Shuhei Matsumoto
8926303b59 nvme: Caller polls qpair until disconnected if async connect failed
nvme_transport_ctrlr_connect_qpair() calls
nvme_transport_ctrlr_disconnect_qpair() if failed.

If async qpair disconnect is supported, even when connect qpair failed,
nvme_transport_ctrlr_connect_qpair() may complete asynchronously later.

The cases that qpair->async is set to true are I/O qpair for the NVMe
bdev module and admin qpair.

example/nvme/perf and example/nvme/reconnect use I/O qpair but both
set qpair->async to false.

For the NVMe bdev module, I/O qpair is connected when creating I/O
channel or resetting ctrlr. If spdk_nvme_ctrlr_connect_io_qpair()
returns 0 for a I/O qpair, the qpair is in a poll group and is polled
by spdk_nvme_poll_group_process_completions() and a disconnected
callback is called to the qpair. Hence we do not need to add additional
polling for I/O qpair in the NVMe bdev module.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6e0aadcfd98e5cb77b362ef1a79e0eca2985f36e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11112
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-03-21 10:49:11 +00:00
Shuhei Matsumoto
9d1063d732 nvme: qpair_process_completions() processes if qpair is disconnecting
If poll group is not used and if qpair is disconnecting,
spdk_nvme_qpair_process_completions() has to poll qpair until it is
actually disconnected even if ctrlr is failed.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6da84f1e35780d21480fbe6f07e76af3048a777b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11018
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-03-21 10:49:11 +00:00
Shuhei Matsumoto
21322e01dd nvme: Separate spdk_nvme_ctrlr_disconnect() into disconnect and after it
Serparate spdk_nvme_ctrlr_disconnect() into nvme_ctrlr_disconnect()
and nvme_ctrlr_disconnect_done() to call nvme_ctrlr_disconnect_done()
after adminq is actually disconnected when disconnecting adminq
asynchronously.

The following patches will add a new flag is_disconnecting to struct
spdk_nvme_ctrlr and prevent us from setting ctrlr->is_failed to true
between nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done().

By this patch, nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done()
are executed in the same context. So it is not possible to set
ctrlr->is_failed to true between nvme_ctrlr_disconnect() and
nvme_ctrlr_disconnect_done().

Hence nvme_ctrlr_disconnect_done() does not have to clear ctrlr->is_failed
again.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I18b5b68f37e27b54782691823edae9738c26faa1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10999
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-03-21 10:49:11 +00:00
Shuhei Matsumoto
4a73675dbf nvme: Remove deprecated spdk_nvme_ctrlr_reset_async() and _reset_poll_async()
Change spdk_nvme_ctrlr_reset() to use spdk_nvme_ctrlr_disconnect(),
spdk_nvme_ctrlr_reconnect_async(), and
spdk_nvme_ctrlr_reconnect_poll_async().

Then remove the deprecated spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async().

These changes simplify the following patches to make
spdk_nvme_ctrlr_disconnect() asynchronous.

Change-Id: Ia71e8e0ad5b2dff42b7423634f66de47863926e2
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10913
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-03-21 10:49:11 +00:00
Shuhei Matsumoto
cfe11bd1db nvme: Factor out operations done after disconnect qpair completes
This is a preparation to make nvme_transport_ctrlr_disconnect_qpair()
asynchronous.

For nvme_transport_ctrlr_disconnect_qpair(), factor out operations after
returning from transport's specific ctrlr_disconnect_qpair() into a helper
function nvme_transport_ctrlr_disconnect_qpair_done().

Then move nvme_transport_ctrlr_disconnect_qpair_done() into the end of
the transport specific ctrlr_disconnect_qpair().

Additionally remove the operation to overwrite the qpair state to
DISCONNECTED from nvme_transport_connect_qpair_fail() because
this is duplicated and nvme_transport_ctrlr_disconnect_qpair() is responsible
to make the qpair disconnected even after it completes asynchronously.

Change-Id: I9c8faa7039d306d3e31a8f51826755ce8840a8aa
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10851
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-03-21 10:49:11 +00:00
Shuhei Matsumoto
1285481917 nvme: Free I/O qpair now even if it is in poll group completion
spdk_nvme_poll_group has followed spdk_nvme_qpair about how to
process I/O qpair deletion inside of a completion context.

spdk_nvme_qpair_process_completions() accesses qpair after
returning from nvme_transport_qpair_process_completions().

So this is reasonable.

On the other hand, if spdk_nvme_poll_group_process_completions()
can execute spdk_nvme_ctrlr_free_io_qpair() inside of a completion
context, the target qpair is ensured to be deleted after returning
from spdk_nvme_ctrlr_free_io_qpair(). Then the target qpair is
not accessed anymore in spdk_nvme_poll_group_process_completions().

Remove two variables, in_completion_context and num_qpairs_to_delete,
of spdk_nvme_transport_poll_group and the related code.

This change is really necessary to support the following case.

In the NVMe bdev module, a nvme_qpair has a qpair and a poll_group
channel. disconnected_qpair_cb calls spdk_nvme_ctrlr_free_io_qpair()
for the qpair and spdk_put_io_channel() to the poll_group_channel.
spdk_nvme_ctrlr_free_io_qpair() is executed after unwinding stack
but spdk_put_io_channel() is executed now. The callback to
spdk_put_io_channel() calls spdk_nvme_poll_group_destroy(). However,
spdk_nvme_ctrlr_free_io_qpair() is not executed. Hence
spdk_nvme_poll_group_destroy() fails.

Update the corresponding stub in unit test together.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Icd1f1daf049c6c7ffb28790fe87989a1060f8952
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11496
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-03-15 09:05:09 +00:00
Shuhei Matsumoto
486f46e867 nvme_rdma: Call disconnected_qpair_cb when qpair is in disconnected_qpairs list
We want to call disconnected_qpairs_cb only if qpair is actually
disconnected. When we disconnect qpair asynchronously, for qpairs in
the group->disconnected_qpairs list, we want to poll them until
actually disconnected and then call disconnected_qpairs_cb for them.

As a preparation, call disconnected_qpair_cb only for qpairs which is
in the group->disconnected_qpairs list.

For TCP and PCIe transports, disconnecting qpair will continue to be
synchronous for now.

So we change only RDMA transport.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ifaf6157e1e02fa13f52a66409c9e60fc814d71dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11495
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-03-15 09:05:09 +00:00
Shuhei Matsumoto
999f0362ab nvme_tcp: Remove qpair from group->needs_poll when removing it from poll_group
spdk_sock_close() may not complete synchronously.

Then the following scenario is possible.

ctrlr_disconnected_qpair() calls spdk_sock_close(). Then any request may
complete and call _pdu_write_done(). qpair is still associated with a
poll_group and is inserted into the group->needs_poll list.
After ctrlr_disconnected_qpair() completes, disconnected_qpair_cb() is
called. disconnected_qpair_cb() calls spdk_nvme_ctrlr_free_io_qpair().
spdk_nvme_ctrlr_free_io_qpair() calls spdk_nvme_poll_group_remove().

spdk_nvme_poll_group() removes qpair only from the
group->disconnected_qpairs list.

Even after qpair->poll_group is nullified, qpair is still in the
group->needs_poll list.

Then spdk_nvme_poll_group_process_completions() polls all qpairs in
the group->needs_poll list and hits the assert.

Fixes the issue #2390

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6ede8bd3b7b1a57a34ac61436159975ab6253fbe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11882
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-03-14 08:44:03 +00:00
John Levon
c4f7ddd2c7 lib/nvme: report shadow doorbell update stats
Currently shadow doorbell updates are not counted; add statistics for
those, and rename the other statistic for clarity.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I211a77902e38265c99b15862034c6d022dc582a0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11844
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-03-10 09:49:25 +00:00
John Levon
9e239b1c1e lib/nvme: remove needless check
nvme_pcie_qpair_update_mmio_required() is only called for QPs with
shadow doorbells enabled, so we don't need an additional branch here.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Idf65f92deb0c6f93019892ef5a02bc28ba4c0f8c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11843
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-03-10 09:49:25 +00:00
John Levon
8071e0f1dc lib/nvme: drop unused argument
nvme_pcie_qpair_update_mmio_required() qpair argument is unused.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I0bd6897eb8e6a06f211cc599ab6045409bb43641
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11842
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-03-10 09:49:25 +00:00
John Levon
ba4ffda671 lib/nvme: correct typo in transport stats
"doorbell" not "doobell"

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I9261559576e72a09b63fbc984ae0ec2a2572eb2c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11841
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-03-09 08:03:42 +00:00
Jim Harris
12d522515f nvme: simplify spdk_nvme_transport_id_populate_trstring
Note that this also works around a false positive in
gcc-11 of type -Wstringop-overread.

Fixes issue #2391.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib5301b9ef9fa3ead2a1a2318655533a8cfba03fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11709
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-02-28 11:07:05 +00:00
Evgeniy Kochetov
5c80b1e5ab nvme/rdma: Limit max_sges by command capsule size
According to NVMe over Fabrics spec number of SGLs supported by the
controller is reported in MSDBD. But it is also implicitly limited by
command capsule size (IOCCSZ) since SGL are passed in capsule.

This patch adjusts max_sges to capsule size if required. Adjustment to
MSDBD is also moved to transport layer because it is fabrics specific
parameter and is not valid for PCIe transport.

Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I44918eb949345c61242ca50a524d21d04b6ac058
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11669
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-02-25 08:18:32 +00:00
Shuhei Matsumoto
7594030409 nvme: Set dnr to zero for abort_reqs() including a fix of degradation
The patch

nvme: Set dnr to zero for nvme_qpair_abort_reqs()
1b3172f726

did the change stated in the title.

However,

Revert "nvme/rdma: Correct qpair disconnect process"
c8f986c7ee

destroyed it for RDMA transport.

Additionally, we had still set DNR to 1 in nvme_qpair_init().

This patch fixes both.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iee60ac24aa7e04cce0f394014c9d9afc9d2b56ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11644
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-02-24 14:56:03 +00:00
Jim Harris
635d0cbe75 nvme: allocate extra request for fabrics connect
With async connect, we need to avoid the case
where the initiator is sending the icreq, and
meanwhile the application submits enough I/O
such that the request objects are exhausted, leaving
none for the FABRICS/CONNECT command that we need
to send after the icreq is done.

So allocate an extra request, and then use it
when sending the FABRICS/CONNECT command, rather
than trying to pull one from the qpair's STAILQ.

Fixes issue #2371.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If42a3fbb3fd9d863ee48cf5cae75a9ba1754c349
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11515
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-02-14 15:29:39 +00:00
Jim Harris
a97200ad45 nvme: optimize struct spdk_nvme_qpair packing
Group fields such that those not used in the I/O path
are at the end of the structure.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I43eca1faacd29a5bf34be6ee644191d865cd42a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11514
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-02-14 15:29:39 +00:00
Jim Harris
56618eacb9 nvme: add NVME_INIT_REQUEST macro
This macro will be used in an upcoming patch
that needs to construct an nvme_request structure
outside of the standard nvme_allocate() routines.

Examined x86 optimized assembly with this patch,
and there is no change.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0f6b8500e06b56edc33f437f351536cf857d13d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11513
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-02-14 15:29:39 +00:00
Evgeniy Kochetov
834e3c5a0e nvme: Fix submission queue overflow
SPDK can submit more commands to remote NVMf target than allowed by
negotiated queue size. SPDK submits up to SQSIZE commands, but only
SQSIZE-1 are allowed.

Here is a relevant quote from NVMe over Fabrics rev.1.1a ch.2.4.1
“Submission Queue Flow Control Negotiation”:

If SQ flow control is disabled, then the host should limit the number
of outstanding commands for a queue pair to be less than the size of
the Submission Queue. If the controller detects that the number of
outstanding commands for a queue pair is greater than or equal to the
size of the Submission Queue, then the controller shall:

a) stop processing commands and set the Controller Fatal
Status (CSTS.CFS) bit to ‘1’ (refer to section 10.5 in the NVMe Base
specification); and

b) terminate the NVMe Transport connection and end the association
between the host and the controller.

Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ifbcf5d51911fc4ddcea1f7cde3135571648606f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11413
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-02-10 15:22:08 +00:00
Evgeniy Kochetov
486426529d nvme/rdma: Remove queue depth adjustment to crqsize
According to NVMe over Fabrics specification (rev.1.1a) HSQSIZE sent
in RDMA_CM_REQUEST private data (ch.7.3.6.4) shall be the same as
SQSIZE later sent in Connect command (ch.3.3).

SPDK NVMe RDMA initiator adjusts SQSIZE to CRQSIZE received from
target in RDMA_CM_ACCEPT private data. Target is allowed to send
CRQSIZE < HSQSIZE if RNR retries are used. So, it is possible that
SQSIZE sent by SPDK will be lower than previously sent HSQSIZE. There
are targets validating this match and they reject connection from
SPDK.

Linux kernel NVMe initiator doesn't perform such adjustments and
connects well to such targets.

This patch aligns SPDK behavior with specification and Linux kernel
implementation.

Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I01968d1c07d284396fa5939932d85841351d7a45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11350
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-02-10 15:22:08 +00:00
Jaylyn Ren
3e937f07eb test/accel&rdma: Fix unittest_accel and unittest_nvme_rdma failure
There are errors occur that uninitialised value created by a stack allocation when running unittest_accel and unittest_nvme_rdma with valgrind.

Change-Id: I4b48b472cc7c189cbcaf8ca772830a23118e7e17
Signed-off-by: Jaylyn Ren <jaylyn.ren@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10559
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-02-09 22:22:04 +00:00
Tomasz Zawadzki
047c067c05 so_ver: increase all major versions
To allow SO_MINOR updates on LTS for the whole year it is supported,
the major version for all components needs to be increased.
This is to prevent scenario where two versions exists with matching
versions, but conflicting ABI.
Ex. Next SPDK release adds an API call increasing the minor version,
then LTS needs just a subset of those additions.

Increasing major so version after LTS, allows the future releases
to update versions as needed. Yet allowing LTS to increase minor
version separately.

Disabled test for increasing SO version without ABI change, as
that is goal of this patch. This check shall be removed with SPDK 22.05
release.

This patch:
- increases SO_VER by 1 for all components
- resets SO_MINOR to 0 for all components
- removes suppressions for ABI tests

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id1a5358882dc496faa5b0b5c9a63b326c378c551
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11361
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-01-31 15:29:56 +00:00
Shuhei Matsumoto
fc48cf8681 nvme_rdma: Check only if Soft RoCE receive normal completion after disconnect
We saw this unexpected behavior by the current SPDK master.
Add the check to clarify this behavior occurs only when we use
Soft RoCE.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3a5eaa9064a0601c65139e7868898545926d0dbf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11225
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
2022-01-26 08:09:15 +00:00
Shuhei Matsumoto
c8f986c7ee Revert "nvme/rdma: Correct qpair disconnect process"
This reverts commit eb09178a59.

Reason for revert:

This caused a degradation for adminq.
For adminq, ctrlr_delete_io_qpair() is not called until ctrlr is destructed.
So necessary delete operations are not done for adminq.

Reverting the patch is practical for now.

Change-Id: Ib55ff81dfe97ee1e2c83876912e851c61f20e354
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10878
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-01-26 08:09:15 +00:00
Shuhei Matsumoto
194dc9e2f9 Revert "nvme_rdma: Continue even if we receive a normal WC when qpair is disconnected"
This reverts commit b9518a5540.

Reason for revert: Fix a degradation for adminq

Change-Id: I0e2c5e48a5ca34171fa98fa68216da4354b5d262
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10879
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-01-26 08:09:15 +00:00
GangCao
765cf74d07 lib/nvme: only active process to operate the unmap operation
Fix issue: #2320

Only the primary process will do the unmap bar operation as for
the map bar operation.

The DevHandle is process specific and the issue here is the
secondary process's function pointer of DevHandle is not properly
set.

Change-Id: I95dddc76c6ce4be8775b6aaf54699002baffd3b9
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11216
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-01-24 20:12:51 +00:00
Changpeng Liu
80c88ab355 nvme: disconnect ADMIN queue pair when destruct controller
We should disconnect ADMIN queue pair after shutdown
returned, or we may leak ADMIN socket resources after
free the controller data structure.

Fix issue #2289.

Change-Id: I956191fcd51cdcef5de2c3c7b15ffc70f22b040b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11133
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: <qun.wan@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-01-20 15:46:36 +00:00
Konrad Sztyber
a7d61bef5a nvme: guard admin qpair error injection queue
Admin commands can be sent and polled from any thread, which also means
that the error injection queue on the admin qpair can be accessed from
multiple threads.  Therefore, any modifications to that queue should be
done under the ctrlr lock.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib1ed194405cb5b93f65a007b9749fd4433dc367d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11099
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
2022-01-19 09:05:36 +00:00
Shuhei Matsumoto
34eea269f5 nvme: Assume poll_group_disconnect_qpair() succeeds if qpair is in connected_qpairs
poll_group_disconnect_qpair() is used only in a single place now
and transport_poll_group_disconnect_qpair() always returns 0 for all
transport.

Let's remove unnecessary processing for return code.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I45d7f8cea2117b3ec00028df234d1eb9ecc65713
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10677
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-01-19 08:44:09 +00:00
Shuhei Matsumoto
728e3721a4 nvme_rdma: Remove a guard for recursive calls from poll_group_disconnect_qpair()
nvme_poll_group_disconnect_qpair() is called only by a single place now.

We do not need the flag poll_group_disconnect_in_progress any more.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I8f9c0f14baa8fcb9b0637635a5bb3d34a8b11af5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10673
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-01-19 08:44:09 +00:00
Shuhei Matsumoto
4c8ccb5403 nvme: Remove poll_group_disconnect_qpair() call from poll_group_remove()
spdk_nvme_poll_group_remove() is available only for disconnected
qpairs now. Hence spdk_nvme_poll_group_remove() does not have to
check if qpair is connected and call nvme_ctrlr_disconnect_qpair().

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3b05246c4be6adfa3392b8f0e5ecaf274a8a7795
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10846
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
2022-01-19 08:44:09 +00:00
Shuhei Matsumoto
ea2db5bb0c nvme_pcie: Use dummy stats after removing qpar from poll group
Previously, when connecting qpair, we allocated stats per qpair if poll
group is not used or we set stats per poll group otherwise.
Then when deleting qpair, we freed per qpair stats if allocated.

However, if qpair is still not completely disconnected after removing
qpair from poll group, pqpair->stat is use-after-free and it causes
a segmentation fault.

To fix this issue, we set pqpair->stat to &g_dummy_stats instead.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibf303e6db5176e93ed75cbe3a414bb923d6e3ab6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10845
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-01-19 08:44:09 +00:00
Shuhei Matsumoto
f1941efe7b nvme_tcp: Use dummy stats after removing qpair from poll group
Previously, when connecting qpair, we allocated stats per qpair
if poll group is not used or we set stats per poll group otherwise.
Then when removing qpair from poll group, we cleared qpair->stats pointer.

However, if qpair is still not completely disconnected after removing
qpair from poll group, tqpair->stats is NULL and it causes a segmentation
fault.

Hence we set tqpair->stats to &g_dummy_stats instead of NULL.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ice6469627ce8d4bf4567f57c304759206b6432f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10844
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-01-19 08:44:09 +00:00
Shuhei Matsumoto
7ae79a38a5 nvme: Limit spdk_nvme_poll_group_remove() to use only for disconnected qpairs
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3c06c41664ee757423641474141439f9c32fc0b6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10671
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-01-19 08:44:09 +00:00
Shuhei Matsumoto
e021cc0147 nvme: Swap ctrlr_disconnect_qpair() and poll_group_remove() in nvme_ctrlr_free_io_qpair()
nvme_ctrlr_disconnect_qpair() calls nvme_poll_group_disconnect_qpair() if the qpair
uses a poll group, and nvme_poll_group_disconnect_qpair() calls
nvme_ctrlr_disconnect_qpair() if the state of the qpair is not DISCONNECTING.

This relationship made the code very complex.

A few patches starting from this patch simplifies disconnect and free qpair
operations.

This patch swaps the ordering of nvme_ctrlr_disconnect_qpair() and
spdk_nvme_poll_group_remove() in spdk_nvme_ctrlr_free_io_qpair().

This ensures the qpair is disconnected when spdk_nvme_ctrlr_free_io_qpair()
calls spdk_nvme_poll_group_remove().

This enables us to limit spdk_nvme_poll_group_remove() to be available
only for disconnected qpairs.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0601a74f953a2efc4f177a51a4450baea33533d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10670
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-01-19 08:44:09 +00:00
Shuhei Matsumoto
1b3172f726 nvme: Set dnr to zero for nvme_qpair_abort_reqs()
This is necessary to failover another path when multipath is configured.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0b6bcf63501e38f75efb4b0d6bec58abb4b67aef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10250
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
2022-01-17 14:25:15 +00:00
Ahriben Gonzalez
93ef69ef9c nvme: Add Check for fuse request size
FUSE has a limitation of 128KiB. Adding a check that returns ENOMEM for
ioctl and logs the error. Applies to both in and out buffers

Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I9ce5fdc413b047a1ec074468be5abf433da26d7f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10855
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-01-14 11:10:13 +00:00
Ahriben Gonzalez
0345729e00 nvme: Add metadata support to io commands
Adding metadata support for io commands. Currently metadata is ignored
even if present in the cmd struct. Making metadata adress
readable/writable depending on data transfer bits. Adding extra unit
test to make sure metadata fields are populated.

Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I1d01974a6b2831c82b43e94073065d235eea429a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10854
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-01-14 11:10:13 +00:00
Ahriben Gonzalez
9e14341bd9 nvme: Always set result field for passthru cmd
Modify admin passthru so that result field of passthru struct is always
populated. This should be safe since dw0 is either reserved or contains
command specific info. This is specifically meant for the namespace
management command when attempting to create a namespace. As per spec:
"Dword 0 of the completion queue entry contains the Namespace Identifier
created.". So for nvme cli and perhaps other application to see what is
the id of the namespace created there needs to be a way to pass the
information back.

Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: Ide4effc126ad9eedac95b0700dd65041ed4b35b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10633
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-01-14 11:10:13 +00:00
Ahriben Gonzalez
0c645fdc8e nvme: change cuse ioctl reply
-Change cuse ioctl reply from status code to whole status field.
-Add negative test for nvme cli cuse: Power Managment on Namespace

Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I55a88a4f5ace5040f79c05edfc0b8559905bdd2e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10602
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-01-14 11:10:13 +00:00
Ben Walker
e3eeb6bd9e nvme: Free inactive namespaces during spdk_nvme_ctrlr_reset
This is the only time where we're allowed to invalidate namespace
handles, so use this opportunity to release inactive ones.

Change-Id: I53626ddf30e48e04207078fe406ec6e02138ac9f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10103
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-01-14 08:35:10 +00:00
Ben Walker
26d60dc433 nvme: Move active_ns_count next to ns in spdk_nvme_ctrlr
This is the count of items in the RB_TREE, so put the two next to each
other.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib30bee12e65065dc414b55e85cfffa2026057e9f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10035
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-01-14 08:35:10 +00:00
Ben Walker
517b557226 nvme: Do not track a separate active namespace list
We only populate active namespaces into the main namespace tree, so we
don't need a separate list of active namespaces too.

Change-Id: Iaf194f806cc1d9672f5567cff3dffafff3165069
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10034
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-01-14 08:35:10 +00:00
Ben Walker
dbde5edd39 nvme: Inline nvme_ctrlr_[construct|destruct]_namespaces
These are no longer complex enough to warrant being separate functions.

Change-Id: I5f3c9fc904b768b6509283c4b7def686bab9a1d2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10032
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2022-01-14 08:35:10 +00:00
Ben Walker
e7602c158f nvme: Hold namespaces in an RB_TREE
Since this is now sparsely populated, a tree is a better choice.

Change-Id: Ie66d913fa1d298de56a7d22ef55f0adf7f8803b8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10031
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-01-14 08:35:10 +00:00
Ben Walker
b4dace738e nvme: Do not allocate inactive namespace objects
Some subsystems report a very large maximum value for the number of
namespaces, but in essentially every case the subsystem is sparsely
populated with active namespaces. To save memory, don't allocate
objects for the inactive ones.

Change-Id: I4cbeb5a7a898d3c685f4a3a9ec4c2ce45efffb92
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9898
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2022-01-14 08:35:10 +00:00
Jim Harris
1ea419ecd7 nvme: restart discovery log when genctr changes
Each portion of the discovery log has a header which
includes a 'genctr'. This number indicates the
current generation of the discovery log. If this
number changes during the process of fetching the
discovery log in multiple chunks, wait for the
current fetch to complete, but then start over.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5f8623593b7f935eecc37a98daf92e7d8c0dd566
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10813
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2022-01-12 08:20:23 +00:00
Jim Harris
6a520ae644 nvme: simplify get_log_page_completion
Return if outstanding_commands > 0. This reduces
indentation for the rest of the code in the
function and simplifies the diff for an upcoming
patch.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I49f09eff7c0908829819e6b797c922211c56e7db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2022-01-12 08:20:23 +00:00
Shuhei Matsumoto
b9518a5540 nvme_rdma: Continue even if we receive a normal WC when qpair is disconnected
We recently improved qpair disconnect process and added assert
if we get a completion without any error when a qpair is disconnected.

However unexpectedly we saw this case very often when we ran the test
test/nvmf/host/multipath.sh for the real hardware in the test pool.

So we remove the assert and change the ERRLOG to INFOLOG.

Fixes one of the issues in #2300

Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Change-Id: Iedbf7e0afa5025da6a810043ba95348ba5b856b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10901
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-12-29 02:19:58 +00:00
Alexey Marchuk
3c4a68cafc nvme: Do not create IO qpair during ctrlr initialization
If nvme ctrlr is resetting or initializing, free_io_qids
bitmap is already freed or not created yet. In that case
an attempt to create IO qpair leads to segmentation fault.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I6a97bf81d5a568db20d23b3f88cf01e994ba42e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10827
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
2021-12-27 08:43:03 +00:00
Alexey Marchuk
eb09178a59 nvme/rdma: Correct qpair disconnect process
In current implementation RDMA qpair is destroyed right after
disconnect. That is not graceful qpair shutdown process since
there can be requests submitted to HW and we may receive
completions for already destroyed/freed qpair.

To avoid this, only disconnect qpair in ctrlr_disconnect_qpair
transport callback, all other resources will be released in
ctrlr_delete_io_qpair cb.

This patch is useful when nvme poll groups are used since in
that case we use shared CQ, if the disconnected qpair has WRs
submitted to HW then qpair's destruction will be deferred to
poll group.

When nvme poll groups are not used, this patch doesn't change
anything, in that case destruction flow is still ungraceful.
However since CQ is destroyed immediately after qpair,
we shouldn't receive any requests which point to released
resources. A correct solution for non-poll group case
requires async diconnect API which may lead to significant
rework.

There is a bug when Soft Roce is used - we may receive
a completion with "normal" status when qpair is already
disconnected and all nvme requests are aborted. Added
a workaround for it.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0680d9ef9aaa8737d7a6d1454cd70a384bb8efac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-12-23 08:44:40 +00:00
Jim Harris
843c387a1f nvme: add spdk_nvme_ctrlr_get_discovery_log_page API
This API is a helper for getting the full discovery
log page from a discovery controller.  It will read the
log page header to get the total number of entries,
allocate a buffer for all of the entries, and then
issue a series of get_log_page commands to read each
4KiB worth of entries.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I02666ef5adcb9fc8825a221655811ace708f97b8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10564
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-12-20 18:12:41 +00:00
Ben Walker
67196e9959 nvme: Don't free and allocate the entire ns array in
nvme_ctrlr_construct_namespaces

We can just reallocate here to be more efficient.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I8cfc87da23aee6c05ff83aea2165683dddba1dbd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10688
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-12-20 08:49:41 +00:00
Ben Walker
fca4262987 nvme: Remove nvme_ns_update
In the one place this was called, we can call nvme_ns_construct
instead. There's no harm in re-fetching the identify pages.

Change-Id: I91292ff9650bdc7edd5588a05837b671dcac1922
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10102
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-12-20 08:49:41 +00:00
Jim Harris
b33e68a789 nvme: call probe_cb when directly connecting to discovery ctrlr
The host may have specified a hostnqn to use to connect to
a discovery ctrlr, so we can't just use the default ctrlr
opts to connect - we need to call the probe_cb (if there is
one) to get any options that the host may have specified.

Tested by using discovery_aer tool, creating a subsystem and
listener, and then adding a host on the target side that matches
the hostnqn specified to the discovery_aer tool.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07266e984e0094d3a768e6a0d5ea3a3bd71e32ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10547
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-12-17 09:45:17 +00:00
Changpeng Liu
97277e1459 nvme: use transport internal queue state when deleting unfinished IO queue pair
The NVMe bdev module enables asynchronous IO QP creation by default, after
calling `spdk_nvme_ctrlr_alloc_io_qpair` and `spdk_nvme_ctrlr_connect_io_qpair`,
the queue pair is in connecting state at the beginning, then users may call
`spdk_nvme_ctrlr_free_io_qpair` immediately, and the common layer will
change queue state to NVME_QPAIR_DISCONNECTING and NVME_QPAIR_DESTROYING,
so in function `nvme_pcie_ctrlr_delete_io_qpair` the workaround to wait
for create cq/sq callbacks will not be called, instead of using the common
layer queue state here, we should use the internal `pcie_state`.

Fix #2245.

Change-Id: I801caf26563464b135035bf7fa2f63def13de9f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10445
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-12-09 06:06:02 +00:00
Shuhei Matsumoto
7a0a2800e0 nvme: Add three APIs for disconnect, start re-enable, and poll re-enable ctrlr
The NVMe bdev module will support two features, delayed reconnect and
delete after multiple failures of reconnect to improve error recovery.

The recently added two APIs, spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async(), were not good enough.

spdk_nvme_ctrlr_reset_ctx was not necessary. It had only a pointer to ctrlr.
Using a pointer to ctrlr directly saves us from undesirable malloc error
processing.

Separate spdk_nvme_ctrlr_reset_async() into spdk_nvme_ctrlr_disconnect()
and spdk_nvme_ctrlr_reconnect_async(). spdk_nvme_ctrlr_disconnect()
disconnects ctrlr including disconnecting adminq.
spdk_nvme_ctrlr_reconnect_async() moves the ctrlr state to INIT.

Then rename spdk_nvme_ctrlr_reset_poll_async() by
spdk_nvme_ctrlr_reconnect_poll_async().

Finally deprecate spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async().

The following patches will change the NVMe bdev module to use these new APIs.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id1d6858dcdc5fc2e9db0a6ebf3f79cab4f9bbcb7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10091
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-12-08 08:31:24 +00:00
Changpeng Liu
632c8d5613 nvme: make get INTEL log pages can be executed asynchronously
Also we don't treat exceptions when getting INTEL log pages
as a fatal error, the initialization will still contine.

Change-Id: Ic2fd2be510fde2679c1546482934d0a180266936
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10341
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-12-06 23:17:07 +00:00
Josh Soref
cc6920a476 spelling: lib
Part of #2256

* accessible
* activation
* additional
* allocate
* association
* attempt
* barrier
* broadcast
* buffer
* calculate
* cases
* channel
* children
* command
* completion
* connect
* copied
* currently
* descriptor
* destroy
* detachment
* doesn't
* enqueueing
* exceeds
* execution
* extended
* fallback
* finalize
* first
* handling
* hugepages
* ignored
* implementation
* in_capsule
* initialization
* initialized
* initializing
* initiator
* negotiated
* notification
* occurred
* original
* outstanding
* partially
* partition
* processing
* receive
* received
* receiving
* redirected
* regions
* request
* requested
* response
* retrieved
* running
* satisfied
* should
* snapshot
* status
* succeeds
* successfully
* supplied
* those
* transferred
* translate
* triggering
* unregister
* unsupported
* urlsafe
* virtqueue
* volumes
* workaround
* zeroed

Change-Id: I569218754bd9d332ba517d4a61ad23d29eedfd0c
Signed-off-by: Josh Soref <jsoref@gmail.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10405
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-12-03 08:12:55 +00:00
Jim Harris
7e68d0baca nvme: configure AER for discovery controllers
Move the CONFIGURE_AER state before SET_KEEP_ALIVE to
make sure that we run the CONFIGURE_AER state for
discovery controllers.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia4e24f6507c43e3fece06b9161ff8e0b8fa0e97d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-12-02 04:02:29 +00:00
Jim Harris
b962b6bee5 nvme: set AER bit for discovery controllers
We will actually run the CONFIGURE_AER state for
discovery controllers in a future patch.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib114beb886ab4b9214e4525479eb5ec7e038e5d9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10331
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-12-02 04:02:29 +00:00
Konrad Sztyber
2f9c97a4f6 nvme: silence debug logs when waiting for CSTS.RDY transitions
These can produce a lot of output, which doesn't really give any
additional information.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I572cd203d61c717ce6400f67ef27ec1d7bb54c0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10414
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2021-11-30 09:09:11 +00:00
Jim Harris
f0f7005bc1 nvme: simplify nvme_ctrlr_configure_aer_done
For error case, just set ctrlr->num_aers to 0, and
then the loop won't execute at all.  This avoids an
extra call to nvme_ctrlr_set_state() and simplifies
the code a bit.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iff7bbf6e03d18b5f553b9e8527b4c803db583917
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10330
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-11-24 08:34:58 +00:00
Jim Harris
1c083e6200 nvme: set keep alive for discovery controllers
Discovery services using the SPDK nvme driver may
use long-lasting connections that detect AER completions
to determine when there are changes in the discovery
log. This means that we still need to send keep alives
on discovery controller admin queues. So move the
SET_KEEP_ALIVE_TIMEOUT state immediately after
IDENTIFY, and run the SET_KEEP_ALIVE_TIMEOUT state
even for discovery controllers.

Note, we need the IDENTIFY's KAS value to properly
set the keep alive timeout, so we have to keep the
IDENTIFY state before SET_KEEP_ALIVE_TIMEOUT.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c6403c28fb72d42629c5f9009a89c4bfd44d162
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10329
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-11-24 08:34:58 +00:00
Jim Harris
5c3da57040 nvme: don't overwrite keep_alive option for discovery controllers
Keep alive is valid for discovery controllers, so don't overwrite
the value requested with zero in nvme_fabric_ctrlr_scan().

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7dcda6ebf4ab1c8a9085e4e3a02b814d8e586a97
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10328
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-11-24 08:34:58 +00:00
Sylvain Didelot
a1014fccee nvme_cuse: Fix write-after-free when cuse thread early-exits
There is a bad memory corruption where the code for CUSE attempts to write
one byte (with value 0x1) after the memory is freed.

Context:
When the CUSE device is unregistered, the poller thread is signaled with
fuse_session_exit(), which writes the value 1 to fuse_session::exited.
The poller thread then detects with fuse_session_exited() that it must exit
the routing and finally destroys its own fuse session with
cuse_lowlevel_teardown() before it exits.

However, FUSE may also call fuse_session_exit() for its internal purposes.
I'm not sure exactly under what conditions that happens, but I added
some trace messages and I could clearly see that the CUSE thread exits
before it was requested to exit in cuse_nvme_ns_stop().

If the poller thread early-exits, it would destroy its own FUSE session
(and free the memory) before fuse_session_exit() gets executed, causing
the memory to be corrupted with a single byte of value 0x1.

Reproducer:
The bug can be reproduced by resetting the FUSE session to NULL after it
is destroyed. This will cuse_nvme_ns_stop() to crash with a segmentation
fault in fuse_session_exit() because it tries to access a NULL pointer.

static void *
cuse_thread(void *arg)
{
    [...]
    free(buf.mem);
    fuse_session_reset(cuse_device->session);
+    cuse_device->session = NULL;
    pthread_exit(NULL);
}

This fix:
The fix I suggest is to destroy the FUSE session with
cuse_lowlevel_teardown() after the thread is joined.

Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
Change-Id: I47202891a358f139506845110b012f840974b6fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9931
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2021-11-23 09:03:51 +00:00
Alexey Marchuk
2db77dc9c7 nvme: Explicitly disconnect qpair before destroy
spdk_nvme_ctrlr_free_io_qpair can be called when
qpair is already disconnected. In that case qpair's
state is changed to NVME_QPAIR_DESTROYING and
transport's ctrlr_delete_io_qpair callback is
called. RDMA and TCP transports call
nvme_transport_ctrlr_disconnect_qpair in
the callback and since qpair's state is
not DISCONNECTED or DISCONNECTING, qpair
is disconnected for the second time.

If spdk_nvme_ctrlr_free_io_qpair is called
when qpair is in ENABLED state than nothing
changes, qpair will be disconnected before destroy.
PCIE/vfio_user don't implement transport disconnect
callback, so they are not affected.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I23e11856ecafb51669acf4a3118be049c11eecda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10326
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2021-11-23 09:01:05 +00:00
Changpeng Liu
0af4a7cd84 nvme: abort outstanding requests case by case
For DSM command, the NVMe drive may take a long time to finish it,
if we set a small timeout value for DSM command, the bdev/nvme module
will try to reset the IO queue pair when timeout happens,
in `spdk_nvme_ctrlr_free_io_qpair`, we will abort the outstanding
IO requests first, then in the `nvme_pcie_ctrlr_delete_io_qpair`,
we will poll the CQ for any requests that have been completed by
the NVMe controller, if there are NVMe completions in the CQ,
we will finish them again, thus double completions happened.

Here we rename `nvme_qpair_abort_reqs` to `nvme_qpair_abort_all_queued_reqs`,
so the common layer will just abort queued request, and let each
transport to abort outstanding requests case by case.

Fix #2233.

Change-Id: Icae6214239160c615418cb514fc51cfe77b59211
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10233
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-11-22 08:35:35 +00:00
Alexey Marchuk
64fa301f67 rdma: Update for memory map
Add a parameter which determines the owner of the
map - target or initiator. It allows to set different
access flags when creating Memory Regions

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0016847fe116e193d0954db1c8e65066b4ff82bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10283
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-11-19 08:29:59 +00:00
Sylvain Didelot
4cd97383cc nvme_cuse: Fix NULL pointer dereference triggered by unit test
The unit test test_nvme_cuse_stop() manually creates 2 cuse devices
and executes nvme_cuse_stop(). Problem is that the Fuse session is
never initialized for those 2 cuse devices, causing cuse_nvme_ns_stop()
to access 'ns_device->session', which is a NULL pointer.

This bug is detected by ASAN as follows:

==77298==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000180 (pc 0x7fdac6d7d40e bp 0x000000000000 sp 0x7fff74768320 T0)
==77298==The signal is caused by a READ memory access.
==77298==Hint: address points to the zero page.
    0 0x7fdac6d7d40e in fuse_session_destroy (/usr/lib64/libfuse3.so.3+0x1640e)
    1 0x40dc7a in cuse_nvme_ns_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:851
    2 0x40df59 in cuse_nvme_ctrlr_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:923
    3 0x40f103 in nvme_cuse_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:1094
    4 0x415803 in test_nvme_cuse_stop /home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut.c:393
    5 0x7fdac724c1a6  (/usr/lib64/libcunit.so.1+0x41a6)
    6 0x7fdac724c528  (/usr/lib64/libcunit.so.1+0x4528)
    7 0x7fdac724d456 in CU_run_all_tests (/usr/lib64/libcunit.so.1+0x5456)
    8 0x415a4e in main /home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut.c:415
    9 0x7fdac62351e1 in __libc_start_main (/usr/lib64/libc.so.6+0x281e1)
    10 0x403ddd in _start (/home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut+0x403ddd)

The fix is to call fuse_session_destroy() only if the fuse session is != NULL.

Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
Change-Id: I41881243227d83e8d1e6b90e72c1b6d62ccd98d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10225
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-11-17 10:58:50 +00:00
Ben Walker
c57bf80497 nvme: Do not construct the namespace object on ns_create
Wait until the namespace is attached, where it does this operation
again. As of this commit it doesn't really matter because it is just
filling in some values in a structure and if it does it twice it's not a
problem. But later when we only allocate active namespaces, we do not
want to allocate the namespace twice.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ie28653b178975d1ca80bf71ca6b5095224f1c5d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-11-15 11:59:59 +00:00
Ben Walker
937086379f nvme: Add nvme_ctrlr_construct_namespace
This constructs a new namespace object.

Change-Id: I2b13c7491a221f047553433df451b6236422b7c8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10025
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-11-15 11:59:59 +00:00
Ben Walker
f555a8af97 nvme: Add nvme_ctrlr_destruct_namespace
This function destructs a single namespace and removes it from the
controller.

Change-Id: I4b7b3576beda85c9ddad4e0f2db6d1964fa72b82
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10024
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-11-15 11:59:59 +00:00
Ben Walker
84688fdb1c nvme: Rename max_active_ns_idx to active_ns_count
This was sometimes used as the maximum array index and sometimes as the
maximum count. Make it consistent everywhere and give it a better name.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I518efd99a7d36584624490b0b3497bb6e81ce9ac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10101
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-11-15 11:59:59 +00:00
Ben Walker
c7888feebc nvme: Do not allow nvme_ctrlr_identify_active_ns_swap to overrun the
list

The list should always be null terminated, but add an additional layer
of buffer overrun protection.

Change-Id: Iee31057fdca5ec4a6177615dff5171e5cb07984e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10027
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2021-11-15 11:59:59 +00:00
Jim Harris
cf4ab8a6de nvme: add DELAY_BEFORE_INIT quirk to Intel 0x0A54 SSD
This quirk was already applied to the 0x0A53 SSD, but is
likely needed on 0x0A54 as well.

Possible fix for issue #2231.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I36a48ab92d1698a411472f714b5108413bbc3c56
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10162
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-11-15 11:59:14 +00:00
Changpeng Liu
af31c4c928 nvme/opal: don't print error log for dirves that don't support OPAL
Fix issue #2230.

Change-Id: I3df18e12f26bdfcad0c9e50bd5b5ddd3049f0946
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10139
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <junx.wen@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-11-12 01:01:28 +00:00
Changpeng Liu
46b355c0ef nvme: add an API to check existing transport type is fabric or not
We already provides the API `spdk_nvme_ctrlr_is_fabrics` to return
input controller is fabrics controller or not, but it needs a controller
data structure as the input, so here we add another API to do the same
thing and it takes the transport type as the input, with this change,
both nvme and nvmf library can use the API.

Also we should treat UINT8_MAX(255) as valid fabrics transport type.

Change-Id: Ib62e7d3eca3da1ddb1a4cc55b0b62e274522f1ce
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10059
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-11-05 00:53:27 +00:00
Jim Harris
e40bd53175 nvme/pcie: only set qpair state from qpair's thread
The qpair's state member is only 3 bits of a uint8_t,
and the in_completion_context bit is another bit in that
same uint8_t.

We know that the qpair's state is only ever updated by
one thread, but it is possible that the state could
be modified by one thread, while another thread
is modifying in_completion_context.

in_completion_context is only modified by the thread
that is polling the qpair (or the qpair's poll group).
But with async mode, another thread that has a qpair
on the same PCIe controller could poll its adminq and
reap the SQ completion for the qpair that's owned by
the other thread.

So do *not* set the generic qpair state to CONNECTED
from the SQ completion callback.  Instead just set
the pcie_state to READY, and let the thread that owns
the qpair detect the qpair is READY and set the state
to CONNECTED itself.

Fixes issue #2157.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9efc0c954504f1841e1c3890ae78211ad0d1990e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9975
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
2021-10-25 19:53:14 +00:00
Alexey Marchuk
2696886c75 dma: Update translation result to hold iovec pointer
In some cases a single virtually contriguos memory
buffer can be translated to several chunks of memory.
To make such translation possible, update structure
spdk_memory_domain_translation_result to use a pointer
to iovec.
Add a single iov structure or cases where translation
is always 1:1, it will make easier translation callback
implementation. For RDMA transport translation of address
is always 1:1, so treat iovcnt other than 1 as an
error.

Change-Id: I65605575d43a490490eba72c1eb19f3a09d55ec6
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9779
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-10-20 22:55:52 +00:00
Alexey Marchuk
549bcdc0a4 dma: Update memory domain context structure
Instead of a union with domain type specific
parameters, store an opaque pointer to user
context. Depending on the memory domain type,
this context can be cast to a specific struct,
e.g. to spdk_memory_domain_rdma_ctx for RDMA
memory domains.
This change provides more flexibility to
applications to create and manage custom
memory domains

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Change-Id: Ib0a8297de80773d86edc9849beb4cbc693ef5414
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-10-20 22:55:52 +00:00
Krzysztof Karas
c37e776efe trace: move all trace definitions to a separate file
This is to help with binding trace objects together and
for the convenience (all trace definitions are in one place
instad of being scattered accross multiple files).

Change-Id: Ib15bc9c2eeee9c4d0816bcee509ab69f3f558e19
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9574
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
2021-10-20 07:22:00 +00:00
Konrad Sztyber
c1f6054114 nvme: asynchronous detach without shutdown notification
Detaching a controller with `no_shn_notification` flag set will follow
the regular detach path making it asynchronous too.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0b9c6c23626b4cc1cfaedb3268024776a07b9195
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9003
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-10-04 15:00:35 +00:00
Konrad Sztyber
b6ecc37298 nvme: make ctrlr detach fully asynchronous
The controller detach had asynchronous API (with async/poll), but the
register operations were synchronous, so they would block on fabrics
controllers.  In this patch, they're changed to their non-blocking
counterparts, making the detach fully asynchronous.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I74df12ab40a54f1d675639672e03755c89768bef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8726
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-10-04 15:00:35 +00:00
Ziye Yang
34c901e308 nvme/tcp: Fix tcp_req->datao calculation issue.
When data digest is enabled for a nvme tcp qpair, we can use accel_fw
to calculate the data crc32c. Then if there are multiple
c2h pdus are coming, we can use both CPU resource directly
and accel_fw framework to caculate the checksum. Then the datao value compare
will not match since we will not update "datao" in the pdu coming order.

For example, if we receive 4 pdus, named as A, B, C, D.
   offset   data_len (in bytes)
A:  0       8192
B:  8192    4096
C:  12288   8192
D:  20480   4096

For receving the pdu, we hope that we can continue exeution even if
we use the offloading engine in accel_fw. Then in this situation,
if Pdu(C) is offloaded by accel_fw. Then our logic will continue receving
PDU(D). And according to the logic in our code, this time we leverage CPU
to calculate crc32c (Because we only have one active pdu to receive data).
Then we find the expected data offset is still 12288. Because "datao" in tcp_req will
only be updated after calling nvme_tcp_c2h_data_payload_handle function. So
while we enter nvme_tcp_c2h_data_hdr_handle function, we will find the
expected datao value is not as expected compared with the data offset value
contained in Pdu(D).

So the solution is that we create a new variable "expected_datao"
in tcp_req to do the comparation because we want to comply with the tp8000 spec
and do the offset check.

We still need use "datao" to count whether we receive the whole data or not.
So we cannot reuse "datao" variable in an early way. Otherwise, we will
release tcp_req structure early and cause another bug.

PS: This bug was not found early because previously the sw path in accel_fw
directly calculated the crc32c and called the user callback. Now we use a list and the
poller to handle,  then it triggers this issue. Definitely, it will be much easier to
trigger this issue if we use real hardware engine.

Fixes #2098

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I10f5938a6342028d08d90820b2c14e4260134d77
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9612
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-09-27 10:53:04 +00:00
Konrad Sztyber
a4b7f87b61 nvme: abort queued admin requests during init
Abort any queued admin requests once admin queue gets enabled. A request
can get queued if a controller is being reset and it gets submitted
while admin qpair is being reconnected.  If these requests aren't
aborted, the init process will stall, as requests don't get resubmitted
while controller is resetting and subsequent admin commands required for
the initialization would be queued too.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If456a297d2d434b3cc741816cbfb13b01d37e963
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-09-24 07:38:57 +00:00
Alexey Marchuk
9381d8d399 nvme: Update spdk_nvme_ctrlr_get_memory_domain
Allow to return more than one memory domain.
This change aligns bdev and nvme API and provides
more flexibility for custom transports.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ica9b12ad8463c361be6cb62ee2c0513eec0b486d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9546
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-09-24 07:37:45 +00:00
Alexey Marchuk
ea86c035bb nvme/tcp: NVME TCP poll group statistics
Enable dump of transport stats in functional test.
Update unit tests to support the new statistics

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I815aeea7d07bd33a915f19537d60611ba7101361
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8885
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-09-24 07:30:06 +00:00
Alexey Marchuk
7d589976f2 nvme_tcp: Check return value of spdk_sock_group_poll
This function may return an error and we should handle it

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I996ceb6e300bd9527385aded9068b2841e94a20d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9278
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-09-24 07:30:06 +00:00
Shuhei Matsumoto
c9ab7ae4b3 nvme_tcp: Use qpair->poll_group only if it is not NULL
nvme_transport_poll_group_remove() clears qpair->poll_group. Hence
we should not use it after that.

Fixes #2170

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I57ceee8c66684e2d02b51b8a0f3d66aacbcb9915
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9560
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-09-24 07:29:05 +00:00
Konrad Sztyber
1fbeeb23b3 nvme: rename nvme_qpair_abort_reqs to *_with_cbarg
Renamed nvme_qpair_abort_reqs() to nvme_qpair_abort_reqs_with_cbarg() to
highlight the fact that it only aborts requests with specified cb_arg
and to distinguish it from _nvme_qpair_abort_reqs() which aborts all
requests immediately.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I32fec5ab0501b1beb8605689d73ec42a6424fba5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9323
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-09-22 06:55:28 +00:00
Konrad Sztyber
75aa60e123 nvme: disable keep-alive during controller reset
When a controller is reset, it goes through the whole init process,
including establishing keep-alive, so there's no point in sending it
during that time.  Additionally, it can stall the initialization if it
gets queued when adminq is connecting.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib9fee42f6097972e8ba336958f81e1b6b1a5f006
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9310
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-09-22 06:55:28 +00:00
Konrad Sztyber
0825befa59 nvme: use saved CC register value when creating IO qpairs
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6c518df9d8ecd74247ed8f8ffe133305cbd627f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8622
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-09-22 06:55:28 +00:00
Konrad Sztyber
9e2166807a nvme: asynchronously wait for the controller to be enabled
Additionally, this patch removes reading the CC and CSTS registers from
`nvme_ctrlr_process_init()`, as it's no longer needed.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If4f9e57dbf249fbce87e90018cff389f59906e38
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8621
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-09-22 06:55:28 +00:00
Konrad Sztyber
e4e94c383b nvme: print ERRLOGs when init fails due to CSTS read failure
When the controller initialization is failed due to not being able to
read the CSTS register, an ERRLOG is now printed instead of DEBUGLOG.
It should make it easier to debug initialization failures.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I42602bd08e263faf2e87fe4a68abcc82ddec11ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9556
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-09-22 06:55:28 +00:00
Konrad Sztyber
73050d511a nvme: enable the controller asynchronously
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I2a8116bbb95f6835cd37118f81ec1144501c5b3a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8620
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-09-22 06:55:28 +00:00
Konrad Sztyber
5f3764852c nvme: asynchronously disable the controller during init
The CC register is now re-read again when disabling the controller as
preparation for subsequent patches, in which the synchronous CC register
read will be removed from nvme_ctrlr_process_init().

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibfc8ed85bab188c3938451fbdfb771b969157807
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8619
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-09-22 06:55:28 +00:00
Krishna Kanth Reddy
490c83c6ac lib/nvme: NVMe ZNS - Zone Descriptor Extension support
Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
Change-Id: I11a72f48bf4e39e0547c29cb0213679ab24388b2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9494
Reviewed-by: Klaus Jensen <its@irrelevant.dk>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
2021-09-16 07:21:40 +00:00
Konrad Sztyber
8da3c16612 nvme: asynchronously check CSTS.RDY in disable path
The CSTS reads in DISABLE_WAIT_FOR_READY_(0|1) states are now done
asynchronously.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4ca8ad286e259e8fcfbf484223288554280347fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-09-16 07:16:52 +00:00