Add a new flag is_disconnecting to struct spdk_nvme_ctrlr.
Separate calling nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done()
by using the flag is_disconnecting.
Additionally, change nvme_ctrlr_fail() to skip setting ctrlr->is_failed
to true if ctrlr->is_disconnecting is true.
Change-Id: Ie2c74ba41f120662a30f6198751d07005d23abcf
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11000
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
nvme_transport_ctrlr_connect_qpair() calls
nvme_transport_ctrlr_disconnect_qpair() if failed.
If async qpair disconnect is supported, even when connect qpair failed,
nvme_transport_ctrlr_connect_qpair() may complete asynchronously later.
The cases that qpair->async is set to true are I/O qpair for the NVMe
bdev module and admin qpair.
example/nvme/perf and example/nvme/reconnect use I/O qpair but both
set qpair->async to false.
For the NVMe bdev module, I/O qpair is connected when creating I/O
channel or resetting ctrlr. If spdk_nvme_ctrlr_connect_io_qpair()
returns 0 for a I/O qpair, the qpair is in a poll group and is polled
by spdk_nvme_poll_group_process_completions() and a disconnected
callback is called to the qpair. Hence we do not need to add additional
polling for I/O qpair in the NVMe bdev module.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6e0aadcfd98e5cb77b362ef1a79e0eca2985f36e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11112
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If poll group is not used and if qpair is disconnecting,
spdk_nvme_qpair_process_completions() has to poll qpair until it is
actually disconnected even if ctrlr is failed.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6da84f1e35780d21480fbe6f07e76af3048a777b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11018
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Serparate spdk_nvme_ctrlr_disconnect() into nvme_ctrlr_disconnect()
and nvme_ctrlr_disconnect_done() to call nvme_ctrlr_disconnect_done()
after adminq is actually disconnected when disconnecting adminq
asynchronously.
The following patches will add a new flag is_disconnecting to struct
spdk_nvme_ctrlr and prevent us from setting ctrlr->is_failed to true
between nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done().
By this patch, nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done()
are executed in the same context. So it is not possible to set
ctrlr->is_failed to true between nvme_ctrlr_disconnect() and
nvme_ctrlr_disconnect_done().
Hence nvme_ctrlr_disconnect_done() does not have to clear ctrlr->is_failed
again.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I18b5b68f37e27b54782691823edae9738c26faa1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10999
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change spdk_nvme_ctrlr_reset() to use spdk_nvme_ctrlr_disconnect(),
spdk_nvme_ctrlr_reconnect_async(), and
spdk_nvme_ctrlr_reconnect_poll_async().
Then remove the deprecated spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async().
These changes simplify the following patches to make
spdk_nvme_ctrlr_disconnect() asynchronous.
Change-Id: Ia71e8e0ad5b2dff42b7423634f66de47863926e2
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10913
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a preparation to make nvme_transport_ctrlr_disconnect_qpair()
asynchronous.
For nvme_transport_ctrlr_disconnect_qpair(), factor out operations after
returning from transport's specific ctrlr_disconnect_qpair() into a helper
function nvme_transport_ctrlr_disconnect_qpair_done().
Then move nvme_transport_ctrlr_disconnect_qpair_done() into the end of
the transport specific ctrlr_disconnect_qpair().
Additionally remove the operation to overwrite the qpair state to
DISCONNECTED from nvme_transport_connect_qpair_fail() because
this is duplicated and nvme_transport_ctrlr_disconnect_qpair() is responsible
to make the qpair disconnected even after it completes asynchronously.
Change-Id: I9c8faa7039d306d3e31a8f51826755ce8840a8aa
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10851
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
spdk_nvme_poll_group has followed spdk_nvme_qpair about how to
process I/O qpair deletion inside of a completion context.
spdk_nvme_qpair_process_completions() accesses qpair after
returning from nvme_transport_qpair_process_completions().
So this is reasonable.
On the other hand, if spdk_nvme_poll_group_process_completions()
can execute spdk_nvme_ctrlr_free_io_qpair() inside of a completion
context, the target qpair is ensured to be deleted after returning
from spdk_nvme_ctrlr_free_io_qpair(). Then the target qpair is
not accessed anymore in spdk_nvme_poll_group_process_completions().
Remove two variables, in_completion_context and num_qpairs_to_delete,
of spdk_nvme_transport_poll_group and the related code.
This change is really necessary to support the following case.
In the NVMe bdev module, a nvme_qpair has a qpair and a poll_group
channel. disconnected_qpair_cb calls spdk_nvme_ctrlr_free_io_qpair()
for the qpair and spdk_put_io_channel() to the poll_group_channel.
spdk_nvme_ctrlr_free_io_qpair() is executed after unwinding stack
but spdk_put_io_channel() is executed now. The callback to
spdk_put_io_channel() calls spdk_nvme_poll_group_destroy(). However,
spdk_nvme_ctrlr_free_io_qpair() is not executed. Hence
spdk_nvme_poll_group_destroy() fails.
Update the corresponding stub in unit test together.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Icd1f1daf049c6c7ffb28790fe87989a1060f8952
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11496
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We want to call disconnected_qpairs_cb only if qpair is actually
disconnected. When we disconnect qpair asynchronously, for qpairs in
the group->disconnected_qpairs list, we want to poll them until
actually disconnected and then call disconnected_qpairs_cb for them.
As a preparation, call disconnected_qpair_cb only for qpairs which is
in the group->disconnected_qpairs list.
For TCP and PCIe transports, disconnecting qpair will continue to be
synchronous for now.
So we change only RDMA transport.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ifaf6157e1e02fa13f52a66409c9e60fc814d71dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11495
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_sock_close() may not complete synchronously.
Then the following scenario is possible.
ctrlr_disconnected_qpair() calls spdk_sock_close(). Then any request may
complete and call _pdu_write_done(). qpair is still associated with a
poll_group and is inserted into the group->needs_poll list.
After ctrlr_disconnected_qpair() completes, disconnected_qpair_cb() is
called. disconnected_qpair_cb() calls spdk_nvme_ctrlr_free_io_qpair().
spdk_nvme_ctrlr_free_io_qpair() calls spdk_nvme_poll_group_remove().
spdk_nvme_poll_group() removes qpair only from the
group->disconnected_qpairs list.
Even after qpair->poll_group is nullified, qpair is still in the
group->needs_poll list.
Then spdk_nvme_poll_group_process_completions() polls all qpairs in
the group->needs_poll list and hits the assert.
Fixes the issue #2390
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6ede8bd3b7b1a57a34ac61436159975ab6253fbe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11882
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Currently shadow doorbell updates are not counted; add statistics for
those, and rename the other statistic for clarity.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I211a77902e38265c99b15862034c6d022dc582a0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11844
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_pcie_qpair_update_mmio_required() is only called for QPs with
shadow doorbells enabled, so we don't need an additional branch here.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Idf65f92deb0c6f93019892ef5a02bc28ba4c0f8c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11843
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_pcie_qpair_update_mmio_required() qpair argument is unused.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I0bd6897eb8e6a06f211cc599ab6045409bb43641
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11842
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Note that this also works around a false positive in
gcc-11 of type -Wstringop-overread.
Fixes issue #2391.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib5301b9ef9fa3ead2a1a2318655533a8cfba03fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11709
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
According to NVMe over Fabrics spec number of SGLs supported by the
controller is reported in MSDBD. But it is also implicitly limited by
command capsule size (IOCCSZ) since SGL are passed in capsule.
This patch adjusts max_sges to capsule size if required. Adjustment to
MSDBD is also moved to transport layer because it is fabrics specific
parameter and is not valid for PCIe transport.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I44918eb949345c61242ca50a524d21d04b6ac058
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11669
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The patch
nvme: Set dnr to zero for nvme_qpair_abort_reqs()
1b3172f726
did the change stated in the title.
However,
Revert "nvme/rdma: Correct qpair disconnect process"
c8f986c7ee
destroyed it for RDMA transport.
Additionally, we had still set DNR to 1 in nvme_qpair_init().
This patch fixes both.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iee60ac24aa7e04cce0f394014c9d9afc9d2b56ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11644
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
With async connect, we need to avoid the case
where the initiator is sending the icreq, and
meanwhile the application submits enough I/O
such that the request objects are exhausted, leaving
none for the FABRICS/CONNECT command that we need
to send after the icreq is done.
So allocate an extra request, and then use it
when sending the FABRICS/CONNECT command, rather
than trying to pull one from the qpair's STAILQ.
Fixes issue #2371.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If42a3fbb3fd9d863ee48cf5cae75a9ba1754c349
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11515
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Group fields such that those not used in the I/O path
are at the end of the structure.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I43eca1faacd29a5bf34be6ee644191d865cd42a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11514
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This macro will be used in an upcoming patch
that needs to construct an nvme_request structure
outside of the standard nvme_allocate() routines.
Examined x86 optimized assembly with this patch,
and there is no change.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0f6b8500e06b56edc33f437f351536cf857d13d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11513
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
SPDK can submit more commands to remote NVMf target than allowed by
negotiated queue size. SPDK submits up to SQSIZE commands, but only
SQSIZE-1 are allowed.
Here is a relevant quote from NVMe over Fabrics rev.1.1a ch.2.4.1
“Submission Queue Flow Control Negotiation”:
If SQ flow control is disabled, then the host should limit the number
of outstanding commands for a queue pair to be less than the size of
the Submission Queue. If the controller detects that the number of
outstanding commands for a queue pair is greater than or equal to the
size of the Submission Queue, then the controller shall:
a) stop processing commands and set the Controller Fatal
Status (CSTS.CFS) bit to ‘1’ (refer to section 10.5 in the NVMe Base
specification); and
b) terminate the NVMe Transport connection and end the association
between the host and the controller.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ifbcf5d51911fc4ddcea1f7cde3135571648606f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11413
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
According to NVMe over Fabrics specification (rev.1.1a) HSQSIZE sent
in RDMA_CM_REQUEST private data (ch.7.3.6.4) shall be the same as
SQSIZE later sent in Connect command (ch.3.3).
SPDK NVMe RDMA initiator adjusts SQSIZE to CRQSIZE received from
target in RDMA_CM_ACCEPT private data. Target is allowed to send
CRQSIZE < HSQSIZE if RNR retries are used. So, it is possible that
SQSIZE sent by SPDK will be lower than previously sent HSQSIZE. There
are targets validating this match and they reject connection from
SPDK.
Linux kernel NVMe initiator doesn't perform such adjustments and
connects well to such targets.
This patch aligns SPDK behavior with specification and Linux kernel
implementation.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I01968d1c07d284396fa5939932d85841351d7a45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11350
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There are errors occur that uninitialised value created by a stack allocation when running unittest_accel and unittest_nvme_rdma with valgrind.
Change-Id: I4b48b472cc7c189cbcaf8ca772830a23118e7e17
Signed-off-by: Jaylyn Ren <jaylyn.ren@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10559
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
To allow SO_MINOR updates on LTS for the whole year it is supported,
the major version for all components needs to be increased.
This is to prevent scenario where two versions exists with matching
versions, but conflicting ABI.
Ex. Next SPDK release adds an API call increasing the minor version,
then LTS needs just a subset of those additions.
Increasing major so version after LTS, allows the future releases
to update versions as needed. Yet allowing LTS to increase minor
version separately.
Disabled test for increasing SO version without ABI change, as
that is goal of this patch. This check shall be removed with SPDK 22.05
release.
This patch:
- increases SO_VER by 1 for all components
- resets SO_MINOR to 0 for all components
- removes suppressions for ABI tests
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id1a5358882dc496faa5b0b5c9a63b326c378c551
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11361
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We saw this unexpected behavior by the current SPDK master.
Add the check to clarify this behavior occurs only when we use
Soft RoCE.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3a5eaa9064a0601c65139e7868898545926d0dbf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11225
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
This reverts commit eb09178a59.
Reason for revert:
This caused a degradation for adminq.
For adminq, ctrlr_delete_io_qpair() is not called until ctrlr is destructed.
So necessary delete operations are not done for adminq.
Reverting the patch is practical for now.
Change-Id: Ib55ff81dfe97ee1e2c83876912e851c61f20e354
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10878
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Fix issue: #2320
Only the primary process will do the unmap bar operation as for
the map bar operation.
The DevHandle is process specific and the issue here is the
secondary process's function pointer of DevHandle is not properly
set.
Change-Id: I95dddc76c6ce4be8775b6aaf54699002baffd3b9
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11216
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We should disconnect ADMIN queue pair after shutdown
returned, or we may leak ADMIN socket resources after
free the controller data structure.
Fix issue #2289.
Change-Id: I956191fcd51cdcef5de2c3c7b15ffc70f22b040b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11133
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: <qun.wan@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Admin commands can be sent and polled from any thread, which also means
that the error injection queue on the admin qpair can be accessed from
multiple threads. Therefore, any modifications to that queue should be
done under the ctrlr lock.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib1ed194405cb5b93f65a007b9749fd4433dc367d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11099
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
poll_group_disconnect_qpair() is used only in a single place now
and transport_poll_group_disconnect_qpair() always returns 0 for all
transport.
Let's remove unnecessary processing for return code.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I45d7f8cea2117b3ec00028df234d1eb9ecc65713
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10677
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
nvme_poll_group_disconnect_qpair() is called only by a single place now.
We do not need the flag poll_group_disconnect_in_progress any more.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I8f9c0f14baa8fcb9b0637635a5bb3d34a8b11af5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10673
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_nvme_poll_group_remove() is available only for disconnected
qpairs now. Hence spdk_nvme_poll_group_remove() does not have to
check if qpair is connected and call nvme_ctrlr_disconnect_qpair().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3b05246c4be6adfa3392b8f0e5ecaf274a8a7795
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10846
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Previously, when connecting qpair, we allocated stats per qpair if poll
group is not used or we set stats per poll group otherwise.
Then when deleting qpair, we freed per qpair stats if allocated.
However, if qpair is still not completely disconnected after removing
qpair from poll group, pqpair->stat is use-after-free and it causes
a segmentation fault.
To fix this issue, we set pqpair->stat to &g_dummy_stats instead.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibf303e6db5176e93ed75cbe3a414bb923d6e3ab6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10845
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Previously, when connecting qpair, we allocated stats per qpair
if poll group is not used or we set stats per poll group otherwise.
Then when removing qpair from poll group, we cleared qpair->stats pointer.
However, if qpair is still not completely disconnected after removing
qpair from poll group, tqpair->stats is NULL and it causes a segmentation
fault.
Hence we set tqpair->stats to &g_dummy_stats instead of NULL.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ice6469627ce8d4bf4567f57c304759206b6432f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10844
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_ctrlr_disconnect_qpair() calls nvme_poll_group_disconnect_qpair() if the qpair
uses a poll group, and nvme_poll_group_disconnect_qpair() calls
nvme_ctrlr_disconnect_qpair() if the state of the qpair is not DISCONNECTING.
This relationship made the code very complex.
A few patches starting from this patch simplifies disconnect and free qpair
operations.
This patch swaps the ordering of nvme_ctrlr_disconnect_qpair() and
spdk_nvme_poll_group_remove() in spdk_nvme_ctrlr_free_io_qpair().
This ensures the qpair is disconnected when spdk_nvme_ctrlr_free_io_qpair()
calls spdk_nvme_poll_group_remove().
This enables us to limit spdk_nvme_poll_group_remove() to be available
only for disconnected qpairs.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0601a74f953a2efc4f177a51a4450baea33533d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10670
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is necessary to failover another path when multipath is configured.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0b6bcf63501e38f75efb4b0d6bec58abb4b67aef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10250
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
FUSE has a limitation of 128KiB. Adding a check that returns ENOMEM for
ioctl and logs the error. Applies to both in and out buffers
Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I9ce5fdc413b047a1ec074468be5abf433da26d7f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10855
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Adding metadata support for io commands. Currently metadata is ignored
even if present in the cmd struct. Making metadata adress
readable/writable depending on data transfer bits. Adding extra unit
test to make sure metadata fields are populated.
Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I1d01974a6b2831c82b43e94073065d235eea429a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10854
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Modify admin passthru so that result field of passthru struct is always
populated. This should be safe since dw0 is either reserved or contains
command specific info. This is specifically meant for the namespace
management command when attempting to create a namespace. As per spec:
"Dword 0 of the completion queue entry contains the Namespace Identifier
created.". So for nvme cli and perhaps other application to see what is
the id of the namespace created there needs to be a way to pass the
information back.
Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: Ide4effc126ad9eedac95b0700dd65041ed4b35b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10633
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
-Change cuse ioctl reply from status code to whole status field.
-Add negative test for nvme cli cuse: Power Managment on Namespace
Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I55a88a4f5ace5040f79c05edfc0b8559905bdd2e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10602
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is the only time where we're allowed to invalidate namespace
handles, so use this opportunity to release inactive ones.
Change-Id: I53626ddf30e48e04207078fe406ec6e02138ac9f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10103
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is the count of items in the RB_TREE, so put the two next to each
other.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib30bee12e65065dc414b55e85cfffa2026057e9f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10035
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We only populate active namespaces into the main namespace tree, so we
don't need a separate list of active namespaces too.
Change-Id: Iaf194f806cc1d9672f5567cff3dffafff3165069
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10034
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These are no longer complex enough to warrant being separate functions.
Change-Id: I5f3c9fc904b768b6509283c4b7def686bab9a1d2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10032
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Since this is now sparsely populated, a tree is a better choice.
Change-Id: Ie66d913fa1d298de56a7d22ef55f0adf7f8803b8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10031
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Some subsystems report a very large maximum value for the number of
namespaces, but in essentially every case the subsystem is sparsely
populated with active namespaces. To save memory, don't allocate
objects for the inactive ones.
Change-Id: I4cbeb5a7a898d3c685f4a3a9ec4c2ce45efffb92
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9898
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Each portion of the discovery log has a header which
includes a 'genctr'. This number indicates the
current generation of the discovery log. If this
number changes during the process of fetching the
discovery log in multiple chunks, wait for the
current fetch to complete, but then start over.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5f8623593b7f935eecc37a98daf92e7d8c0dd566
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10813
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Return if outstanding_commands > 0. This reduces
indentation for the rest of the code in the
function and simplifies the diff for an upcoming
patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I49f09eff7c0908829819e6b797c922211c56e7db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We recently improved qpair disconnect process and added assert
if we get a completion without any error when a qpair is disconnected.
However unexpectedly we saw this case very often when we ran the test
test/nvmf/host/multipath.sh for the real hardware in the test pool.
So we remove the assert and change the ERRLOG to INFOLOG.
Fixes one of the issues in #2300
Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Change-Id: Iedbf7e0afa5025da6a810043ba95348ba5b856b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10901
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If nvme ctrlr is resetting or initializing, free_io_qids
bitmap is already freed or not created yet. In that case
an attempt to create IO qpair leads to segmentation fault.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I6a97bf81d5a568db20d23b3f88cf01e994ba42e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10827
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
In current implementation RDMA qpair is destroyed right after
disconnect. That is not graceful qpair shutdown process since
there can be requests submitted to HW and we may receive
completions for already destroyed/freed qpair.
To avoid this, only disconnect qpair in ctrlr_disconnect_qpair
transport callback, all other resources will be released in
ctrlr_delete_io_qpair cb.
This patch is useful when nvme poll groups are used since in
that case we use shared CQ, if the disconnected qpair has WRs
submitted to HW then qpair's destruction will be deferred to
poll group.
When nvme poll groups are not used, this patch doesn't change
anything, in that case destruction flow is still ungraceful.
However since CQ is destroyed immediately after qpair,
we shouldn't receive any requests which point to released
resources. A correct solution for non-poll group case
requires async diconnect API which may lead to significant
rework.
There is a bug when Soft Roce is used - we may receive
a completion with "normal" status when qpair is already
disconnected and all nvme requests are aborted. Added
a workaround for it.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0680d9ef9aaa8737d7a6d1454cd70a384bb8efac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This API is a helper for getting the full discovery
log page from a discovery controller. It will read the
log page header to get the total number of entries,
allocate a buffer for all of the entries, and then
issue a series of get_log_page commands to read each
4KiB worth of entries.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I02666ef5adcb9fc8825a221655811ace708f97b8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10564
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nvme_ctrlr_construct_namespaces
We can just reallocate here to be more efficient.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I8cfc87da23aee6c05ff83aea2165683dddba1dbd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10688
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In the one place this was called, we can call nvme_ns_construct
instead. There's no harm in re-fetching the identify pages.
Change-Id: I91292ff9650bdc7edd5588a05837b671dcac1922
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10102
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The host may have specified a hostnqn to use to connect to
a discovery ctrlr, so we can't just use the default ctrlr
opts to connect - we need to call the probe_cb (if there is
one) to get any options that the host may have specified.
Tested by using discovery_aer tool, creating a subsystem and
listener, and then adding a host on the target side that matches
the hostnqn specified to the discovery_aer tool.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07266e984e0094d3a768e6a0d5ea3a3bd71e32ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10547
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The NVMe bdev module enables asynchronous IO QP creation by default, after
calling `spdk_nvme_ctrlr_alloc_io_qpair` and `spdk_nvme_ctrlr_connect_io_qpair`,
the queue pair is in connecting state at the beginning, then users may call
`spdk_nvme_ctrlr_free_io_qpair` immediately, and the common layer will
change queue state to NVME_QPAIR_DISCONNECTING and NVME_QPAIR_DESTROYING,
so in function `nvme_pcie_ctrlr_delete_io_qpair` the workaround to wait
for create cq/sq callbacks will not be called, instead of using the common
layer queue state here, we should use the internal `pcie_state`.
Fix#2245.
Change-Id: I801caf26563464b135035bf7fa2f63def13de9f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10445
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The NVMe bdev module will support two features, delayed reconnect and
delete after multiple failures of reconnect to improve error recovery.
The recently added two APIs, spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async(), were not good enough.
spdk_nvme_ctrlr_reset_ctx was not necessary. It had only a pointer to ctrlr.
Using a pointer to ctrlr directly saves us from undesirable malloc error
processing.
Separate spdk_nvme_ctrlr_reset_async() into spdk_nvme_ctrlr_disconnect()
and spdk_nvme_ctrlr_reconnect_async(). spdk_nvme_ctrlr_disconnect()
disconnects ctrlr including disconnecting adminq.
spdk_nvme_ctrlr_reconnect_async() moves the ctrlr state to INIT.
Then rename spdk_nvme_ctrlr_reset_poll_async() by
spdk_nvme_ctrlr_reconnect_poll_async().
Finally deprecate spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async().
The following patches will change the NVMe bdev module to use these new APIs.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id1d6858dcdc5fc2e9db0a6ebf3f79cab4f9bbcb7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10091
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Also we don't treat exceptions when getting INTEL log pages
as a fatal error, the initialization will still contine.
Change-Id: Ic2fd2be510fde2679c1546482934d0a180266936
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10341
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Move the CONFIGURE_AER state before SET_KEEP_ALIVE to
make sure that we run the CONFIGURE_AER state for
discovery controllers.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia4e24f6507c43e3fece06b9161ff8e0b8fa0e97d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We will actually run the CONFIGURE_AER state for
discovery controllers in a future patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib114beb886ab4b9214e4525479eb5ec7e038e5d9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10331
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These can produce a lot of output, which doesn't really give any
additional information.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I572cd203d61c717ce6400f67ef27ec1d7bb54c0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10414
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
For error case, just set ctrlr->num_aers to 0, and
then the loop won't execute at all. This avoids an
extra call to nvme_ctrlr_set_state() and simplifies
the code a bit.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iff7bbf6e03d18b5f553b9e8527b4c803db583917
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10330
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Discovery services using the SPDK nvme driver may
use long-lasting connections that detect AER completions
to determine when there are changes in the discovery
log. This means that we still need to send keep alives
on discovery controller admin queues. So move the
SET_KEEP_ALIVE_TIMEOUT state immediately after
IDENTIFY, and run the SET_KEEP_ALIVE_TIMEOUT state
even for discovery controllers.
Note, we need the IDENTIFY's KAS value to properly
set the keep alive timeout, so we have to keep the
IDENTIFY state before SET_KEEP_ALIVE_TIMEOUT.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c6403c28fb72d42629c5f9009a89c4bfd44d162
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10329
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Keep alive is valid for discovery controllers, so don't overwrite
the value requested with zero in nvme_fabric_ctrlr_scan().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7dcda6ebf4ab1c8a9085e4e3a02b814d8e586a97
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10328
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is a bad memory corruption where the code for CUSE attempts to write
one byte (with value 0x1) after the memory is freed.
Context:
When the CUSE device is unregistered, the poller thread is signaled with
fuse_session_exit(), which writes the value 1 to fuse_session::exited.
The poller thread then detects with fuse_session_exited() that it must exit
the routing and finally destroys its own fuse session with
cuse_lowlevel_teardown() before it exits.
However, FUSE may also call fuse_session_exit() for its internal purposes.
I'm not sure exactly under what conditions that happens, but I added
some trace messages and I could clearly see that the CUSE thread exits
before it was requested to exit in cuse_nvme_ns_stop().
If the poller thread early-exits, it would destroy its own FUSE session
(and free the memory) before fuse_session_exit() gets executed, causing
the memory to be corrupted with a single byte of value 0x1.
Reproducer:
The bug can be reproduced by resetting the FUSE session to NULL after it
is destroyed. This will cuse_nvme_ns_stop() to crash with a segmentation
fault in fuse_session_exit() because it tries to access a NULL pointer.
static void *
cuse_thread(void *arg)
{
[...]
free(buf.mem);
fuse_session_reset(cuse_device->session);
+ cuse_device->session = NULL;
pthread_exit(NULL);
}
This fix:
The fix I suggest is to destroy the FUSE session with
cuse_lowlevel_teardown() after the thread is joined.
Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
Change-Id: I47202891a358f139506845110b012f840974b6fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9931
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_nvme_ctrlr_free_io_qpair can be called when
qpair is already disconnected. In that case qpair's
state is changed to NVME_QPAIR_DESTROYING and
transport's ctrlr_delete_io_qpair callback is
called. RDMA and TCP transports call
nvme_transport_ctrlr_disconnect_qpair in
the callback and since qpair's state is
not DISCONNECTED or DISCONNECTING, qpair
is disconnected for the second time.
If spdk_nvme_ctrlr_free_io_qpair is called
when qpair is in ENABLED state than nothing
changes, qpair will be disconnected before destroy.
PCIE/vfio_user don't implement transport disconnect
callback, so they are not affected.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I23e11856ecafb51669acf4a3118be049c11eecda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10326
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
For DSM command, the NVMe drive may take a long time to finish it,
if we set a small timeout value for DSM command, the bdev/nvme module
will try to reset the IO queue pair when timeout happens,
in `spdk_nvme_ctrlr_free_io_qpair`, we will abort the outstanding
IO requests first, then in the `nvme_pcie_ctrlr_delete_io_qpair`,
we will poll the CQ for any requests that have been completed by
the NVMe controller, if there are NVMe completions in the CQ,
we will finish them again, thus double completions happened.
Here we rename `nvme_qpair_abort_reqs` to `nvme_qpair_abort_all_queued_reqs`,
so the common layer will just abort queued request, and let each
transport to abort outstanding requests case by case.
Fix#2233.
Change-Id: Icae6214239160c615418cb514fc51cfe77b59211
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10233
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add a parameter which determines the owner of the
map - target or initiator. It allows to set different
access flags when creating Memory Regions
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0016847fe116e193d0954db1c8e65066b4ff82bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10283
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The unit test test_nvme_cuse_stop() manually creates 2 cuse devices
and executes nvme_cuse_stop(). Problem is that the Fuse session is
never initialized for those 2 cuse devices, causing cuse_nvme_ns_stop()
to access 'ns_device->session', which is a NULL pointer.
This bug is detected by ASAN as follows:
==77298==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000180 (pc 0x7fdac6d7d40e bp 0x000000000000 sp 0x7fff74768320 T0)
==77298==The signal is caused by a READ memory access.
==77298==Hint: address points to the zero page.
0 0x7fdac6d7d40e in fuse_session_destroy (/usr/lib64/libfuse3.so.3+0x1640e)
1 0x40dc7a in cuse_nvme_ns_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:851
2 0x40df59 in cuse_nvme_ctrlr_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:923
3 0x40f103 in nvme_cuse_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:1094
4 0x415803 in test_nvme_cuse_stop /home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut.c:393
5 0x7fdac724c1a6 (/usr/lib64/libcunit.so.1+0x41a6)
6 0x7fdac724c528 (/usr/lib64/libcunit.so.1+0x4528)
7 0x7fdac724d456 in CU_run_all_tests (/usr/lib64/libcunit.so.1+0x5456)
8 0x415a4e in main /home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut.c:415
9 0x7fdac62351e1 in __libc_start_main (/usr/lib64/libc.so.6+0x281e1)
10 0x403ddd in _start (/home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut+0x403ddd)
The fix is to call fuse_session_destroy() only if the fuse session is != NULL.
Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
Change-Id: I41881243227d83e8d1e6b90e72c1b6d62ccd98d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10225
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Wait until the namespace is attached, where it does this operation
again. As of this commit it doesn't really matter because it is just
filling in some values in a structure and if it does it twice it's not a
problem. But later when we only allocate active namespaces, we do not
want to allocate the namespace twice.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ie28653b178975d1ca80bf71ca6b5095224f1c5d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function destructs a single namespace and removes it from the
controller.
Change-Id: I4b7b3576beda85c9ddad4e0f2db6d1964fa72b82
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10024
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This was sometimes used as the maximum array index and sometimes as the
maximum count. Make it consistent everywhere and give it a better name.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I518efd99a7d36584624490b0b3497bb6e81ce9ac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10101
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
list
The list should always be null terminated, but add an additional layer
of buffer overrun protection.
Change-Id: Iee31057fdca5ec4a6177615dff5171e5cb07984e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10027
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This quirk was already applied to the 0x0A53 SSD, but is
likely needed on 0x0A54 as well.
Possible fix for issue #2231.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I36a48ab92d1698a411472f714b5108413bbc3c56
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10162
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We already provides the API `spdk_nvme_ctrlr_is_fabrics` to return
input controller is fabrics controller or not, but it needs a controller
data structure as the input, so here we add another API to do the same
thing and it takes the transport type as the input, with this change,
both nvme and nvmf library can use the API.
Also we should treat UINT8_MAX(255) as valid fabrics transport type.
Change-Id: Ib62e7d3eca3da1ddb1a4cc55b0b62e274522f1ce
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10059
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The qpair's state member is only 3 bits of a uint8_t,
and the in_completion_context bit is another bit in that
same uint8_t.
We know that the qpair's state is only ever updated by
one thread, but it is possible that the state could
be modified by one thread, while another thread
is modifying in_completion_context.
in_completion_context is only modified by the thread
that is polling the qpair (or the qpair's poll group).
But with async mode, another thread that has a qpair
on the same PCIe controller could poll its adminq and
reap the SQ completion for the qpair that's owned by
the other thread.
So do *not* set the generic qpair state to CONNECTED
from the SQ completion callback. Instead just set
the pcie_state to READY, and let the thread that owns
the qpair detect the qpair is READY and set the state
to CONNECTED itself.
Fixes issue #2157.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9efc0c954504f1841e1c3890ae78211ad0d1990e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9975
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
In some cases a single virtually contriguos memory
buffer can be translated to several chunks of memory.
To make such translation possible, update structure
spdk_memory_domain_translation_result to use a pointer
to iovec.
Add a single iov structure or cases where translation
is always 1:1, it will make easier translation callback
implementation. For RDMA transport translation of address
is always 1:1, so treat iovcnt other than 1 as an
error.
Change-Id: I65605575d43a490490eba72c1eb19f3a09d55ec6
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9779
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Instead of a union with domain type specific
parameters, store an opaque pointer to user
context. Depending on the memory domain type,
this context can be cast to a specific struct,
e.g. to spdk_memory_domain_rdma_ctx for RDMA
memory domains.
This change provides more flexibility to
applications to create and manage custom
memory domains
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Change-Id: Ib0a8297de80773d86edc9849beb4cbc693ef5414
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is to help with binding trace objects together and
for the convenience (all trace definitions are in one place
instad of being scattered accross multiple files).
Change-Id: Ib15bc9c2eeee9c4d0816bcee509ab69f3f558e19
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9574
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Detaching a controller with `no_shn_notification` flag set will follow
the regular detach path making it asynchronous too.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0b9c6c23626b4cc1cfaedb3268024776a07b9195
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9003
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The controller detach had asynchronous API (with async/poll), but the
register operations were synchronous, so they would block on fabrics
controllers. In this patch, they're changed to their non-blocking
counterparts, making the detach fully asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I74df12ab40a54f1d675639672e03755c89768bef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8726
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When data digest is enabled for a nvme tcp qpair, we can use accel_fw
to calculate the data crc32c. Then if there are multiple
c2h pdus are coming, we can use both CPU resource directly
and accel_fw framework to caculate the checksum. Then the datao value compare
will not match since we will not update "datao" in the pdu coming order.
For example, if we receive 4 pdus, named as A, B, C, D.
offset data_len (in bytes)
A: 0 8192
B: 8192 4096
C: 12288 8192
D: 20480 4096
For receving the pdu, we hope that we can continue exeution even if
we use the offloading engine in accel_fw. Then in this situation,
if Pdu(C) is offloaded by accel_fw. Then our logic will continue receving
PDU(D). And according to the logic in our code, this time we leverage CPU
to calculate crc32c (Because we only have one active pdu to receive data).
Then we find the expected data offset is still 12288. Because "datao" in tcp_req will
only be updated after calling nvme_tcp_c2h_data_payload_handle function. So
while we enter nvme_tcp_c2h_data_hdr_handle function, we will find the
expected datao value is not as expected compared with the data offset value
contained in Pdu(D).
So the solution is that we create a new variable "expected_datao"
in tcp_req to do the comparation because we want to comply with the tp8000 spec
and do the offset check.
We still need use "datao" to count whether we receive the whole data or not.
So we cannot reuse "datao" variable in an early way. Otherwise, we will
release tcp_req structure early and cause another bug.
PS: This bug was not found early because previously the sw path in accel_fw
directly calculated the crc32c and called the user callback. Now we use a list and the
poller to handle, then it triggers this issue. Definitely, it will be much easier to
trigger this issue if we use real hardware engine.
Fixes#2098
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I10f5938a6342028d08d90820b2c14e4260134d77
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9612
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Abort any queued admin requests once admin queue gets enabled. A request
can get queued if a controller is being reset and it gets submitted
while admin qpair is being reconnected. If these requests aren't
aborted, the init process will stall, as requests don't get resubmitted
while controller is resetting and subsequent admin commands required for
the initialization would be queued too.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If456a297d2d434b3cc741816cbfb13b01d37e963
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Allow to return more than one memory domain.
This change aligns bdev and nvme API and provides
more flexibility for custom transports.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ica9b12ad8463c361be6cb62ee2c0513eec0b486d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9546
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Enable dump of transport stats in functional test.
Update unit tests to support the new statistics
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I815aeea7d07bd33a915f19537d60611ba7101361
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8885
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This function may return an error and we should handle it
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I996ceb6e300bd9527385aded9068b2841e94a20d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9278
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
nvme_transport_poll_group_remove() clears qpair->poll_group. Hence
we should not use it after that.
Fixes#2170
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I57ceee8c66684e2d02b51b8a0f3d66aacbcb9915
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9560
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Renamed nvme_qpair_abort_reqs() to nvme_qpair_abort_reqs_with_cbarg() to
highlight the fact that it only aborts requests with specified cb_arg
and to distinguish it from _nvme_qpair_abort_reqs() which aborts all
requests immediately.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I32fec5ab0501b1beb8605689d73ec42a6424fba5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9323
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When a controller is reset, it goes through the whole init process,
including establishing keep-alive, so there's no point in sending it
during that time. Additionally, it can stall the initialization if it
gets queued when adminq is connecting.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib9fee42f6097972e8ba336958f81e1b6b1a5f006
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9310
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Additionally, this patch removes reading the CC and CSTS registers from
`nvme_ctrlr_process_init()`, as it's no longer needed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If4f9e57dbf249fbce87e90018cff389f59906e38
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8621
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When the controller initialization is failed due to not being able to
read the CSTS register, an ERRLOG is now printed instead of DEBUGLOG.
It should make it easier to debug initialization failures.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I42602bd08e263faf2e87fe4a68abcc82ddec11ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9556
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The CC register is now re-read again when disabling the controller as
preparation for subsequent patches, in which the synchronous CC register
read will be removed from nvme_ctrlr_process_init().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibfc8ed85bab188c3938451fbdfb771b969157807
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8619
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The CSTS reads in DISABLE_WAIT_FOR_READY_(0|1) states are now done
asynchronously.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4ca8ad286e259e8fcfbf484223288554280347fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Checking if the controller is enabled (CC.EN == 1) is now done without
blocking.
Additionally, a copy of the controller configuration register (CC) value
is now stored in spdk_nvme_ctrlr.process_init_cc. It'll be updated in
subsequent patches whenever the register is written / read. This will
make it possible to make several function non-blocking without having
send asynchronous register reads.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I8323cf0c31a5ea282840aab6cf8ca241ce8667be
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8617
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is is the first patch in a series changing all register accesses in
the NVMe controller initialization path to be asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic4df9890992eafb402cf3372fe2ff3ac3c503932
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8615
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This removes some code that was duplicated in the
CHECK_EN and DISABLE_WAIT_FOR_READY_1 states.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie5d175540f71c692f7784c7ff22a48f34b9b7082
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8614
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It will allow the async callbacks to retain the existing timeout while
changing controller's state.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4210f2cf7d4171444c338b8926334b985129a6c7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8613
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It will make it easier to set this timeout once the register accesses
are performed asynchronously.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9d61555a17380ddd57567db8dd912ef6d739fc01
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8612
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It will make it easier to support asynchronous register set/get
functions.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9915609ff940596ae4d67388238cc685dfa426fa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8608
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch introduces asynchronous versions of the ctrlr_(get|set)_reg
functions. Not all transports need to define them - for those that it
doesn't make sense (e.g. PCIe), the transport layer will call the
synchronous API and queue the callback to be executed during the next
process_completions call.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I2e78e72b5eba58340885381cb279f3c28e7995ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8607
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Previously the nvme_ctrlr_state_string() function
was only defined for DEBUG builds since it was only
used in DEBUGLOGs. So now that we're also using
it for an ERRLOG we need to define this function
in release builds as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic14b251fbd7da1519ad0fd4d8a183d6b6ca17984
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9468
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This type of errors is not fatal and can be observed when
qpairs are diconnected. The same approach is used in target
side.
Change-Id: Ic3c7b1731c0cbd2e98d776f0f0c5d82464b3d556
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9416
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is needed for reporting additional information in JSON RPCs
Change-Id: I45da2ea78cd5415f7536130b6793ca29e929b86e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9338
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When poll_group is used, several qpairs share the same CQ
and it is possible to receive a completion with error (e.g.
IBV_WC_WR_FLUSH_ERR) for already disconnected qpair
That happens due to qpair is destroyed while there are
submitted but not completed send/receive Work Requests
To avoid such situation, we should not detroy ibv
qpair until we reap completions for all submitted
send/receive work requests. That requires some
rework in rdma transport and will be implemented
later
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Idb6213d45c2a7954b9ab280f5eb5e021be00505f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9056
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Poll group holds lists of qpairs in different states and
when we got rdma completion with error, we iterate these
lists to find a qpair which qp_num matches. qp_num
is stored inside of ibv_qp which belongs to spdk_rdma_qp
structure. When nvme_rdma_qpair is disconnected, pointer
to spdk_rdma_qp is cleaned but qpair may still exist in
poll group list and when we start searhing for qpair by
qp_num we may dereference NULL pointer.
This patch adds a check that pointer to spdk_rdma_qp
is valid before dereferencing it. To minimize boilerplate code,
wrap all check in macro. Add unit test to verify this fix.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I1925f93efb633fd5c176323d3bbd3641a1a632a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9050
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
John Kariuki tested this patch on a system with
several Intel P5800X Optane SSDs, to determine the
performance impact of adding these two
spdk_trace_records() in the main NVMe I/O path.
The pathological case (512B random reads on a single
Xeon core) decreased from 13.10M to 12.88M, or 1.7%.
Normal workloads (4KB+) would incur a smaller penalty
since the I/O rate would be much lower - maybe even
unnoticeable..
This is a really valuable tracepoint to have enabled
by default, so I think this small amount of degradation
is acceptable.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie2543cadf3541eb74398d31ac0f495522ab49ec0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9303
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This new API signals that the ctrlr will soon be
reset. This allows the transport to skip unnecessary
steps in following calls to the driver prior to the
reset - for example, skipping PCIe DELETE_SQ/CQ
commands when freeing an IO qpair.
Note that if we are deleting a qpair after
prepare_for_reset was called, and the qpair is
still waiting for a CREATE_IO_CQ or CREATE_IO_SQ,
we cannot poll for those commands to complete,
but we also cannot free the qpair immediately.
So set a flag for this case to defer the
destruction until the outstanding CREATE_IO_CQ or
CREATE_IO_SQ callback is invoked (typically as an
aborted command when the reset happens).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I34c6276ae71e7d61ad4a3720f1a985b1ee96bd8b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9249
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Expose the already existing nvme_ctrlr_get_cc as
spdk_nvme_ctrlr_get_regs_cc, similar to spdk_nvme_ctrlr_get_regs_csts and
spdk_nvme_strlr_get_regs_cap etc.
Signed-off-by: Tomasz Bielecki <tomasz.bielecki@wdc.com>
Change-Id: Ibfcf6fbe64dee3719f381184fb728ab6e4d52526
Signed-off-by: Tomasz Bielecki <tomasz.bielecki@wdc.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9220
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
If NN is very, very large, this allocates too much memory. For now, just
use a list.
Change-Id: I904977673d8fb6c86f03c94ba798c6cc07f4a4d8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9301
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This can be done immediately after receiving the controller identify
data for now.
Change-Id: I527a44c4d1f4d3ad2eeb8fc77e07086c2358cac3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9300
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When deleting an IO qpair, make sure that it's connection process is
finished (i.e. create CQ/SQ commands are completed) before freeing it.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I487dcef390d73ff4a7264ff97d965c9030916840
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9279
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This allows for creating admin qpair in an asynchronous mode and I/O
qpairs based on what the user specified in spdk_nvme_io_qpair_opts.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1801ea76d5a08b1bdc558eb0f18648f59454b654
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9077
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Split the connection process across two states, which allows the
transport to connect the admin queue asynchronously.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie84477331df0abf5ffdfc2a0ff5d5ada760c9e73
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9076
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This allows for submitting IO requests before the CONNECT command is
sent and not stopping the connect process due to the CONNECT being
queued. It could happen once IO qpair connection is asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I04e45b1c2f49f9da3412c843ea899341c56a0420
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8624
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This will allow us to call `connect_qpair_poll()` from
`process_completions()`. Otherwise, `nvme_fabric_qpair_connect_poll()`
which calls `process_completions()` might create a loop, which can cause
issues with `nvme_completion_poll_status` used for tracking the CONNECT
command by the fabrics code.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibd67bc3e011b238db264394e005b3268624cceef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8623
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The fabric connect command is now sent without. It will make it
possible to make `nvme_tcp_ctrlr_connect_qpair()` non-blocking too by
moving the polling to process_completions (this will be done in
subsequent patches). Additionally, two extra states,
`NVME_TCP_QPAIR_STATE_FABRIC_CONNECT_SEND` and
`NVME_TCP_QPAIR_STATE_FABRIC_CONNECT_POLL`, were added to keep track of
the state of the connect command. These states are only used by the
initiator code, as the target doesn't need them.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I25c16501e28bb3fbfde416b7c9214f42eb126358
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8605
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These functions will allow for sending the connect fabric command
asynchronously.
Additionally, this patch changes the return code for
`nvme_fabric_qpair_connect()` when a timeout occurs from -EIO to
-ECANCELED. It gives better description of the error as well as make it
more consistent with `nvme_wait_for_completion*` APIs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I95806626d3573ebe4b1568157fd57013c4b909a7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8604
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Since the connect will be completed asynchronously, we
need to keep the pointer around so we can access (and
free it!) later when the command completes.
Also change the code to poll on the status using the
new nvme_wait_for_completion_poll(), as prep for upcoming
patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I28add8f967fd000afed1e50e491a16ea9da16c22
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8603
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This gives us a context that we can poll on when using the
nvme_wait_for_completion framework for async operations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5b23f9096eed5ce95848d1995c1ed0b4c74edc92
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8602
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Modified the async_list to be per-process instead of
on the controller object. This allows an NVMe multi-
process setup to have Asynchronous Events Reported
to each process that may interested in them. In the
previous case, where the async event list was on the
controller object, AER (Async Event Requests) would
not be reported to all the processes.
Fixes: #1874
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: I3e885c0cf5a0fd471d243bc7d96a8b7ffe65d14b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8744
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Cleaning up the API, implementations were removed in 7dbe0e7c.
Signed-off-by: Tomasz Bielecki <tomasz.bielecki@wdc.com>
Change-Id: Iae25bbb301da123f9e784678d4c4f0b8d39f262e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9221
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There may be multiple C2H data pdus recevied.
So we should use the following steps:
1 Use the SPDK_NVME_TCP_C2H_DATA_FLAGS_LAST_PDU
to check whether it is a last pdu or not.
Then we will not cleanup tcp_req, i.e., tcp_req->datao
will not be cleaned.
Then use the SPDK_NVME_TCP_C2H_DATA_FLAGS_SUCCESS
to check whether the controller will use resp pdu
or not.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9dccf2579aadd18f31361444e25bd4b3b76f06c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9192
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This was accidentally merged as part of c3a5848.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I430e7035e9c2baf0b8d0592cda76265957c3791e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9251
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This is exension to the existing nvme_wait_for_completion* interface
that allows the user to poll for request's completion without blocking.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6c5b7203883f8e2fa28ceb039ce63aa50631f571
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8601
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is needed to enable using the nvme_completion_poll
machinery in an asynchronous mode.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I8009ab81bcc02cba685f560be3d6540aab7ef3f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8600
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be done in stages. This patch adds the
nvme_tcp_ctrlr_connect_qpair_poll function and and makes the icreq step
asynchronous. Later patches will expand it and make the
nvme_fabric_qpair_connect part asynchronous as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ief06f783049723131cc2469b15ad8300d51b6f32
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8599
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The pcie layer can't always detect bad addresses
in the request at submission time - for example,
the transport may not have any trackers available
and the request gets queued at the generic
nvme level.
So this means that we might detect vtophys failures
during submission time, or in a process_completions
context - the latter happening when we complete
one request which triggers submitting a new request.
Currently if the vtophys failure happens during
submission context, we return -EFAULT to the
caller *and* call the completion callback. Nowhere
else in the driver do we do both - the intention
has always been that you get one or the other.
So make all of this consistent by tagging the
tracker and the qpair with a flag if we hit a vtophys
error in the submission path. Return 0 to the caller,
who will then later get a completion callback for the
bad request when the qpair is next processed for
completions.
I considered a separate TAILQ to hold these 'bad'
trackers, but that would have required duplicating
quite a bit of the tracker completion code for this
one case. The flag on the pqpair is already in the
hot cacheline, so it's cheap to check it. We will
only interate the outstanding_tr list when that flag
is set, so this should have zero impact to performance.
Fixes issue #2085.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I60b135fb32d899188e51545b69feb1b27758fd7f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9234
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These functions accept extendable structure with IO request options.
The options structure contains a memory domain that can be used to
translate or fetch data, metadata pointer and end-to-end data
protection parameters
Change-Id: I65bfba279904e77539348520c3dfac7aadbe80d9
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6270
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Add a global list of memory domains with reference counter.
Memory domains are used by NVME RDMA qpairs.
Also refactor ibv_resize_cq in nvme_rdma_ut.c to stub
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie58b7e99fcb2c57c967f5dee0417e74845d9e2d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8127
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
This avoids an infite loop when user sends and then immediately aborts
all requests while the transport cannot submit them (returning -EAGAIN).
It might happen in two cases: 1) if the transport doesn't have any free
requests or 2) if the qpair is still in the NVME_QPAIR_CONNECTING state.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia1d0b7c93f387524be0ad39597ec49851e1d3af7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8711
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Needed to make this code path asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6b18d4e9fc1a29b30c78f7f087b80913d00f9726
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8598
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
If a qpair is part of a poll group and it's not configured in the async
mode, it should be using poll group's process_completions variant.
Additionally, connecting qpairs to the poll group was moved up, so that
qpairs are already on the connected qpairs queue when waiting for the
connection to complete.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I08f75bd61a566d1ab60029b6202d9337df75733f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9074
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Replaced poll cycle count with a timeout when destroying a qpair that is
part of a poll group. Tracking the time instead of a poll count is more
stable, as the number of poll cycles can vary based on the application's
behavior when destroying a qpair.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I7445bc1b411f2905aab7bf3dc7b2d3344712e1eb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9200
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Number of active namespaces may change. So, on ANA log page update we
should check if buffer has to be resized.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I1720317ea7f59e5afef73d5c4bd1cd69a7dd6520
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8583
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Log page reading function 'spdk_nvme_ctrlr_cmd_get_log_page' does read
into intermediate buffer and then copy to user provided buffer. So,
there is no need for user buffer to allow DMA.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I7337afa99c3ae666cc43ea2a48317de875334cfc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9177
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
In the next patch, the qpair is polled from a poll group and needs a
disconnect callback, which should also fail the qpair, so it makes sense
to have a separate function doing that.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ied76431520962b25220027be829a4609afb6bbda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
async_mode option is currently supported in PCIe transport layer
to create io qpair asynchronously. User polls the io_qpair for
completions, after create cq and sq completes in order, pqpair
is set to READY state. I/O submitted before the qpair is ready
is queued internally. Currently other transports only support
synchronous io qpair creation.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib2f9043872bd5602274e2508cf1fe9ff4211cabb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8911
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The generic transport layer still does a busy wait, but at least
the logic in the PCIe transport now creates the queue pair
asynchronously.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Change-Id: I9669ccb81a90ee0a36d3f5512bc49c503923b293
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8910
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
.ctrlr_connect_qpair
Previously this was assumed to be a synchronous process so the generic
layer transport code updated the state after .ctrlr_connect_qpair
returned. In preparation for making this support asynchronous mode,
shift that responsibility down into the individual transports.
While none of the transports actually do this asynchronously, insert a
busy wait in nvme_transport_ctrlr_connect_qpair to wait for the qpair to
exit from the CONNECTING state. None of the upper layer code can
actually correct handle a transport doing this asynchronously, so the
busy wait will cover that.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3c1a5c115264ffcb87e549765d891d796e0c81fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8909
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Split the NVMe controller reset into pre-init and reinit stages so
that the latter begins with a call to nvme_ctrlr_process_init(),
returning -EAGAIN if the controller is not yet ready so that a poller
can call it again later.
Signed-off-by: Jonathan Teh <jonathan.teh@mayadata.io>
Change-Id: Ia182b04e438241b139109be93f3ed858fac7f3d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8486
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Implement an async variant of spdk_nvme_ctrlr_reset(). This initial
implementation only allocates a context and returns it to the caller,
relying on the caller to poll the context to execute the existing
spdk_nvme_ctrlr_reset() implementation.
Wire up spdk_nvme_ctrlr_reset() to use this async variant to verify
that NVMe controller reset still works.
Signed-off-by: Jonathan Teh <jonathan.teh@mayadata.io>
Change-Id: I75d4b75dbf5897db452ee65286aef5a4eb839fca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8330
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We find a few files to get the size of a member of a struct. How to
do it is a little complex. So add a macro to do it will be helpful
to read the current code and develop new features.
lib/dif had used member_size() internally but Linux use sizeof_member()
as the macro. Besides, SPDK have used upper case letters for similar
macros, SPDK_CONTAINEROF() and SPDK_COUNTOF(). Hence spdk_member_size()
may be good but propose SPDK_SIZEOF_MEMBER() as the macro.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2179c845a3b75fb71aa039075cc4dfd30617b898
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8738
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
If there are completed asynchronous events that have not been notified
to the user, free them during controller shutdown to avoid memory leaks.
It can happen if an event completes before user has a chance to execute
`spdk_nvme_ctrlr_process_admin_completions()`.
Fixes#2032.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie608bf9100342f8dfd709e070326f67335d27fed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8740
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
NVMe bdev module manages ANA log page itself now. So NVMe driver
should disable managing ANA log page.
Add a new option disable_read_ana_log_page to struct spdk_nvme_ctrlr_opts.
Then NVMe bdev module enables it when calling spdk_nvme_connect_async().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id5249efe90a4d50763c3a7eaa1eb9572f60fbc8c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This fix is as same as for NVMe bdev module.
If a ANA log page has two or more ANA group descriptors, the second
or later of ANA group descriptors will not be 8-bytes aligned.
Then runtime error would occur as follows:
runtime error: member access within misaligned address 0x612000000074
for type 'const struct spdk_nvme_ana_group_descriptor', which requires
8 byte alignment
nvmf_get_ana_log_page() in lib/nvmf/ctrlr.c creates a ANA log page
data and processes 8 bytes alignment correctly because we got the
same runtime error before. However, lib/nvme had been missed at that
time.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idaa610544dc5cb659c387fcd38a2b4b97cbd06e5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8398
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The next patch will add an new controller option, disable_read_ana_log_page.
Initializing ns->ana_state to optimized before reading ANA log page
will simplify the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I34e56a2b454e4c02e1899f972e0ad675f2ebe2a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8312
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Nvmf/vfio-user uses this API to map NVMe command sent from
VM from Guest Physical Address to Host Virtual Address, so
now we moved this API from the nvme library to nvmf/vfio-user
as an internal API.
UT code will be added back in coming patch.
Change-Id: I54817fc9811ccd9ddd97b3aa6762a2fce4bbdda6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Latest nvme-cli (>= 1.13) fails to issue commands towards SPDK's cuse
ctrl device, e.g.:
$ nvme get-feature /dev/spdk/nvme0 -f 1 -s 1 -l 100
nvme_cuse.c: 654:cuse_ctrlr_ioctl: *ERROR*: Unsupported IOCTL 0x4E40.
get-namespace-id: Invalid argument
The reason is because nvme-cli now also sends NVME_IOCTL_ID to the
target device to determine if it's indeed a controller or a ns. In
case kernel returns ENOTTY then nvme-cli considers the device to be
a controller. Since cuse_ctrlr_ioctl() returns EINVAL in such a case
the nvme-cli fails.
To avoid this simply replace EINVAL with ENOTTY for the ioctls that
may be not supported by ctrl or ns device.
nvme-cli commit in question:
fa2b91da74
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: I29003864bc2a5c1a8906d6d01beba3d6f4e31b0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8531
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Returning an error from this function is not useful - there
is nothing the caller can do with that information. So
change the return value to void. Also add ERRLOG and assert
if a transport actually returns a non-zero status, to
force the transport implementer (which must be an out-of-tree
transport) to make changes as necessary.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I402afec045265db178af821d25b99a6dbe066eab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8659
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It is not uncommon for delete_io_qpair to fail, for
example when a controller is hot removed. So even
if SQ or CQ deletion fails, continue with freeing
resources and report success back up the stack.
There is really nothing the application can do to
account for this failing anyways.
Upcoming patches will add additional checks to
ensure failing delete_io_qpair status never gets
propagated to the caller.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iac007c1eba30f7a8c4936b3ffb6c837f28ee12ae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8658
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Fixes#2022
If queued aborts are present when trying to fail a ctrlr
using spdk_nvme_ctrlr_fail(), then the abort command completion
will attempt to retry one of the queued aborts.
This eventually leads to a segfault that can be avoided by not
retrying any queued aborts.
Change-Id: I897dcb8809e16af8bdd39d4381ab531e1cc29822
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8585
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
After correct trstring initialization, it is
overwritten with trstring value of the current
probe ctx. That leads to a problem when initiator
connects to a sbusystem with listeners of different
transport types (e.g. TCP and RDMA). If probe_ctx has
TCP type, than discovery probe initialized probe trid
with trtype=RDMA and trstring=TCP. As results, SPDK
creates TCP controller with trtype=RDMA and we hit
assert in nvme_tcp_qpair function.
Change-Id: I9355450c40c58fa55b016220703f6f7ae36b2571
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8464
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
If we continuous setup and teardown cuse session, It will teardown
uninitialized cuse session and cause segment fault, New function
cuse_session_create will do the session create operation and under
g_cuse_mtx to avoid this issue.
Signed-off-by: Weifeng Su <suweifeng1@huawei.com>
Change-Id: I2b32e81c0990ede00eea6d4ed3a7e44d534d4df3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8231
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This update will allow us to use spdk_nvme_detach_async() and
spdk_nvme_detach_poll_async() easier to aggregate multiple detachments.
Previously, we could do:
spdk_nvme_detach_async()
spdk_nvme_detach_async()
spdk_nvme_detach_async()
and then started doing spdk_nvme_detach_poll_async().
Hence aggregating multiple detachments is already supported.
After this patch, the following sequence is possible:
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = 0
The actual changes is to remove the variable polling_started from
struct spdk_nvme_detach_ctx because it is not necessary anymore.
Clarify this change via updating the header file and CHANGELOG.
Verify this change by unit test.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iebdf6c27c5304a2097b7084c315ccc99634ffa1e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8468
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add a new function spdk_nvme_detach_poll() to simplify a common
use case to continue polling until all detachments complete.
Then use the function for the common use case throughout.
Besides, usage by simple_copy application was not correct, and
fix it in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic14711cd8478bf221c0fe375301e77b395b37f26
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8509
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If the transport returns error when polling for
completions, it gets to a uint32_t and we end up
trying to resubmit all of the requests that are
currently queued. But that's not correct - if
the transport returns an error we shouldn't be
trying to resubmit requests at all.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9198e3e2d71875cc1e46e0ac928338bb983487f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8395
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For receving the pdu, we add the crc32c offloading by Accel framework.
Because the size of to caculate the header digest size is too small, so
we do not offload the header digest.
Change-Id: If2c827a3a4e9d19f0b6d5aa8d89b0823925bd860
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7734
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
siginfo_t is a GNU extension. SPDK (and DPDK) have
direct dependencies on GNU extensions, but it's a bit
nicer if external modules don't also need to define
_GNU_SOURCE. Currently siginfo_t parameter in the
spdk_pci_error_handler is the only thing that violates
this.
Note that DPDK also supports registering sigbus handlers,
but they take the failing address as a parameter instead
of the full siginfo_t structure. Let's adopt the same
for SPDK.
While here, remove an extra semicolon that was just after
the virtio sigbus handler function signature that was
updated in this patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07faf11a3ac3589c637cb2196581c102286b1e68
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8333
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Using the CMB for SQs is not a standard use case.
Performance can vary widely when using CMB for SQs
and is typically not the configuration used for
benchmarking.
So let's change the default value here to 'false',
users can still opt-in by setting this option to
true in the spdk_nvme_ctrlr_opts structure prior
to attach.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iab746ba777b04152ffb92fea2a2bb923a0a0bf21
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8227
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
QEMU 6.0 by default uses a RedHat dev/vendor ID rather
than the Intel one that has always been used to date.
We need the NVME_QUIRK_MAXIMUM_PCI_ACCESS_WIDTH quirk
so that we do not use wide instructions to copy SQEs
to a virtualized CMB, since QEMU does not support
that.
The NVME_INTEL_QUIRK_NO_LOG_PAGES quirk is only needed
for devices with SPDK_PCI_VID_INTEL, so we do not need
to carry this one over to the new REDHAT entry.
Fixes issue #1986.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3d339b3525e7c6ceb792eb9d143e7a922c19344d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8226
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The NVMe driver layer will clear this log, so we don't
need to send another one in the aer callback.
Here we change the logic to compare with previous NS
state, if the NS state is same it will fail the test.
Change-Id: I6d80cb6a5f6d5eab92b8ccac601a23c19cea4003
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8175
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
And for some internal functions we need to pass controller
parameter so that we can do vtophys based on transport type.
Change-Id: I3ca4fa162ec9305f62b295ba21f7474c21edfe52
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8031
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We need to wait to process this quirk until after we
have a valid CAP register value. Before this fix,
controllers with this quirk would get their io_queue_size
always capped at 2 (min io queue size) because CAP hadn't
actually been read yet.
Fixes: f5ba8a5e (nvme: add NVME_CTRLR_STATE_READ_CAP)
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4df87b5dfb0faa21db5b4cf6fc667d80621d1691
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8211
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The inline function can also be used in the coming submit request
function.
Change-Id: If4a5511001e6586dbce0978298beddc537f54d8b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8173
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The PCIE and VFIOUSER both can use this function, the only difference
is VFIOUSER should use IOVA=VA to do the vtophys translation, so
here we will move the function to the common PCIe layer as the first
step.
Change-Id: I699edb67a00a2fa534072fc02ac2dd4a27aba8f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8030
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Each time the following file
"/sys/kernel/config/nvmet/subsystems/nqn_name/namespaces/ns_id/enable"
on the target side was changed, the SPDK initiator should receive an
async event (type: SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE, info:
SPDK_NVME_ASYNC_EVENT_NS_ATTR_CHANGED).
But actually not.
Since for SPDK, when target sent the non-first event, the condition
"nvmet_aen_bit_disabled(ctrl, NVME_AEN_BIT_NS_ATTR)" that prevents
target from sending event was matched.
This commit fix this issue by issuing a get_log_page cmd for each async
event received, just as the kernel initiator does.
Fixes#1825.
Signed-off-by: tyler.sun <tyler.sun@dell.com>
Change-Id: I2973470a81893456ca12e86ac390ea1de0eed62c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7107
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
If a reconnect fails, we restore the original
transport_failure_reason after we're done with
the failed reconnect. Save the original reason
in the qpair itself rather than a local variable,
to facilitate upcoming changes where connect will
be asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I20ff43fc687a379aa5c930e17cf3ff8d730320be
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8116
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Previously we were only checking trtype==PCIE to
determine whether a controller was fabrics. This
skipped the vfio-user case. So use the new
spdk_nvme_transport_id_is_fabrics() API instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I81f26853f44b1c47522ce6354e5aa4a905796bd0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8089
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Nvme-cli submits a RESCAN IOCTL after a format command to
update any information that may have changed during the
format, such as LBA Format. This patch adds support
for RESCAN by executing nvme_ctrlr_update_namespaces to
update the controller information.
Fixes: #1964
Change-Id: I9f03e00a7f39339947ff02390f69ce806e1cfa0e
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8146
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
A Security Receive command with the Security Protocol field cleared to
00h shall return information about the security protocols supported by
the controller, so we can check the TCG security protocol is supported
or not before sending it.
Fix issue #1961.
Change-Id: Id061defe45db981b276e2794fd0b59f8db70b7f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8083
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Initialize ANA state of each namespace to optimized regardless of
whether ANA is supported or not. This will simplify the code to get
the optimal I/O path because we do not have to care if the namespace
supports ANA.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I24dfe08674af398671de6528b884e9d82409eeae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7890
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We map the SPDK_NVME_TRANSPORT_* values directly to
the NVMe-oF trtype values. Since PCIe isn't
Fabrics, we choose 256 which is outside of the
8-bit trtype range of values.
So we can just check if trtype >= 256 to determine
if the trid is for fabrics or not. This is
preferable to checking PCIE || VFIOUSER in case
additional non-fabrics transport types are added
in the future.
I considered taking a trid as the parameter instead,
but went this route since it is consistent with
the existing spdk_nvme_ctrlr_is_discovery().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib62ff4d30549b2324486c81f2dce67f0f1741e9b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8077
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Connect the adminq as part of controller initialization
instead of controller construction.
We never actually 'connected' the adminq for
PCIe or vfio-user transports, since its a nop.
But their connect_qpair transport ops function
is also a nop for the adminq, so it's fine to
generically connect the adminq across all transports.
Note that we cannot read registers (cc or csts)
during controller initialization now until after
the adminq has been connected since reading fabrics
registers depends on a connected adminq. This gets
special cased for now, but eventually reading
cc and csts will need to be part of the state machine
itself to make it asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia5566d7c549d78d24b94ea253df51e697da6237f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8079
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
According to NVMe-oF 1.1 spec, it is not a fatal error.
So according to Figure 126 in NVMe Base specification,
we should return "Transient Transport Error".
Change-Id: I601304ae2bb24508882fb1ec8c7e53ec587ab515
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7795
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>