According to NVMe over Fabrics spec number of SGLs supported by the
controller is reported in MSDBD. But it is also implicitly limited by
command capsule size (IOCCSZ) since SGL are passed in capsule.
This patch adjusts max_sges to capsule size if required. Adjustment to
MSDBD is also moved to transport layer because it is fabrics specific
parameter and is not valid for PCIe transport.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I44918eb949345c61242ca50a524d21d04b6ac058
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11669
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The patch
nvme: Set dnr to zero for nvme_qpair_abort_reqs()
1b3172f726
did the change stated in the title.
However,
Revert "nvme/rdma: Correct qpair disconnect process"
c8f986c7ee
destroyed it for RDMA transport.
Additionally, we had still set DNR to 1 in nvme_qpair_init().
This patch fixes both.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iee60ac24aa7e04cce0f394014c9d9afc9d2b56ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11644
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
SPDK can submit more commands to remote NVMf target than allowed by
negotiated queue size. SPDK submits up to SQSIZE commands, but only
SQSIZE-1 are allowed.
Here is a relevant quote from NVMe over Fabrics rev.1.1a ch.2.4.1
“Submission Queue Flow Control Negotiation”:
If SQ flow control is disabled, then the host should limit the number
of outstanding commands for a queue pair to be less than the size of
the Submission Queue. If the controller detects that the number of
outstanding commands for a queue pair is greater than or equal to the
size of the Submission Queue, then the controller shall:
a) stop processing commands and set the Controller Fatal
Status (CSTS.CFS) bit to ‘1’ (refer to section 10.5 in the NVMe Base
specification); and
b) terminate the NVMe Transport connection and end the association
between the host and the controller.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ifbcf5d51911fc4ddcea1f7cde3135571648606f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11413
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
According to NVMe over Fabrics specification (rev.1.1a) HSQSIZE sent
in RDMA_CM_REQUEST private data (ch.7.3.6.4) shall be the same as
SQSIZE later sent in Connect command (ch.3.3).
SPDK NVMe RDMA initiator adjusts SQSIZE to CRQSIZE received from
target in RDMA_CM_ACCEPT private data. Target is allowed to send
CRQSIZE < HSQSIZE if RNR retries are used. So, it is possible that
SQSIZE sent by SPDK will be lower than previously sent HSQSIZE. There
are targets validating this match and they reject connection from
SPDK.
Linux kernel NVMe initiator doesn't perform such adjustments and
connects well to such targets.
This patch aligns SPDK behavior with specification and Linux kernel
implementation.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I01968d1c07d284396fa5939932d85841351d7a45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11350
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There are errors occur that uninitialised value created by a stack allocation when running unittest_accel and unittest_nvme_rdma with valgrind.
Change-Id: I4b48b472cc7c189cbcaf8ca772830a23118e7e17
Signed-off-by: Jaylyn Ren <jaylyn.ren@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10559
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We saw this unexpected behavior by the current SPDK master.
Add the check to clarify this behavior occurs only when we use
Soft RoCE.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3a5eaa9064a0601c65139e7868898545926d0dbf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11225
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
This reverts commit eb09178a59.
Reason for revert:
This caused a degradation for adminq.
For adminq, ctrlr_delete_io_qpair() is not called until ctrlr is destructed.
So necessary delete operations are not done for adminq.
Reverting the patch is practical for now.
Change-Id: Ib55ff81dfe97ee1e2c83876912e851c61f20e354
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10878
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nvme_poll_group_disconnect_qpair() is called only by a single place now.
We do not need the flag poll_group_disconnect_in_progress any more.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I8f9c0f14baa8fcb9b0637635a5bb3d34a8b11af5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10673
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_nvme_poll_group_remove() is available only for disconnected
qpairs now. Hence spdk_nvme_poll_group_remove() does not have to
check if qpair is connected and call nvme_ctrlr_disconnect_qpair().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3b05246c4be6adfa3392b8f0e5ecaf274a8a7795
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10846
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
This is necessary to failover another path when multipath is configured.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0b6bcf63501e38f75efb4b0d6bec58abb4b67aef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10250
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
We recently improved qpair disconnect process and added assert
if we get a completion without any error when a qpair is disconnected.
However unexpectedly we saw this case very often when we ran the test
test/nvmf/host/multipath.sh for the real hardware in the test pool.
So we remove the assert and change the ERRLOG to INFOLOG.
Fixes one of the issues in #2300
Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Change-Id: Iedbf7e0afa5025da6a810043ba95348ba5b856b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10901
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In current implementation RDMA qpair is destroyed right after
disconnect. That is not graceful qpair shutdown process since
there can be requests submitted to HW and we may receive
completions for already destroyed/freed qpair.
To avoid this, only disconnect qpair in ctrlr_disconnect_qpair
transport callback, all other resources will be released in
ctrlr_delete_io_qpair cb.
This patch is useful when nvme poll groups are used since in
that case we use shared CQ, if the disconnected qpair has WRs
submitted to HW then qpair's destruction will be deferred to
poll group.
When nvme poll groups are not used, this patch doesn't change
anything, in that case destruction flow is still ungraceful.
However since CQ is destroyed immediately after qpair,
we shouldn't receive any requests which point to released
resources. A correct solution for non-poll group case
requires async diconnect API which may lead to significant
rework.
There is a bug when Soft Roce is used - we may receive
a completion with "normal" status when qpair is already
disconnected and all nvme requests are aborted. Added
a workaround for it.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0680d9ef9aaa8737d7a6d1454cd70a384bb8efac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_nvme_ctrlr_free_io_qpair can be called when
qpair is already disconnected. In that case qpair's
state is changed to NVME_QPAIR_DESTROYING and
transport's ctrlr_delete_io_qpair callback is
called. RDMA and TCP transports call
nvme_transport_ctrlr_disconnect_qpair in
the callback and since qpair's state is
not DISCONNECTED or DISCONNECTING, qpair
is disconnected for the second time.
If spdk_nvme_ctrlr_free_io_qpair is called
when qpair is in ENABLED state than nothing
changes, qpair will be disconnected before destroy.
PCIE/vfio_user don't implement transport disconnect
callback, so they are not affected.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I23e11856ecafb51669acf4a3118be049c11eecda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10326
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Add a parameter which determines the owner of the
map - target or initiator. It allows to set different
access flags when creating Memory Regions
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0016847fe116e193d0954db1c8e65066b4ff82bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10283
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In some cases a single virtually contriguos memory
buffer can be translated to several chunks of memory.
To make such translation possible, update structure
spdk_memory_domain_translation_result to use a pointer
to iovec.
Add a single iov structure or cases where translation
is always 1:1, it will make easier translation callback
implementation. For RDMA transport translation of address
is always 1:1, so treat iovcnt other than 1 as an
error.
Change-Id: I65605575d43a490490eba72c1eb19f3a09d55ec6
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9779
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Instead of a union with domain type specific
parameters, store an opaque pointer to user
context. Depending on the memory domain type,
this context can be cast to a specific struct,
e.g. to spdk_memory_domain_rdma_ctx for RDMA
memory domains.
This change provides more flexibility to
applications to create and manage custom
memory domains
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Change-Id: Ib0a8297de80773d86edc9849beb4cbc693ef5414
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Allow to return more than one memory domain.
This change aligns bdev and nvme API and provides
more flexibility for custom transports.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ica9b12ad8463c361be6cb62ee2c0513eec0b486d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9546
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This type of errors is not fatal and can be observed when
qpairs are diconnected. The same approach is used in target
side.
Change-Id: Ic3c7b1731c0cbd2e98d776f0f0c5d82464b3d556
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9416
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When poll_group is used, several qpairs share the same CQ
and it is possible to receive a completion with error (e.g.
IBV_WC_WR_FLUSH_ERR) for already disconnected qpair
That happens due to qpair is destroyed while there are
submitted but not completed send/receive Work Requests
To avoid such situation, we should not detroy ibv
qpair until we reap completions for all submitted
send/receive work requests. That requires some
rework in rdma transport and will be implemented
later
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Idb6213d45c2a7954b9ab280f5eb5e021be00505f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9056
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Poll group holds lists of qpairs in different states and
when we got rdma completion with error, we iterate these
lists to find a qpair which qp_num matches. qp_num
is stored inside of ibv_qp which belongs to spdk_rdma_qp
structure. When nvme_rdma_qpair is disconnected, pointer
to spdk_rdma_qp is cleaned but qpair may still exist in
poll group list and when we start searhing for qpair by
qp_num we may dereference NULL pointer.
This patch adds a check that pointer to spdk_rdma_qp
is valid before dereferencing it. To minimize boilerplate code,
wrap all check in macro. Add unit test to verify this fix.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I1925f93efb633fd5c176323d3bbd3641a1a632a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9050
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These functions accept extendable structure with IO request options.
The options structure contains a memory domain that can be used to
translate or fetch data, metadata pointer and end-to-end data
protection parameters
Change-Id: I65bfba279904e77539348520c3dfac7aadbe80d9
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6270
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Add a global list of memory domains with reference counter.
Memory domains are used by NVME RDMA qpairs.
Also refactor ibv_resize_cq in nvme_rdma_ut.c to stub
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie58b7e99fcb2c57c967f5dee0417e74845d9e2d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8127
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Replaced poll cycle count with a timeout when destroying a qpair that is
part of a poll group. Tracking the time instead of a poll count is more
stable, as the number of poll cycles can vary based on the application's
behavior when destroying a qpair.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I7445bc1b411f2905aab7bf3dc7b2d3344712e1eb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9200
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
async_mode option is currently supported in PCIe transport layer
to create io qpair asynchronously. User polls the io_qpair for
completions, after create cq and sq completes in order, pqpair
is set to READY state. I/O submitted before the qpair is ready
is queued internally. Currently other transports only support
synchronous io qpair creation.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib2f9043872bd5602274e2508cf1fe9ff4211cabb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8911
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
.ctrlr_connect_qpair
Previously this was assumed to be a synchronous process so the generic
layer transport code updated the state after .ctrlr_connect_qpair
returned. In preparation for making this support asynchronous mode,
shift that responsibility down into the individual transports.
While none of the transports actually do this asynchronously, insert a
busy wait in nvme_transport_ctrlr_connect_qpair to wait for the qpair to
exit from the CONNECTING state. None of the upper layer code can
actually correct handle a transport doing this asynchronously, so the
busy wait will cover that.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3c1a5c115264ffcb87e549765d891d796e0c81fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8909
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If the transport returns error when polling for
completions, it gets to a uint32_t and we end up
trying to resubmit all of the requests that are
currently queued. But that's not correct - if
the transport returns an error we shouldn't be
trying to resubmit requests at all.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9198e3e2d71875cc1e46e0ac928338bb983487f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8395
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Connect the adminq as part of controller initialization
instead of controller construction.
We never actually 'connected' the adminq for
PCIe or vfio-user transports, since its a nop.
But their connect_qpair transport ops function
is also a nop for the adminq, so it's fine to
generically connect the adminq across all transports.
Note that we cannot read registers (cc or csts)
during controller initialization now until after
the adminq has been connected since reading fabrics
registers depends on a connected adminq. This gets
special cased for now, but eventually reading
cc and csts will need to be part of the state machine
itself to make it asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia5566d7c549d78d24b94ea253df51e697da6237f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8079
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Read CAP (Capabilities) register as part of controller
initialization instead of controller construction.
For now, still read CAP in the pcie and vfio-user
controller construction, since they need the
drstd (doorbell stride) to construct the admin
queue.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I000fe880f2ec0d6de1d565c883d7ea0ae1ac2c81
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8078
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Read VS (Version) register as part of controller
initialization instead of controller construction.
This prepares for upcoming changes to make
controller attach fully asynchronous. Since reading
fabrics registers is an asynchronous operation, it
will be easier to read the VS register as part of
controller initialization which operates as an
asynchronous state machine.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I771386dbdf5902633e0d9f91b3b20be98f26fdc3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8076
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The new 2 API function allow to get and free stats
per poll group. New function to get transport name
have been added to report not only transport type but
also the name.
For now only RDMA transport reports statistics,
other transports will be added later.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I2824cb474fde5fa859cf8196dabac2c48c05709c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6299
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
New statistics include number of poller calls,
number of idle polls and total number of completions.
These statistics allow to estimate % of idle polls
and the number of completions per poll.
Since nvme_rdma_cq_process_completions function
returns number of completed NVMF requests and each
NVMF request consumes 2 RDMA completions (send+recv),
this function was extended to return the number of
RDMA completions.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ifdc1e2e467f645adb5d66d39ff2a379e161fbd77
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6298
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
These statistics allow to estimate WRs batching
efficiency. The number of send WRs equals the total
number of submitted NVME commands.
Change-Id: I96c9836cd6b9070cf5f62e43b4d2738506866e94
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6297
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It is invalid to try to delete a NULL qpair, so do
not check for it in nvme_tcp_ctrlr_delete_io_qpair and
return an error when NULL. Just change it to an
assert instead. This makes it consistent with pcie
and rdma.
While here, add an assert in rdma as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic2f76deecb21b78749dac85e33fb1fa0d14a1239
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6917
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
If rdma_qp_disconnect is not correctly sent out, we will not wait
for the event.
Change-Id: I99701e421dc93909d481ccf35e9bfd8004e60da8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6163
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Allocate memory with zero number or size, maybe return a unique
pointer rather than NULL. Add a check before common allocation APIs.
Change-Id: I83e07cab5145035e705bc32364652be90f238633
Signed-off-by: Mao Jiang <maox.jiang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5809
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In file included from nvme_rdma_ut.c:36:
/home/clear/spdk/lib/nvme/nvme_rdma.c:651:22: note: ‘bad_send_wr’ was declared here
651 | struct ibv_send_wr *bad_send_wr;
| ^~~~~~~~~~~
In file included from /home/clear/spdk/lib/nvme/nvme_rdma.c:41,
from nvme_rdma_ut.c:36:
/home/clear/spdk/lib/nvme/nvme_rdma.c: In function ‘nvme_rdma_poll_group_process_completions’:
/home/clear/spdk/include/spdk/log.h:132:2: error: ‘bad_send_wr’ may be used uninitialized in
this function [-Werror=maybe-uninitialized]
132 | spdk_log(SPDK_LOG_ERROR, __FILE__, __LINE__, __func__, __VA_ARGS__)
| ^~~~~~~~
cc1: all warnings being treated as errors.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I38ae36756b4bacef7e89f0f1737684c8b8981b12
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4696
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch removes the string from register component.
Removed are all instances in libs or hardcoded in apps.
Starting with this patch literal passed to register,
serves as name for the flag.
All instances of SPDK_LOG_* were replaced with just *
in lowercase.
No actual name change for flags occur in this patch.
Affected are SPDK_LOG_REGISTER_COMPONENT() and
SPDK_*LOG() macros.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I002b232fde57ecf9c6777726b181fc0341f1bb17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4495
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Mellanox Build Bot
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
The issue happens when SPDK RDMA initiator is connected to a remote
target and this target reports rather small (or zero) ICD and we try
to send several SGL descriptors.
Since SGL descriptors are located in ICD, we should check that their
total length fits into ICD. In other case sending such a command
will cause RDMA errors (local length error)
Change-Id: I8c0e8375dae799bc442ed2fab249cad2c4ccce51
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4131
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will allow applications to understand why
they were unable to connect.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic04c7e72098c6ec1823de7d6a07d90150ef5ac20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3836
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
To abort requests whose cb_arg matches, add child abort request greedily.
Iterating all outstanding requests is unique for each transport but
adding child abort is common among transports, and adding child abort
is replaceable by other operations.
Hence add qpair_iterate_requests() function to the function pointer table
of transport, and pass the operation done in the iteration by a
parameter of it.
In each transport, the implementation of qpair_iterate_requests() uses
TAILQ_FOREACH_SAFE() for potential future use cases.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic70d1bf2613fce2566eade26335ceed731f66a89
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2038
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Recently two patches were merged but we should have get more reviews.
The fix done in TCP transport will be better because we can keep
the existing functions and make the code change minimum.
Restore nvme_rdma_req_put() and move removing rdma_req from
rqpair->outstanding_reqs to nvme_rdma_req_complete(). One exception
is the case that only nvme_rdma_req_put() is called. For the case
remove rdma_req from rqpair->outstanding_reqs before calling
nvme_rdma_req_put().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I3f68dbc88c60af6b8f4ecc3209fde9b763ac3189
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3073
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
nvme_rdma_qpair_submit_recvs is not judged in
nvme_rdma_poll_group_process_completions path.
If we do not clean the recvs_to_post.first we
may get the wrong current_num_recvs when the rc
is non-zero and call it again.
Change-Id: If0046e711525dcfcb419132a01fed7a09db13ba0
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3163
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Put the destroying state after the disconnected state.
Because nvme_transport_ctrlr_disconnect_qpair will modify the state
of qpair to disconnected, and in the path of rdma, it will postpone
the deletion of qpair until the release of pg by judging the
destroying state. So qpair is not deleted.
Change-Id: Ica606905cddf67d0ffda14bd48cc5f4e424f01ee
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3136
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>