Commit Graph

10287 Commits

Author SHA1 Message Date
Mike Gerdts
531258aa51 thread: get debug stack traces on spinlocks
To help debug spinlocks, capture stack traces as spinlocks are used.
Future commits in this series will make debugging with these stack
traces easier.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I597b730ca771ea3c5b831f5ba4058d359215f7f6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15998
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-05-02 22:32:01 +00:00
Mike Gerdts
54db60cdb3 bdev_part: allow UUID to be specified
This introduces spdk_bdev_part_construct_ext(), which takes an options
structure as an optional parameter. The options structure has one
option: uuid.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I5e9fdc8e88b78b303e60a0e721d7a74854ac37a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17835
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-05-02 18:59:58 +00:00
Konrad Sztyber
599aee6003 bdev: add extra function when pushing bounce data
This is done in preparation for retrying IOs on ENOMEM when pushing
bounce data.  Also, rename md_buffer to md_buf to keep the naming
consistent with other code which uses this abbreviation.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I014f178a45a2a751ecca40d119f45bf323f37d0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17762
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
28bcf6a760 bdev: retry IOs on ENOMEM from pull/append_copy
The IOs will now be retried after ENOMEM is received when doing memory
domain pull or appending an accel copy.  The retries are performed using
the mechanism that's already in place for IOs completed with
SPDK_BDEV_IO_STATUS_NOMEM.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I284643bf9971338094e14617974f7511f745f24e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17761
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
7952ef88e0 bdev: count push/pull/seq_finish as io_outstanding
The IOs with an outstanding memory domain push/pull or accel sequence
finish operation are now added to the io_outstanding counter.  It'll be
necessary to correctly calculate nomem_threshold when handling ENOMEM
from those operations.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ice1fb94f1c9054a3a96312a0960ac5085d0b21bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17760
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
6ed8bdf7d7 bdev: remove leading underscore from _bdev_io_(inc|dec)rement_outstanding
The leading underscore usually indicate that a function providing the
actual implementation for something that's called from some other
wrapper function without the leading underscore.  That is not the case
for these functions, so this patch removes the leading underscores.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6e1186b156116249ee53a3845ae99ba87db5122b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17868
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
7cb6475ab1 bdev: add _bdev_io_increment_outstanding()
In the next patches we'll need to increment the io_outstanding from a
few more places, so it'll be good to have a dedicated function for that.
Also, move _bdev_io_decrement_outstanding() up, so that both functions
are near each other.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1af5dbe288f7f701c8ba5e85406f02330ae21a39
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17759
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
2023-05-02 18:48:27 +00:00
Konrad Sztyber
7c528fdbe1 bdev: add common sequence finish callback
There are some common operations that need to be done each time a
sequence is executed (and more will be added in the following patches),
so it makes sense to have a common callback. data_transfer_cpl is used
for executing user's callbacks since it's unused at this point.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4570acbdbe158512d13c31c0ee0c7bb7bf62d18c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17678
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
d704e6a025 bdev: keep IOs on the io_memory_domain queue during pull/push
The IOs are now kept on the io_memory_domain queue only if they have an
outstanding pull/push operation.  It'll make it easier to support
retrying pull/push in case of ENOMEM.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If5a54fac532206ee8472bacf364a5ef6cde8edea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17677
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
168bc2673e bdev: allow different ways of handling nomem IOs
This is a preparation for reusing the code handling nomem_io for
other type of NOMEM errors (e.g. from pull/push/append_copy).  This
patch doesn't actually change anything functionally - only IOs completed
by a module with SPDK_BDEV_IO_STATUS_NOMEM status are retried.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I12ecb2efcf2d2cdf75b302f9f766b4c16ac99f3e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17676
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
2023-05-02 18:48:27 +00:00
Konrad Sztyber
252aea5fad bdev: move adding IOs to the nomem_io queue to functions
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0da93c55371652c5725da6cf602fa40391670da3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17867
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
f6339ffdb7 bdev: push bounce data only for successful IOs
The actual memory domain push already only happened for successfully
completed requests, but the code would go still go through
_bdev_io_push_bounce_data_buffer(), which could cause issues for IOs
completed with NOMEM, because the bounce buffer would be released in
_bdev_io_complete_push_bounce_done().

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Id1af1e31cb416e91bf11101a5ce7919530245e1e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17866
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
13b801bf37 bdev: use parent_io when executing sequence for split IOs
The sequence is associated with parent IO, so that's the IO that should
be used when executing a sequence.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ifcdb06094b38a5eaee1691e5aa8de1c8dc9d01a6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17865
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
f20fbfe65b bdev: move pulling md_buf to a function
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I935983a14bedc386ffe31abacc8fa200cd79f750
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17675
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
72a6fff8bb bdev: move pulling data to bounce buffer to a function
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Idbabcd5bd812cede6f5159ba0691b2dc28a4022a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17674
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:48:27 +00:00
Konrad Sztyber
eb8f9bbc99 bdev: move resubmitting nomem IOs to a function
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9f91af30ee1dd5f2568d9f76a30f00497aff6bbc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17673
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-05-02 18:48:27 +00:00
Jim Harris
2e56512236 nvmf: fix comparison in nvmf_stop_listen_disconnect_qpairs
This function disconnects any qpairs that match both
the listen trid and the subsystem pointer.  If the
specified subsystem is NULL, it will just disconnect
all qpairs matching the listen trid.

But there are cases where a qpair doesn't yet have an
associated subsystem - for example, before a CONNECT
is received.

Currently we would always disconnect such a qpair, even
if a subsystem pointer is passed.  Presumably this check
was added to ensure we don't dereference qpair->ctrlr
when it is NULL but it was added incorrectly.

Also while here, move and improve the comment about
skipping qpairs.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8b7988b22799de2a069be692f4a5b4da59c2bad4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17854
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2023-05-02 18:45:32 +00:00
Jim Harris
7c30df4ece usdt: add SPDK_DTRACE_PROBE variants that don't collect ticks
While userspace probes have a high overhead when enabled due
to the trap, it is still cleaner and slightly more efficient
to not have all of the SPDK_DTRACE_PROBE macros implicitly
capture the tsc counter as an argument.

So rename the existing SPDK_DTRACE_PROBE macros to
SPDK_DTRACE_PROBE_TICKS, and create new SPDK_DTRACE_PROBE
macros without the implicit ticks argument.

Note this does cause slight breakage if there is any
out-of-tree code that using SPDK_DTRACE_PROBE previously,
and programs written against those probes would need to
adjust their arguments.  But the likelihood of such code
existing is practically nil, so I'm just renaming the
macros to their ideal state.

All of the nvmf SPDK_DTRACE_PROBE calls are changed to
use the new _TICKS variants.  The event one is left
without _TICKS - we have no in-tree scripts that use
the tsc for that event.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icb965b7b8f13c23d671263326029acb88c82d9df
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17669
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-05-02 18:43:44 +00:00
Alexey Marchuk
4045068a32 lib/nvmf: Defer port removal while qpairs exist in poll group
The following heap-use-after-free may happen when RDMA listener
is removed:
1. At least 2 listeners exist, at least 1 qpair is created
on each listening port
2. Listener A is removed, in nvmf_stop_listen_disconnect_qpairs
we iterate all qpair (let's say A1 and B1) and we check if qpair's
source trid matches listener's trid by calling
nvmf_transport_qpair_get_listen_trid. Trid is retrieved from
qpair->listen_id which points to the listener A cmid. Assume that
qpair's A1 trid matches, A1 starts the disconnect process
3. After iterating all qpairs on step 2 we switch to the next
IO channel and then complete port removal on RDMA transport
layer where we destroy cmid of the listener A
4. Qpair A1 still has IO submitted to bdev, destruction is postponed
5. Listener B is removed, in nvmf_stop_listen_disconnect_qpairs
we iterate all qpairs (A1 and B1) and try to check A1's listen trid.
But listener A is already destroyed, so RDMA qpair->listen_id points
to freed memory chunk

To fix this issue, nvmf_stop_listen_disconnect_qpairs was modified
to ensure that no qpairs with listen_trid == removed_trid exist
before destroying the listener.

Fixes issue #2948

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: Iba263981ff02726f0c850bea90264118289e500c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17287
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:42:44 +00:00
Mike Gerdts
76f4b77726 lvol: esnap clones must end on cluster boundary
When regular lvols are created, their size is rounded up to the next
cluster boundary. This is not acceptable for esnap clones as this means
that the clone may be silently grown larger than external snapshot. This
can cause a variety of problems for the consumer of an esnap clone lvol.

While the better long-term solution is to allow lvol sizes to fall on
any block boundary, the implementation of that needs to be suprisingly
complex to support creation and deletion of snapshots and clones of
esnap clones, inflation, and backward compatibility.

For now, it is best to put in a restriction on the esnap clone size
during creation so as to not hit problems long after creation. Since
lvols are generally expected to be large relative to the cluster size,
it is somewhat unlikely that this restriction will be a significant
limitation.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Id7a628f852a40c8ec2b7146504183943d723deba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17607
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-02 18:32:19 +00:00
Mateusz Kozlowski
ca0c4dcde8 lib/ftl: Give correct type for seq_id variables/return types
Change-Id: I7d2fd31620481cf66f5f4400e6de4fc736ee3dad
Signed-off-by: Mateusz Kozlowski <mateusz.kozlowski@solidigm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17608
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-28 09:48:18 +00:00
Anil Veerabhadrappa
831773b220 nvmf/fc: delegate memory object free to LLD
'args' object in nvmf_fc_adm_evnt_i_t_delete() is actually allocated in
the FC LLD driver and passed to nvmf/fc in nvmf_fc_main_enqueue_event() call.
So this object should be freed in the LLD's callback function.

Change-Id: I04eb0510ad7dd4bef53fc4e0f299f7226b303748
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17836
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-04-28 09:08:48 +00:00
Richael Zhuang
df4600f4c9 nvme_tcp: fix memory leak when resetting controllor
In failover test, it reports memory leak about tqpair->stats when
detaching a tcp controller and it failover to the other controller.

Because during resetting the controller, we disconnect the controller
at first and then reconnect. when disconnecting, the adminq is not
freed which means the corresponding tqpair and tqpair->stats are not
freed. But when reconnecting, nvme_tcp_ctrlr_connect_qpair will
allocate memory for tqpair->stats again which causes memory leak.

So this patch fix the bug by not reallocating memory for tqpair->stats
if it's not NULL. We keep the old stats because from user perspective,
the spdk_nvme_qpair is the same one.

Besides, when destroying a qpair, the qpair->poll_group is set as
NULL which means if qpair->poll_group is not NULL, it should be a
new qpair. So there's no need to check if stats is NULL or not if
qpair->poll_group is not NULL. So adjusting the if...else... in
_nvme_pcie_ctrlr_create_io_qpair.

Change-Id: I4108a980aeffe4797e5bca5b1a8ea89f7457162b
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17718
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-04-27 11:00:03 +00:00
Ziye Yang
cb97b86081 env_dpdk: optimizing spdk_call_unaffinitized
Purpose: Reduce unnecessary affinity setting.

For some usage cases, the app will not use spdk
framework and already call spdk_unaffinitize_thread
after calling spdk_env_init().

Change-Id: I5fa8349913c4567ab63c5a01271e7b2755e53257
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17720
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-04-27 09:38:49 +00:00
Krzysztof Karas
3edc534216 vhost_blk: make sure to_blk_dev() return value is not NULL
Assert that return pointer of to_blk_dev() is not NULL,
before dereferencing it.

Change-Id: I15adeac0926f23f84fdb3af88fc15ac07c580d91
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17536
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-04-27 09:30:42 +00:00
Krzysztof Karas
50e3b7bf31 nvme_transport: return NULL if transport does not exist
spdk_nvme_ctrlr_get_registers() calls nvme_get_transport()
to get a reference for a transport, whose registers should
be returned, but nvme_get_transport() explicitly returns
NULL, if the transport does not exist. This would result
in dereferencing a NULL pointer on line 862.

To remedy that, if no transport was found, return NULL.

Additionally change "THis" to "This" on line 46.

Change-Id: I3944925659991e9424e2177b5c940b2e2626d1f4
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17532
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-04-27 09:29:59 +00:00
Jim Harris
baf250e5e4 nvmf: initialize trid param in get_***_trid paths
When removing a listener, for example with
nvmf_subsystem_remove_listener RPC, we use the concept of a
"listen trid" to determine which existing connections
should be disconnected.

This listen trid has the trtype, adrfam, traddr and trsvcid
defined, but *not* the subnqn.  We use the subsystem pointer
itself to match the subsystem.

nvmf_stop_listen_disconnect_qpairs gets the listen trid
for each qpair, compares it to the trid passed by the
RPC, and if it matches, then it compares the subsystem
pointers and will disconnect the qpair if it matches.

The problem is that the spdk_nvmf_qpair_get_listen_trid
path does not initialize the subnqn to an empty string,
and in this case the caller does not initialize it either.
So sometimes the subnqn on the stack used to get the
qpair's listen trid ends up with some garbage as the subnqn
string, which causes the transport_id_compare to fail, and
then the qpair won't get disconnected even if the other
trid fields and subsystem pointers match.

For the failover.sh test, this means that the qpair doesn't
get disconnected, so we never go down the reset path
on the initiator side and don't see the "Resetting" strings
expected in the log.

This similarly impacts the host/timeout.sh test, which is
also fixed by this patch.  There were multiple failing
signatures, all related to remove_listener not working
correctly due to this bug.

While the get_listen_trid path is the one that caused
these bugs, the get_local_trid and get_peer_trid paths
have similar problems, so they are similarly fixed in
this patch.

Fixes issue #2862.
Fixes issue #2595.
Fixes issue #2865.
Fixes issue #2864.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I36eb519cd1f434d50eebf724ecd6dbc2528288c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17788
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: <sebastian.brzezinka@intel.com>
2023-04-27 09:24:18 +00:00
Mike Gerdts
5b250c0836 vbdev_lvol: load esnaps via examine_config
This introduces an examine_config callback that triggers hotplug of
missing esnap devices.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I5ced2ff26bfd393d2df4fd4718700be30eb48063
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16626
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-26 17:32:13 +00:00
Mike Gerdts
0cea6b57f6 lvol: add spdk_lvol_get_by_* API
spdk_lvol_get_by_uuid() allows lookup of lvols by the lvol's uuid.

spdk_lvol_get_by_names() allows lookup of lvols by the lvol's lvstore
name and lvol name.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Id165a3d17b76e5dde0616091dee5dff8327f44d0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17546
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-26 17:32:13 +00:00
Mike Gerdts
b7d84562cb lvol: add spdk_lvol_iter_immediate_clones()
Add an interator that calls a callback for each clone of a snapshot
volume. This follows the typical pattern of stopping iteration when the
callback returns non-zero.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: If88ad769b72a19ba0993303e89da107db8a6adfc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17545
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-26 17:32:13 +00:00
Mike Gerdts
712f9aa452 lvol: hotplug of missing esnaps
This introduces spdk_lvs_notify_hotplug() to trigger the lvstore to call
the appropriate lvstore's esnap_bs_dev_create() callback for each esnap
clone lvol that is missing the device identified by esnap_id.

Change-Id: I0e2eb26375c62043b0f895197b24d6e056905aa2
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16428
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-04-26 17:32:13 +00:00
Mike Gerdts
f2dbb50516 lvol: keep track of missing external snapshots
If an lvol is opened in degraded mode, keep track of the missing esnap
IDs and which lvols need them. A future commit will make use of this
information to bring lvols out of degraded mode when their external
snapshot device appears.

Change-Id: I55c16ad042a73e46e225369bfff2631958a2ed46
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16427
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-26 17:32:13 +00:00
Mike Gerdts
87666f5286 blob: esnap clones are not clones
spdk_blob_is_clone() should return true only for normal clones. To
detect esnap clones, use spdk_blob_is_esnap_clone(). This also clarifies
documentation of spdk_blob_is_esnap_clone() to match the implementation.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I9993ab60c1a097531a46fb6760124a632f6857cd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17544
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-04-26 17:32:13 +00:00
Mike Gerdts
8b3dcd6191 blob: add is_degraded() to spdk_blob_bs_dev
The health of clones of esnap clones depends on the health of the esnap
clone. This allows recursion through a chain of clones so that degraded
state propagates up from any back_bs_dev that is degraded.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Iadd879d589f6ce4d0b654945db065d304b0c8357
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17517
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-04-26 17:32:13 +00:00
Mike Gerdts
09bf2b2092 blob: add spdk_blob_is_degraded()
In preparation for supporting degraded lvols, spdk_blob_is_degraded() is
added. To support this, bs_dev gains an optional is_degraded() callback.
spdk_blob_is_degraded() returns false so long as no bs_dev that the blob
depends on is degraded. Depended upon bs_devs include the blobstore's
device and the blob's back_bs_dev.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ib02227f5735b00038ed30923813e1d5b57deb1ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17516
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-04-26 17:32:13 +00:00
Mike Gerdts
1db33a8f74 blob: add spdk_blob_get_esnap_bs_dev()
While getting memory domains, vbdev_lvol will need to be able to access
the bdev that acts as the lvol's external snapshot. The introduction of
spdk_blob_get_esnap_bs_dev() facilitates this access.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I604c957a468392d40b824c3d2afb00cbfe89cd21
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16429
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-04-26 17:32:13 +00:00
Konrad Sztyber
55d6cc0eae accel: add method for getting per-channel opcode stats
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic3cc0ddc5907e113b6d9d752c9bff0f526458a11
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17625
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2023-04-26 11:15:40 +00:00
Konrad Sztyber
d7b29fb9d5 accel: collect stats on the number of processed bytes
For operations that have differently sized input/output buffers (e.g.
compress, decompress), the size of the src buffer is recorded.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1ee47a2e678ac1b5172ad3d8da6ab548e1aa3631
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17624
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-04-26 11:15:40 +00:00
Konrad Sztyber
7c621ff206 accel: specify number of events when updating stats
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5b611c8978b581ac504b033e1f335a2e10a9315b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17623
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-26 11:15:40 +00:00
Konrad Sztyber
0de931dc6b accel: move accel_get_iovlen() up
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6117057a1e3812386a0fb7a10e07978415a48261
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17622
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2023-04-26 11:15:40 +00:00
Konrad Sztyber
9a377ecb22 accel: append support for crc32c
It is now possible to append an operation calculating crc32c to an accel
sequence.  A crc32c operation needs special care when it's part of a
sequence, because it doesn't have a destination buffer.  It means that
we can remove copy operations following crc32c only when it's possible
to change the dst buffer of the operation preceding crc32c.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I29204ce52d635162d2202136609f8f8f33db312d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17427
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-04-26 11:15:40 +00:00
Konrad Sztyber
2b1ad70c4c accel: check operation type in accel_task_set_dstbuf()
This will reduce the amount of changes in the following patch which
makes this function recursive.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If8da6ae52d78358b66b2d9303413a9723687a767
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17568
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-04-26 11:15:40 +00:00
Mike Gerdts
b0c93eb3fb accel: destroy g_stats_lock during finish
g_stats_lock is an spdk_spin_lock that is initialized as the module is
loading. With this change, it is destroyed as the module finishes.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I5263547f6d0e8981765d59665bd826cf07a6f83e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17681
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-04-26 11:06:02 +00:00
Konrad Sztyber
bade2d8db5 accel: delay finish until all IO channels are released
This ensures that there are no more outstanding operations, so we can
safely free any global resources.

Fixes #2987

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iac423b4f2a1183278d1db20f96c1a3b1bb657f85
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17767
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-26 11:06:02 +00:00
Jim Harris
e407385e03 env_dpdk: add ERRLOGs to help debug issue #2983
Issue #2983 shows a case where we seem to get a
device remove notification from DPDK (via vfio
path) after we have already detached the device
explicitly by SPDK.

This issue has proven difficult to reproduce
outside of the one observed failure so far, so
adding a couple of ERRLOGs into this path to help
confirm the this theory should it happen again.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0fda4229fe150ca17417b227e8587cd7fbda6692
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17631
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-25 16:54:59 +00:00
Sebastian Brzezinka
737667e155 lib/env_ocf: place allocator variable on hugepages
When using `__lsan_do_recoverable_leak_check` (e.g when fuzzing),
to check for leaks during runtime. Leak sanitizer can not follow
reference of memory that is allocated on heap (e.g. calloc)
and then stored on hugepage causing lsan to incorrectly report
direct leak.

Fixes #2967

Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I3511e117a07ca8daa96f19bf1437c0d788b64cb1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17682
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Amir Haroush <amir.haroush@huawei.com>
2023-04-21 23:49:28 +00:00
Jim Harris
672710c8fc nvme/tcp: increase timeout for async icreq response
This was arbitrarily picked as 2 seconds in commit
0e3dbd. But for extremely high connection count
use cases, such as nvme-perf with several cores
and high connection count per core, this 2 second
time window can get exceeded.

So increase this to 10 seconds, but only for qpairs
that are being connected asynchronously.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I906ca9e6561b778613c80b739a20bd72c807216c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17619
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-04-20 10:56:42 +00:00
Marcin Spiewak
324e3261e6 lib/lvol: lvs_load() shall return, if options are invalid
lvs_load() function verifies if options passed to it
are valid, but doesn't return, if they are not (only error
is logged and callback is called with -EINVAL code). Now
it is corrected and the function ends after the error
is reported.

Change-Id: I19b0b22466b6980345477f62084d27ef13414752
Signed-off-by: Marcin Spiewak <marcin.spiewak@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17582
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-04-19 06:37:29 +00:00
Konrad Sztyber
c5efdd55c2 accel: move merging dst buffer to a function
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I62b73f1802a9de35767b72c2cc4ee115e895c538
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17426
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-19 06:36:20 +00:00
Konrad Sztyber
3e0e077939 accel: copy memory domain context when merging tasks
When changing src/dst buffers, we copied memory domain pointers, but we
didn't copy memory domain context, which is obviously incorrect.  It was
probably missed, because we never append a copy with non-NULL memory
domain.  Added a unit test case to verify this behavior.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic174e0e72c33d3f437f0faddd3405638049f0c74
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17425
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-04-19 06:36:20 +00:00