Commit Graph

2142 Commits

Author SHA1 Message Date
Artur Paszkiewicz
882ecb55a8 util/uuid: add API to test/set null uuid
Refactor the code to use these new functions.

Change-Id: I21ee7e9a96f30fbd60106add5e8b071e86bf93c9
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Jim Harris
8de18d3ccb nvmf: use iterator APIs in nvmf_tgt_destroy_cb
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I27b1b851fc8f47150670636cb65ccba40d1a57d6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17961
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
2023-05-09 17:58:11 +08:00
Jim Harris
c4ce596187 nvmf: refactor nvmf_tgt_destroy_cb
This preps for some upcoming patches as well as
removing two levels of indentation.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4f685c1e44ec4aa261e68af1786cfc110f451ed5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17960
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-05-09 17:58:11 +08:00
Jim Harris
c387766501 nvmf: use iterator APIs in nvmf_tgt_create_poll_group()
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4d9a5dd4655edb8315503e7551aec1926d1cc017
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17959
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-05-09 17:58:11 +08:00
Jim Harris
37d0433fc3 nvmf: use iterator APIs to generate discovery log
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iedd1c0a92e8b5f839ad4905d8063a04ec47f3d9b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17938
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-05-09 17:58:11 +08:00
Jim Harris
c79cfb193b nvmf: fix comparison in nvmf_stop_listen_disconnect_qpairs
This function disconnects any qpairs that match both
the listen trid and the subsystem pointer.  If the
specified subsystem is NULL, it will just disconnect
all qpairs matching the listen trid.

But there are cases where a qpair doesn't yet have an
associated subsystem - for example, before a CONNECT
is received.

Currently we would always disconnect such a qpair, even
if a subsystem pointer is passed.  Presumably this check
was added to ensure we don't dereference qpair->ctrlr
when it is NULL but it was added incorrectly.
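
A minimal sketch of the corrected check, assuming simplified stand-in types (all example_* names are hypothetical and are not SPDK definitions). A qpair that has no controller yet is only disconnected when no subsystem filter was passed:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SPDK structures, for illustration only. */
struct example_subsystem { int id; };
struct example_ctrlr { struct example_subsystem *subsys; };
struct example_qpair { struct example_ctrlr *ctrlr; };

/* Returns true if the qpair should be disconnected when a listener is removed. */
static bool
example_should_disconnect(bool trid_matches, const struct example_qpair *qpair,
			  const struct example_subsystem *subsystem)
{
	if (!trid_matches) {
		return false;
	}
	if (subsystem == NULL) {
		/* No subsystem filter: every qpair on the listen trid goes away. */
		return true;
	}
	/* A qpair without a ctrlr (no CONNECT yet) cannot match a subsystem. */
	return qpair->ctrlr != NULL && qpair->ctrlr->subsys == subsystem;
}
```

The NULL test on ctrlr is combined with the subsystem comparison using a logical AND, so it guards the dereference without turning "no controller" into an unconditional match.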

Also while here, move and improve the comment about
skipping qpairs.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8b7988b22799de2a069be692f4a5b4da59c2bad4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17854
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2023-05-09 17:58:11 +08:00
Jim Harris
4a47f1f926 usdt: add SPDK_DTRACE_PROBE variants that don't collect ticks
While userspace probes have a high overhead when enabled due
to the trap, it is still cleaner and slightly more efficient
to not have all of the SPDK_DTRACE_PROBE macros implicitly
capture the tsc counter as an argument.

So rename the existing SPDK_DTRACE_PROBE macros to
SPDK_DTRACE_PROBE_TICKS, and create new SPDK_DTRACE_PROBE
macros without the implicit ticks argument.

Note this does cause slight breakage if there is any
out-of-tree code that used SPDK_DTRACE_PROBE previously,
and programs written against those probes would need to
adjust their arguments.  But the likelihood of such code
existing is practically nil, so I'm just renaming the
macros to their ideal state.

All of the nvmf SPDK_DTRACE_PROBE calls are changed to
use the new _TICKS variants.  The event one is left
without _TICKS - we have no in-tree scripts that use
the tsc for that event.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icb965b7b8f13c23d671263326029acb88c82d9df
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17669
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-05-09 17:58:11 +08:00
Alexey Marchuk
e16f4bc7ce lib/nvmf: Defer port removal while qpairs exist in poll group
The following heap-use-after-free may happen when RDMA listener
is removed:
1. At least 2 listeners exist, at least 1 qpair is created
on each listening port
2. Listener A is removed, in nvmf_stop_listen_disconnect_qpairs
we iterate all qpairs (let's say A1 and B1) and we check if qpair's
source trid matches listener's trid by calling
nvmf_transport_qpair_get_listen_trid. Trid is retrieved from
qpair->listen_id which points to the listener A cmid. Assume that
qpair's A1 trid matches, A1 starts the disconnect process
3. After iterating all qpairs on step 2 we switch to the next
IO channel and then complete port removal on RDMA transport
layer where we destroy cmid of the listener A
4. Qpair A1 still has IO submitted to bdev, destruction is postponed
5. Listener B is removed, in nvmf_stop_listen_disconnect_qpairs
we iterate all qpairs (A1 and B1) and try to check A1's listen trid.
But listener A is already destroyed, so RDMA qpair->listen_id points
to freed memory chunk

To fix this issue, nvmf_stop_listen_disconnect_qpairs was modified
to ensure that no qpairs with listen_trid == removed_trid exist
before destroying the listener.

Fixes issue #2948

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: Iba263981ff02726f0c850bea90264118289e500c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17287
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-09 17:58:11 +08:00
Anil Veerabhadrappa
bf84d7d814 nvmf/fc: delegate memory object free to LLD
'args' object in nvmf_fc_adm_evnt_i_t_delete() is actually allocated in
the FC LLD driver and passed to nvmf/fc in nvmf_fc_main_enqueue_event() call.
So this object should be freed in the LLD's callback function.

Change-Id: I04eb0510ad7dd4bef53fc4e0f299f7226b303748
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17836
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-05-09 17:58:11 +08:00
Jim Harris
d333b553f3 nvmf: initialize trid param in get_***_trid paths
When removing a listener, for example with
nvmf_subsystem_remove_listener RPC, we use the concept of a
"listen trid" to determine which existing connections
should be disconnected.

This listen trid has the trtype, adrfam, traddr and trsvcid
defined, but *not* the subnqn.  We use the subsystem pointer
itself to match the subsystem.

nvmf_stop_listen_disconnect_qpairs gets the listen trid
for each qpair, compares it to the trid passed by the
RPC, and if it matches, then it compares the subsystem
pointers and will disconnect the qpair if it matches.

The problem is that the spdk_nvmf_qpair_get_listen_trid
path does not initialize the subnqn to an empty string,
and in this case the caller does not initialize it either.
So sometimes the subnqn on the stack used to get the
qpair's listen trid ends up with some garbage as the subnqn
string, which causes the transport_id_compare to fail, and
then the qpair won't get disconnected even if the other
trid fields and subsystem pointers match.
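
The failure mode can be sketched as follows, assuming a reduced transport ID struct and a string-based compare (all example_* names, the address and the port are hypothetical, not the real SPDK types or values). The "garbage" is written deliberately so the sketch never reads uninitialized memory:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, reduced transport ID; not the real spdk_nvme_transport_id. */
struct example_trid {
	char traddr[32];
	char trsvcid[8];
	char subnqn[64];
};

static bool
example_trid_equal(const struct example_trid *a, const struct example_trid *b)
{
	return strcmp(a->traddr, b->traddr) == 0 &&
	       strcmp(a->trsvcid, b->trsvcid) == 0 &&
	       strcmp(a->subnqn, b->subnqn) == 0;
}

/* Fills in the address fields; clears subnqn only when asked to. */
static void
example_get_listen_trid(struct example_trid *out, bool init_subnqn)
{
	snprintf(out->traddr, sizeof(out->traddr), "%s", "10.0.0.1");
	snprintf(out->trsvcid, sizeof(out->trsvcid), "%s", "4420");
	if (init_subnqn) {
		out->subnqn[0] = '\0';
	}
}

/* Simulates the caller: the qpair trid starts out as leftover stack contents. */
static bool
example_listener_matches(bool init_subnqn)
{
	struct example_trid rpc_trid = {0};
	struct example_trid qpair_trid;

	example_get_listen_trid(&rpc_trid, true);

	/* Deterministic stand-in for stack garbage, kept NUL-terminated. */
	memset(&qpair_trid, 'X', sizeof(qpair_trid));
	qpair_trid.subnqn[sizeof(qpair_trid.subnqn) - 1] = '\0';

	example_get_listen_trid(&qpair_trid, init_subnqn);
	return example_trid_equal(&rpc_trid, &qpair_trid);
}
```

With init_subnqn set to false the compare fails on the subnqn field alone, even though the address and service ID are identical; with it set to true the two trids match.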

For the failover.sh test, this means that the qpair doesn't
get disconnected, so we never go down the reset path
on the initiator side and don't see the "Resetting" strings
expected in the log.

This similarly impacts the host/timeout.sh test, which is
also fixed by this patch.  There were multiple failing
signatures, all related to remove_listener not working
correctly due to this bug.

While the get_listen_trid path is the one that caused
these bugs, the get_local_trid and get_peer_trid paths
have similar problems, so they are similarly fixed in
this patch.

Fixes issue #2862.
Fixes issue #2595.
Fixes issue #2865.
Fixes issue #2864.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I36eb519cd1f434d50eebf724ecd6dbc2528288c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17788
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: <sebastian.brzezinka@intel.com>
2023-05-09 17:58:11 +08:00
Alexey Marchuk
ec13730033 lib/nvmf: Deprecate cb_fn in spdk_nvmf_qpair_disconnect
Handling this callback is quite complex and may lead to
various problems. In most places, the actual event
when a qpair is disconnected is not important for the
app logic. Only in the shutdown path do we need to be sure
that all qpairs are disconnected, which can be achieved
by checking the poll_group::qpairs list.

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I453961299f67342c1193dc622685aefb46bfceb6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17165
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-05-09 17:58:11 +08:00
Alexey Marchuk
b0ef9637e5 lib/nvmf: Update spdk_nvmf_qpair_disconnect return value
If the qpair is already in the process of disconnect,
the spdk_nvmf_qpair_disconnect API now returns -EINPROGRESS
and doesn't call the callback passed by the user.
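
A minimal sketch of this return-value contract, assuming a hypothetical qpair struct with a single state flag (this is not the SPDK implementation, which tracks more state):

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical qpair with just enough state to show the contract. */
struct example_qpair {
	bool disconnecting;
};

/* Returns 0 when the disconnect is started, -EINPROGRESS when one is already running. */
static int
example_qpair_disconnect(struct example_qpair *qpair)
{
	if (qpair->disconnecting) {
		/* A disconnect is already under way; no callback is invoked. */
		return -EINPROGRESS;
	}
	qpair->disconnecting = true;
	return 0;
}
```

A caller that invokes this twice on the same qpair sees 0 on the first call and -EINPROGRESS on the second, and can treat the second result as "nothing more to do".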

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: If996b0496bf15729654d18771756b736e41812ae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17164
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-09 17:58:11 +08:00
Alexey Marchuk
49415f8ece lib/nvmf: Do not use cb_fn in spdk_nvmf_qpair_disconnect
Current implementation of spdk_nvmf_qpair_disconnect
saves and calls user's callback correctly only on
the first call. If this function is called when the
qpair is already in the process of disconnecting, the
cb_fn is called immediately, which may lead to stack
overflow.

In most places this function is called with
cb_fn = NULL, which means that the actual qpair disconnect
is not important for the app logic. Only in a few
places (nvmf tgt shutdown flow) is it important to
wait for all qpairs to be disconnected.

Taking into account complexity related to possible stack
overflow, do not pass the cb_fn to spdk_nvmf_qpair_disconnect.
Instead, wait until a list of qpairs is empty in shutdown path.

Next patches will change spdk_nvmf_qpair_disconnect behaviour
when disconnect is in progress and deprecate cb_fn and ctx
parameters.

Fixes issue #2765

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: Ie8d49c88cc009b774b45adab3e37c4dde4395549
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17163
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Alexey Marchuk
d471bca4cf nvmf/vfio_user: Post SQ delete cpl when qpair is destroyed
This patch removes usage of cb_fn argument of
spdk_nvmf_qpair_disconnect API. Instead of relying
on the callback, post a completion on delete SQ
command when transport qpair_fini is called.

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I68dec97ea94e89f48a8667da82f88b5e24fc0d88
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17168
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-05-09 17:58:11 +08:00
Ben Walker
43e68a8b1f nvmf/tcp: Wait for PDUs to release when closing a qpair
In the presence of hardware offload (for data digest) we may not be
able to immediately release all PDUs to free a connection. Add a
state to wait for them to finish.

Fixes #2862

Change-Id: I5ecbdad394c0296af6f5c2310d7867dd9de154cb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16637
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-09 17:58:11 +08:00
Jim Harris
c12d468d02 nvmf: retry QID check if duplicate detected
A host will consider a QID as reusable once it disconnects
from the target.  But our target does not immediately
free the QID's bit from the ctrlr->qpair_mask - it waits
until after a message is sent to the ctrlr's thread.

So this opens up a small window where the host makes
a valid connection with a recently freed QID, but the
target rejects it.

When this happens, we will now start a 100us poller and
recheck.  This will give those messages time to
execute in this case, and avoid unnecessarily rejecting
the CONNECT command.

Tested with local patch that injects 10us delay before
clearing bit in qpair_mask, along with fused_ordering
test that allocates and frees qpair in quick succession.
Also tested with unit tests added in this patch.

Fixes issue #2955.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I850b895c29d86be9c5070a0e6126657e7a0578fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17362
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-05-09 17:58:11 +08:00
Swapnil Ingle
5a2d23c5f1 nvmf/vfio_user: move cq_is_full() closer to caller
Move cq_is_full() closer to its caller post_completion(), and fix
comments along the way.

Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I93262d1805f0f9f075c6946ed97cd3006ffba130
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16415
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-05-09 17:58:11 +08:00
Jim Harris
3741a85228 nvmf: simplify ctrlr_add_qpair_and_update_rsp code path
This prepares for upcoming patch to enable retrying
CONNECT commands when a duplicate QID is detected.

Most important part is moving the spdk_nvmf_request_complete
call into the ctrlr_add_qpair function.  This facilitates
deferring the spdk_nvmf_request_complete call when we want
to do a retry.

Note: for adding admin qpair, we now call
spdk_nvmf_request_complete instead of _nvmf_request_complete,
but this is actually no functional change.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1282be00441851ee2ed3c2dd281e68b4475d3d28
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17361
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-04-04 23:01:55 +00:00
Jim Harris
3b54b0d70a nvmf: move qid check to nvmf_ctrlr_add_io_qpair
The max qid check is only needed in the add_io_qpair
function, since an admin qpair could never fail this
check.

Upcoming patches will retry a duplicate QID check
after a short period of time.  This patch helps
make the upcoming code path more clear, since there
is no need to do this max qid check more than once.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I95f69fc39ae7989d51d36b0d04b5d9a4087a7c4a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17360
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-04-04 23:01:55 +00:00
Szulik, Maciej
414ff9bc23 nvmf: make async event and error related functions public
This patch makes functions related to Asynchronous Event and error
handling public, so that they can be used in custom nvmf transport
compiled out of SPDK tree.

Signed-off-by: Szulik, Maciej <maciej.szulik@intel.com>
Change-Id: I253bb7cfc98ea3012c179a709a3337c36b36cb0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17237
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-03-31 17:41:35 +00:00
Jim Harris
3b138377e2 nvmf/tcp, nvmf/rdma: default to dynamic buf_cache_size
The nvmf generic transport code creates a mempool of
I/O buffers, as well as its own per-thread cache
of those buffers.  The mempool was being created
with a non-zero mempool cache, effectively duplicating
work - we had a cache in the mempool and then another
in the transport layer.

So patch 019cbb9 removed the mempool cache, but the
tcp transport was significantly affected by it.  It
uses a default per-thread cache of 32 buffers, which is
very small, so it was actually mostly relying on the
mempool cache (which was 512).  Performance regression
tests caught this problem, and Karol verified that
specifying a higher buf_cache_size fixed the problem.

So change both the tcp and rdma transports to specify
UINT32_MAX as the default buf_cache_size.  If the
user does not override this when creating the transport,
it will be dynamically sized based on the size of
the buffer pool and the number of poll groups.

Fixes: 019cbb9 ("nvmf: disable data buf mempool cache")

Fixes issue #2934.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idd43e99312d59940ca68402299e264cc187bfccd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17203
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
2023-03-28 20:17:21 +00:00
Jim Harris
3092c61d26 nvmf: enable dynamic buf_cache_size calculation
Allow transports to specify a default UINT32_MAX
as the buf_cache_size. If user does not override this
when creating the transport, calculate the buf_cache_size
dynamically using the number of poll groups and the
size of the buffer pool (num_shared_buffers).  We will
allocate 75% of the buffers for the caches, meaning
the buf_cache_size will be calculated as:

(num_shared_buffers * 3 / 4) / num_poll_groups
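
As a sketch of that calculation, assuming a hypothetical helper name and a guard for a zero poll group count that the message does not mention:

```c
#include <stdint.h>

/* Hypothetical helper, not the SPDK function: picks the per-poll-group cache size. */
static uint32_t
example_calc_buf_cache_size(uint32_t buf_cache_size, uint32_t num_shared_buffers,
			    uint32_t num_poll_groups)
{
	if (buf_cache_size != UINT32_MAX) {
		/* The user overrode the default, so keep the explicit value. */
		return buf_cache_size;
	}
	if (num_poll_groups == 0) {
		/* Assumption: nothing to size a cache for yet. */
		return 0;
	}
	/* 75% of the shared buffers, split evenly; 64-bit math avoids overflow. */
	return (uint32_t)(((uint64_t)num_shared_buffers * 3 / 4) / num_poll_groups);
}
```

For example, 4096 shared buffers and 8 poll groups give (4096 * 3 / 4) / 8 = 384 buffers per poll group cache.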

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I97768aea701060bbe0ff1925e5322229fa8d051c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17334
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-03-28 20:17:21 +00:00
Jim Harris
280a3abc9c nvmf: return early in nvmf_transport_poll_group_create
When buf_cache_size is 0, just return early.  This
allows us to un-indent a large section of code.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I167da677fdcd0504c6f2bfdb8b1a818155642f66
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17333
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-03-28 20:17:21 +00:00
Jim Harris
2597ebbede nvmf: point poll_groups back to their spdk_nvmf_tgt
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie7eaeb3aa65f0a8f8f9e811d025045fff7f77724
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-03-28 20:17:21 +00:00
Jim Harris
f9424ae73d nvmf: track num_poll_groups in spdk_nvmf_tgt
This will be useful in upcoming patch, where we
use the number of poll groups to dynamically pick
the buf_cache_size for each transport poll group.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id166098244287c56f12cdd88ba27a17fa34a4348
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17331
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-03-28 20:17:21 +00:00
Rui Chang
d0516312ff nvmf: add copy command support in get log page
Add copy command support in get log page and identify tool.

Change-Id: I8771ffb193fc80ffc12f068993005e5702f41a0d
Signed-off-by: Rui Chang <rui.chang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17162
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-03-27 11:25:35 +00:00
Rui Chang
4274fe55c9 nvmf/vfio-user: add copy support in vfio-user
Fix req length issue in supporting copy command in vfio-user.

Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: If4ec325777e1a1f00d15edb2fea4dc85016b3b95
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17279
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-03-24 07:26:14 +00:00
Peng Lian
b13ee3005d nvmf: clean sgroup->queued in _nvmf_qpair_destroy when ctrlr is NULL
Let us consider the following process:
  1. one fabric connect request A comes but the subsystem is paused
     due to adding/removing ns or other operations, so this request A
     will be put into sgroup->queued until the subsystem becomes active;
  2. the subsystem is paused for a long time until the connect timeout,
     related qpair is destroyed, the sgroup->queued will not be cleaned
     because qpair's ctrlr is NULL;
  3. if a new request B comes, it is likely to be allocated at the
     same memory address as the previous fabric command request. It will
     then be put into sgroup->queued again, which already contains
     exactly the same pointer as request B.

This leads to a dangling pointer problem and causes an infinite
loop when traversing sgroup->queued!

So this patch avoids the dangling pointer problem by checking for and
removing all queued sgroup requests whose qpair is the one being
destroyed in _nvmf_qpair_destroy when ctrlr is NULL.

This problem is already described in issue #2133.

Signed-off-by: Peng Lian<peng.lian@smartx.com>
Change-Id: I909d673b5050f21fa193914cc4ffe6634232fa7d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17147
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-03-22 10:11:30 +00:00
Tomasz Zawadzki
f6866117ac freebsd: return negated error from getaddrinfo()
On FreeBSD getaddrinfo() reports positive error code
values, whereas Linux reports negative ones.

Make sure that regardless of the system used,
error codes with same sign are reported.
This can be observed in the log reported in #2936.
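
One way to get a consistent sign, sketched with a hypothetical helper and assuming the negative convention is the one wanted (this is not the code from the patch):

```c
/* Hypothetical helper: report getaddrinfo() failures as negative values,
 * whichever sign the platform's EAI_* constants happen to use. */
static int
example_normalize_gai_error(int rc)
{
	/* Positive codes are negated; zero and negative codes pass through. */
	return rc > 0 ? -rc : rc;
}
```

Callers can then test for failure with a single `rc < 0` check on both systems.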

Besides the above, in some instances replaced EINVAL
with the actual return value.

Change-Id: I7f88c314bdf5c3a03f8661c2213e33b2fc276ef7
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17097
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-03-10 16:44:37 +00:00
Swapnil Ingle
1afb1effc4 nvmf/vfio_user: simplify cq_is_full()
Make cq_is_full() a wrapper around cq_free_slots().

Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I392f62e959c7e23b4360e77759027ea55c2398b9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16789
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2023-03-08 08:45:00 +00:00
Swapnil Ingle
23b518a013 nvmf/vfio_user: mitigate cq full race
Linux host nvme driver processes all pending cqe's in one batch along with
completing backing blk_mq req's and later rings cq_doorbell once for all
processed cqes.
As blk_mq req's are completed there is room for more submissions
before ringing cq_doorbell.

This may race with vfio_user cq_is_full() which uses cq_doorbell to make final
decision and as host has not updated cq_doorbell we fail with cq_full error.

To mitigate this, only process commands from the sq for which a free cq slot is available.
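
The idea can be sketched with generic ring arithmetic. The helper names, the one-slot-reserved convention and the parameter layout are assumptions for illustration; they are not taken from the vfio_user source:

```c
#include <stdint.h>

/* Hypothetical ring math. tail is the producer index, head is the last
 * doorbell value written by the host. Requires size > 0 and both indices
 * below size. One slot stays unused so that full and empty differ. */
static uint32_t
example_cq_free_slots(uint32_t size, uint32_t tail, uint32_t head)
{
	uint32_t used = (tail + size - head) % size;

	return size - 1 - used;
}

/* Limit the number of sq commands handled in one pass to the free cq slots. */
static uint32_t
example_sq_cmds_to_process(uint32_t pending_sq_cmds, uint32_t free_cq_slots)
{
	return pending_sq_cmds < free_cq_slots ? pending_sq_cmds : free_cq_slots;
}
```

Because the count is bounded before any command is started, a stale doorbell value only delays work until the host updates it, instead of causing a cq full error at completion time.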

Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0cefb41df8099eb71de25923d05a9fcb28e4d124
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16788
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2023-03-08 08:45:00 +00:00
Rui Chang
8613654074 bdev: Add default copy command support in bdev
Add default copy command support in the bdev layer for backing devices that
do not support the copy command.

Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: I5632e25544e95ac0c53ff91c4cd135dac53323ae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16638
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-03-07 11:52:45 +00:00
sijie.sun
549be9ad81 nvmf/rdma: Recreate resources and listeners after IB device is hotplugged
IB device may be unplugged & hotplugged when modifying slaves of bonding
IB devices. This patch will try to recreate ibv device contexts, poller
and listeners after IB devices come back.

Signed-off-by: sijie.sun <sijie.sun@smartx.com>
Change-Id: I3288174bad847edc2d9859cb34aa93c6af8c673b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15616
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-03-07 11:50:05 +00:00
sijie.sun
8ddc5cd4a7 nvmf/rdma: Destroy all related resources after IB device removed
When IBV_EVENT_DEVICE_FATAL & RDMA_CM_EVENT_DEVICE_REMOVAL occurs,
destroy all userspace resources such as qp, poller and ibv_context.

Signed-off-by: sijie.sun <sijie.sun@smartx.com>
Change-Id: Ie4832e4804eb572d6ec3bdc44fb7f9339f443d7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15615
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-03-07 11:50:05 +00:00
Jim Harris
584d295245 nvmf/fc: fix memleaks
Submitted by @udayawati via GitHub comment on
issue #2872.

Fixes issue #2872.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id662fc0178f6112dfe791733bda43f634107403f

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
2023-03-06 13:21:38 +00:00
Jacek Kalwas
019cbb9335 nvmf: disable data buf mempool cache
Depending on the number of cores there are sporadic issues getting
elements of that pool although free elements are there during poll
group creation. Operation returns -ENOBUF. It results in odd notice
msg.

"nvmf_transport_poll_group_create: *NOTICE*: Unable to reserve the
full number of buffers for the pg buffer cache. Decrease the number of
cached buffers from 455 to 1366"

In this case 1366 is the actual number of available elements in the
pool. A few poll groups succeed and a few end up with the buffer
cache size set to 0.

The issue has been root-caused as a bug or behaviour change in DPDK v22.01.

Consider example:
We create DPDK mempool with 4K buffers, cache of 256. When first poll
group requests 512 buffers, DPDK mempool first looks in its per-core
cache, sees no buffers (mempool buffer cache doesn't get prepopulated)
and then requests 512 + 256 buffers from the backing pool. It returns
512 of the buffers to the user, and puts the other 256 buffers in the
cache ...it should only request 512 buffers total. For 8 cores and 512
buffers requested only 5 cores will get their buffers.
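
The arithmetic of this example can be checked with a small model. It is an illustrative simulation only, not DPDK code, and it assumes "4K" means 4096 buffers:

```c
#include <stdint.h>

/* Illustrative model: the first request from each core drains
 * request + cache buffers from the backing pool, as described above. */
static uint32_t
example_cores_served(uint32_t pool_size, uint32_t request, uint32_t cache,
		     uint32_t num_cores)
{
	uint32_t served = 0;
	uint32_t i;

	for (i = 0; i < num_cores; i++) {
		uint32_t drained = request + cache;

		if (pool_size < drained) {
			/* Models the allocation failure for this core. */
			continue;
		}
		pool_size -= drained;
		served++;
	}
	return served;
}
```

With a pool of 4096, requests of 512 and a cache of 256, each successful core drains 768 buffers, so five cores succeed (3840 buffers) and the remaining 256 buffers cannot satisfy a sixth. With the cache set to 0, all eight cores get their 512 buffers.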

Disabling the mempool cache seems to work around the issue. A more effective
cache is already implemented in the nvmf generic layer.

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I3149dea95a4f24a75dd0074eda9468c4856d901d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16913
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2023-02-28 08:57:23 +00:00
Jacek Kalwas
96073478de nvmf: introduce async transport create
An example of async operation which can be handled on specific
transport layer could be creation of spdk thread followed by
a poller registration.

This change also aligns with transport destroy which is already
async operation.

Current transport create function is marked deprecated and is meant
for transports supporting sync create only to maintain backward
compatibility. Async version supports both create operations.

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I1f5a477819e58f30983d26f81a1416bed1279ecf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16463
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-02-16 16:45:08 +00:00
Ankit Kumar
7bbeb80a31 nvme: support 64 LBA formats for NVM and ZNS command set
Format LBA size (FLBAS) is updated to have:
Bit 3:0 as least significant 4 bits for format index
Bit 6:5 as most significant 2 bits for format index

NVMe format command fields are updated accordingly.

Add a new helper function to fetch the correct format index.
Update examples and unit test files accordingly.
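
A minimal sketch of extracting the format index from the FLBAS layout described above (the function name is hypothetical and is not the helper added by this patch):

```c
#include <stdint.h>

/* Hypothetical helper: combine FLBAS bits 3:0 (low part) and bits 6:5
 * (high part) into a 6-bit LBA format index in the range 0..63. */
static uint8_t
example_flbas_format_index(uint8_t flbas)
{
	uint8_t low = flbas & 0x0F;          /* bits 3:0 */
	uint8_t high = (flbas >> 5) & 0x03;  /* bits 6:5 */

	return (uint8_t)((high << 4) | low);
}
```

Bit 4 is not part of the index, so an FLBAS value of 0x10 still yields index 0, while 0x2F yields index 31 and 0x6F yields index 63.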

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Change-Id: I2d6d9045b9d65ae91cb18843ca75b59cc27ed2f2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16515
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-02-15 10:37:56 +00:00
John Levon
11e67d93ff lib/nvmf: sanity check req->iovcnt
After all the previous changes, if req->data is set then req->iovcnt
should also be greater than zero.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I29b5f45541c9dba2dd896109dd43d2b5321ec467
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16274
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-02-13 13:50:51 +00:00
John Levon
70a82d9a95 nvmf: add spdk_nvmf_request_copy_*_buf()
Also deprecate the existing spdk_nvmf_request_data() API, which is
incompatible with iovecs.
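
To illustrate why a single data pointer does not fit iovecs, here is a
minimal sketch of copying a flat buffer into a scatter list. The function
copy_buf_to_iovs is hypothetical and is not one of the
spdk_nvmf_request_copy_*_buf() functions; it only assumes the standard
POSIX struct iovec.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: copy up to buflen bytes from buf into the iovec
 * array, filling each element in order. Returns the number of bytes copied,
 * which is less than buflen if the iovecs are too small.
 */
size_t copy_buf_to_iovs(struct iovec *iov, int iovcnt,
			const void *buf, size_t buflen)
{
	const char *src = buf;
	size_t copied = 0;

	for (int i = 0; i < iovcnt && copied < buflen; i++) {
		size_t len = iov[i].iov_len;

		if (len > buflen - copied) {
			len = buflen - copied;
		}
		memcpy(iov[i].iov_base, src + copied, len);
		copied += len;
	}

	return copied;
}
```

A caller that only looks at the first element's base pointer would see just
the first fragment of the payload, which is the incompatibility noted above.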

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I44df8ff30a431873a0c2f34b0cdb58df858fd7e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16200
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-02-13 13:50:51 +00:00
John Levon
cc3184b8b4 nvmf: handle iovecs in reservation handling
Use req->iov instead of req->data in reservation handling code.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I6d79711d03f45bd5e118c6324d22decad887a788
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16199
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-02-13 13:50:51 +00:00
Konrad Sztyber
7db282dc26 tcp: add note about default case in qpair_abort_request()
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I2f3741596be2f06b36894306203214a4ef096d1a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16694
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-02-13 13:50:15 +00:00
Konrad Sztyber
25b0c20c0a tcp: remove abort handling for reqs in ZCOPY_START_COMPLETED
This never happens, as requests in this state are always immediately
transitioned to other states.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0408ed9d8003d364bc38c86a9a50312721ab1284
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16642
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-02-13 13:50:15 +00:00
Konrad Sztyber
b12419a231 tcp: don't abort requests waiting for R2T ACK
It is possible for requests waiting for R2T ACK to receive H2C PDU
before receiving the ACK.  Therefore, the following sequence:

1. Host sends a write request to the target.
2. Target sends R2T PDU to the host and sets request's state to
   AWAITING_R2T_ACK.
3. Host sends H2C PDU to the target, but it doesn't reach the target
   yet.
4. Host sends an abort command to abort that request.  Request's state
   is changed to READY_TO_COMPLETE.
5. Target receives the H2C PDU, sees that request's state is
   READY_TO_COMPLETE, which is unexpected, and terminates the
   connection.

will cause the target to terminate the connection, which is obviously
incorrect.

So, to avoid that, we can treat AWAITING_R2T_ACK state in the same way
as TRANSFERRING_HOST_TO_CONTROLLER and register a poller waiting for the
state to be changed.
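
A minimal sketch of that decision, with hypothetical names (enum req_state,
abort_must_wait) that are not the SPDK TCP transport's own identifiers. Only
the two states discussed here are modelled; every other state is collapsed
into a placeholder and is out of scope.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of request states. */
enum req_state {
	REQ_AWAITING_R2T_ACK,
	REQ_TRANSFERRING_HOST_TO_CONTROLLER,
	REQ_STATE_OTHER, /* placeholder for states not covered by this sketch */
};

/*
 * Returns true when an abort must be deferred: data from the host may still
 * be in flight, so the caller should poll until the state changes instead of
 * moving the request to READY_TO_COMPLETE right away.
 */
bool abort_must_wait(enum req_state state)
{
	switch (state) {
	case REQ_AWAITING_R2T_ACK:
	case REQ_TRANSFERRING_HOST_TO_CONTROLLER:
		return true;
	default:
		return false;
	}
}
```

Deferring in both states means an H2C PDU that arrives after the abort
command still finds the request in a state that expects it.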

Fixes #2789.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Idddc627050000b74663dba397dc14d10aa0e284f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16641
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-02-13 13:50:15 +00:00
plestk
96dca1676b nvmf: Fix new line at the end of log message
Signed-off-by: plestk <plestringant@kalray.eu>
Change-Id: I24c59d0d5b7a889e03d77a40b14ac95f6fe42afe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16102
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-01-30 16:29:49 +00:00
John Levon
6b206e3110 nvmf: sanity check passthru handlers
These routines can only handle a single buffer; double check that is the
case, and fail if not.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I136482c27c73655887c49405f747b8ed073f7b69
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16198
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-30 16:28:35 +00:00
John Levon
e1413e9197 nvmf/rdma: use req->iov consistently
Use req->iov as needed, to make it easier to remove req->data later.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ie625f374e846f7e6afd6a5d143a5174d27d419b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16256
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-30 16:28:35 +00:00
John Levon
fd05a2ff47 nvmf/tcp: use req->iov consistently
Use req->iov as needed, to make it easier to remove req->data later.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I4095e3c4089b730db123705d0168cd409375cc43
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16196
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-30 16:28:35 +00:00
John Levon
c0ddb423e0 nvmf/vfio-user: use req->iov consistently
Use req->iov as needed, to make it easier to remove req->data later.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Id23c4ef8018d6a7aad42c3d5054fa9addcf16f0a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16195
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-30 16:28:35 +00:00
John Levon
49c0d28ab1 nvmf: drop req->data usage in ctrlr.c
Refer to req->iov instead of req->data. As the queue connection code
already presumes a single data buffer, add some sanity checking for
this.

We also need to fix vfio_user.c as a result to correctly set ->iovcnt.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ib1e4ef3885200ffc5194f00b4e3fe20ab1934fd7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16194
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-30 16:28:35 +00:00