Commit Graph

10043 Commits

Author SHA1 Message Date
Tomasz Zawadzki
2f8a723edf ublk: use count in ublk_io_xmit() in release builds
count was unused in release builds. Otherwise following
error is produced on clang build:

19:30:56  ublk.c:974:14: error: variable 'count' set but not used
[-Werror,-Wunused-but-set-variable]
19:30:56          int rc = 0, count = 0, tag;
19:30:56                      ^
19:30:56  1 error generated.

Change-Id: If7ca88de37ed6e40826e09b055355c07f67c8869
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16450
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
2023-01-25 10:01:03 +00:00
Shuhei Matsumoto
bbd3d96b85 nvme_rdma: Ignore response if its QP was already destroyed
This is a workaround but is necessary to fix the github issue #2874.
Due to some unknown reason, in nightly test with Intel e810 NICs
when a qpair is created with synchronous mode and connection errors
are detected, the qpair is destroyed even if requests for the qpair are
still inflight. Then, nvme_rdma_process_recv_completion() causes NULL
pointer acccess. To fix this NULL pointer access, change
nvme_rdma_process_recv_completion() to return immediately if rsp->rqpair
is NULL. Add a TODO comment to find a root cause and really fix the
issue.

One of the fixes for the issue #2874.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic810922f7ea1b32373b15f4e0cf7c2429659cbab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16431
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2023-01-25 10:01:03 +00:00
Shuhei Matsumoto
9aabfb59d9 nvme_rdma: Fix null pointer access and memory leaks for rqpair->reqs and rsps
Supporting SRQ caused two kinds of memory leaks. Fix both in this patch.

1. rqpair->rsps was leaked and null pointer access occurred

An error was detected during the nightly nvmf_delete_subsystem test.
The NVMe perf tool crashed with SIGABRT.

The reason of the crash was

nvme_rdma.c:2504:2: runtime error: member access within null pointer of type 'struct nvme_rdma_rsps'

This was caused by clearing rqpair->rsps before freeing rqpair->rsps.
rqpair->rsps should have been held until rqpair->rsps is freed. However,
when we support SRQ, rqpair->rsps was cleared when releasing rqpair->poller
by mistake. rqpair->rsps should be cleared only if SRQ is enabled because
in this case rqpair uses rsps of rqpair->poller.

2. rqpair->reqs and rsps are leaked for admin qpair at controller reset

To avoid unnecessary alloc and free for rqpair->rsps when enabling SRQ,
nvme_rdma_create_reqs() and nvme_rdma_create_rsps() were moved to
nvme_rdma_connect_established().

On the other hand, nvme_rdma_free_reqs() and nvme_rdma_free_rsps() were
called by nvme_rdma_ctrlr_delete_io_qpair().

However, at controller reset, admin qpair was just disconnected and
reconnected. In this case, nvme_rdma_create_reqs() and
nvme_rdma_create_rsps() were called again without calling
nvme_rdma_free_reqs() and nvme_rdma_free_rsps().

Hence, memory leak occurred.

To fix the memory leak, move nvme_rdma_free_reqs() and nvme_rdma_free_rsps()
from nvme_rdma_ctrlr_delete_io_qpair() to nvme_rdma_qpair_destroy().

One of the fixes fot the issue #2874

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I167ba908cff73d7a0be2248affce4c54f233da51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16384
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-25 10:01:03 +00:00
John Levon
f77387cce8 nvmf/transport: fix spdk_nvmf_request_free_buffers()
This routine was neglecting to reset ->iovcnt, leading to havoc when the
request was re-used.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ifd4ac47b95edd517ce5df731c682697bf51da819
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16273
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-24 18:18:49 +00:00
Ben Walker
b11407d04c nvmf/tcp: Use an spdk_sock_flush instead of an spdk_sock_write
For IC_RESP and C2H_TERM_REQ, use the regular async write path but add
an additional flush. The flush operation reports errors, so we may at
some point attempt to handle busy conditions by blocking. However, for
now this doesn't do that because previously the writev call didn't
either.

Change-Id: I4d05e19ac6dd781be7c96005549abdb52511b8c1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15213
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-24 18:18:33 +00:00
Ben Walker
ac9dbf7912 sock: Allow flushing even if the socket is in a poll group
If we call flush, we want to flush regardless of whether there is a poll
group.

Change-Id: I88680105d999a909f3f1fe75be9caff31a8555ff
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16420
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-24 18:18:33 +00:00
Jim Harris
c80a57cad3 ublk: fix device queue shutdown processing
When a queue has finished processing on its polling
thread, it sends a message to the app thread signaling
that it is done.  Then when the app thread gets
messages from all of the queues for that device, it can
proceed with tearing the device down.

But if there are still ctrl_ring commands in progress,
it needs to wait.  Previously it would register a
poller that would retry the same function if it
found commands in progress.  But the problem is that
it did not differentiate the function getting called
as a direct message from the polling thread vs. retried
via the poller on the app thread.  This could result
in lost messages.

So fix it to always increment the queues_closed
counter (renamed from q_deinit_num), and then
only check for ctrl ring commands in progress after
we received all of the queue closed messages.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0ea23ebc69acb29d5ab7e1d86ddbe74b9973e225
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16405
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
2023-01-24 17:09:34 +00:00
Jim Harris
57c27f02d5 ublk: remove extra pthreads from ctrl uring processing
We make a few changes here to enable this:

1) Set IORING_SETUP_SQPOLL on the control ring.
   Otherwise when UBLK_START_DEV is submitted it
   will be processed in the context of the system
   call itself, resulting the kernel block layer
   submitting reads to the new device which blocks
   the thread - meaning the system call may never
   return.
2) Save the cmd_op in each sqe, along with the
   spdk_ublk_dev pointer.
3) Add a poller to poll the ctrl ring.  The poller
   can get the ublk and cmd_op from the cqe to
   know which ctrl operation completed and take
   next steps as necessary.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia0e51a4ff74781c85967c54969fbfc67a0d3f115
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16404
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-24 17:09:34 +00:00
Kamuda, Szymon
cb2f0a2cf5 nvmf: pause/resume polling for the target
There is a way to pause/resume spdk pollers, however there is no way
to achieve that using public API for the given target which has
a hook behaving similar to pollers. Exposing such functionality can
be used for pausing and restoring target pollers during
reset, e.g. new commands should not be fetched to assure
that all internal resources can be cleared/reinitialized safety.
Pausing target poller during the reset will assure that, without
need for destroying transport or adding condition statements in IO path.

Similar use case might be hitless upgrade. Depending on implementation
there might be need that no new command can be submitted when
secondary processes are being switched to upgraded versions.
Pausing target pollers should be useful in this case.

Signed-off-by: Kamuda Szymon <szymon.kamuda@intel.com>
Change-Id: I419816552c710c43e02197ebcc20a967fb23b3bd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15911
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-24 14:49:24 +00:00
Tomasz Zawadzki
3359bf34d6 so_ver: increase all major versions
To allow SO_MINOR updates on LTS for the whole year it is supported,
the major version for all components needs to be increased.
This is to prevent scenario where two versions exists with matching
versions, but conflicting ABI.
Ex. Next SPDK release adds an API call increasing the minor version,
then LTS needs just a subset of those additions.

Increasing major so version after LTS, allows the future releases
to update versions as needed. Yet allowing LTS to increase minor
version separately.

Disabled test for increasing SO version without ABI change, as
that is goal of this patch. This check shall be removed with SPDK 23.05
release.
Looks like this was left over from prior LTS, to avoid that
make sure it is only skipped when running against v23.01.x as latest
release.

This patch:
- increases SO_VER by 1 for all components
- resets SO_MINOR to 0 for all components
- removes suppressions for ABI tests


Short reference to how the versions were changed:
MAX=$(git grep "SO_VER := " | cut -d" " -f 3 | sort -ubnr | head -1)
for((i=$MAX;i>0;i-=1)); do find . -name "Makefile" -exec \
	sed -i -e "s/SO_VER := $i\$/SO_VER := $(($i+1))/g" {} +;  done
find . -name "Makefile" -exec \
	sed -i -e "s/SO_MINOR := .*/SO_MINOR := 0/g" {} +

Change-Id: I3e5681802c0a5ac6d7d652a18896997cd07cc8bf
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16419
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-24 08:37:21 +00:00
Shuhei Matsumoto
139940d6c1 bdev/rpc: Fix race condition for per channel bdev_get_iostat
With per channel stats called on thread A,
spdk_bdev_for_each_channel calls
spdk_for_each_channel which immediately sends
a message to thread B.
If thread B has no workload, it may execute the
message relatively fast trying to write stats to
json_write_ctx.
As result, we may have 2 scenarious:
1. json_write_ctx is still not initialized on
thread A, so thread B dereferences a NULL pointer.
1. json_write_ctx is initialized but thread A writes
response header while thread B writes stats - it leads
to corrupted json response.

To fix this race condition, initialize json_write_ctx
before iterating bdevs/channels

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Reported-by: Or Gerlitz <ogerlitz@nvidia.com>
Change-Id: I5dae37f1f527437528fc8a8e9c6066f69687dec9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16366
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-23 17:48:55 +00:00
Shuhei Matsumoto
db3869d1b2 bdev/rpc: Factor out start RPC response of bdev_get_iostat
This is a preparation for the next patch to fix the race condition of
per channel mode.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I9eaefc527ccf82011af39b8261f5b3cc12983bda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16365
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-23 17:48:55 +00:00
Jim Harris
69d4ec0708 ublk: break ublk_start_disk into two parts
For now, we will still execute the two parts
consecutively and synchronously.  Follow-up patches
will do the second part asynchronously, after the
ublk cmds associated with the first part have
completed.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I814d885a8a113c3367207d11ae09dd536eb63460
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16403
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-23 17:48:10 +00:00
Jim Harris
5f1b5f18ae ublk: change where new dev's ublk file gets opened
Upcoming patches will submit ctrl cmds and wait for
them to complete asynchronously.  So we will want to
first send the ADD_DEV and SET_PARAMS commands, wait
for them to complete, and only then open the ublk
device file.

So to prepare for that sequencing, move the open()
from _ublk_start_disk to ublk_start_disk.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9fdf19ce9b51bd552faa917e1e842f9ddfb111a1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16402
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-23 17:48:10 +00:00
Jim Harris
f08743e9eb ublk: pass ublk pointer to ublk_ctrl_cmd()
We will put this pointer into the sqe.  It will
be useful when we start doing async completions
on the ctrl ring.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3bdb728eb1d3ed66a8ecd05df208e4f36e3fbe0b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16401
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-23 17:48:10 +00:00
Tomasz Zawadzki
dfa7f14bc4 blobstore: mark blobstore dirty on extent page write
When blobstore is shutdown unexpectedly the super block should
already be marked as dirty. Only when proper blobstore unload
happens, the super block is marked as clean.

Super block is not marked as dirty on blobstore load,
but on first action that starts to modify the metadata.
At this time it only happens through blob persist,
which is fine for creation/deletion of blobs or
their modification (resize/xattr).

It works for cluster allocation with extent_table disabled,
and when extent page needs to be allocated.

Yet it fails for cases when no new extent page is required.
It will result in not marking blobstore as dirty and
then fail when loading a particular blob due to mismatch
between used_clusters and contents of extent page.

To fix that, the blobstore is now marked dirty on a very first
extent page update since blobstore load.

Fixes #2830

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ied37ecf90d46e1bc51b22c323dce278a0fa88f72
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16179
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-20 18:58:34 +00:00
Krzysztof Karas
87c59b28a3 virtio_blk: add dump opts
Currently we do not have a way to dump opts
for virtio_blk transports. This patch introduces
necessary changes to let us save and load those
via JOSN config.

Change-Id: I7ee4f31062f3d4a264f322e66a67ba3d075f1d75
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15248
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-20 18:57:38 +00:00
Jim Harris
b9f7ba0d09 ublk: fix unused variable warning
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5b8ce0a4571872e6755c5fa0abbfa1a981dd411f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16400
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-20 18:47:09 +00:00
Tomasz Zawadzki
acdfc03590 blobstore: make blob_persist_check_dirty separate from persist
This patch makes marking blobstore as dirty, separate from
persist process. Adding a context strucutre and new single
callback once completed.

It is being as part of refactor to allow marking blobstore
dirty when writing out extent page - see #2830.

Change-Id: Ie2e9cc32860697e0e747939842ab04f48fbff49b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16328
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-20 17:56:56 +00:00
Tomasz Zawadzki
3eb03cc7b6 blobstore: workaround for bs_cluster_to_lba() scan-build false positive
Patch https://review.spdk.io/gerrit/c/spdk/spdk/+/16328,
introduced a refactor for persist path. No functional change
should occur with it, but the code layout after compilation
(path length) might have changed.

It resulted in unrelated scan-build failure:
https://ci.spdk.io/results/autotest-per-patch/builds/95746/archive/scanbuild-vg-autotest/scan-build/report-d08e76.html#EndPath

Tried to replicate the issue without the above patch,
by increasing maxloop or -analyze-headers in scan-build.
Didn't result in any new failures in blobstore.

This seemed like a false positive, so it was verified:
https://review.spdk.io/gerrit/c/spdk/spdk/+/16333/

With no other options, an assert is added only to the
function where the false positive occured.

scan-build log for posterity:
blobstore.c:1062:58: warning: Division by zero [core.DivideZero]
                desc_extent_rle->extents[extent_idx].cluster_idx = lba /
lba_per_cluster;
                                                                   ~~~~^~~~~~~~~~~~~~~~~
blobstore.c:1079:58: warning: Division by zero [core.DivideZero]
                desc_extent_rle->extents[extent_idx].cluster_idx = lba /
lba_per_cluster;
                                                                   ~~~~^~~~~~~~~~~~~~~~~

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I9dd729fa13ce1c9bbcb91e4326658e2b4e326e6d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16335
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-20 17:56:56 +00:00
Konrad Sztyber
01608f1af2 accel: temporarily disable iobuf thread caches
The iobuf buffers are only used in accel when executing chained
operations.  Since none of the components in SPDK are using chaining
yet, there's little point in having per-thread iobuf caches, as they
only reduce the number of available buffers in other libraries.

This change will be reverted once bdev layer and bdev modules are
updated to support chaining.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibad19ea92f2218a8dec01e802a736cfdd357dfc6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16398
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2023-01-20 17:56:44 +00:00
Alexey Marchuk
8c8cd12e1d env_dpdk: Add function to iterate memory chunks in a pool
This allows to get start address and length of each
memory chunk in order to create app-specific
resources.
Since we don't want to expose rte structure in the
callback, we have to remap rte data types to SPDK.

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I3865c4cfe532c6a99a5a3c6c983ded8b9a338de1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-20 17:56:34 +00:00
paul luse
d6d0ef6c16 lib/accel: remove 2 lines of dead code
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I3c4b04109a4c2f8319408f2ba5b805db05626535
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16051
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-20 10:26:21 +00:00
paul luse
bb5083a85d bdev/compress: Port to use accel_fw instead of compressdev
directly

This patch removes hardcoded compressdev code from the
vbdev module and instead uses the accel_fw. The port required
a few changes based on how things are plumbed and accessed,
nothing that isn't be too obscure.  CI tests were updated to
run ISAL accel_fw module as well as DPDK compressdev with QAT.

Unit tests for the new module will follow in a separate patch.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I769cbc888658fb846d89f6f0bfeeb1a2a820767e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13610
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2023-01-20 10:26:21 +00:00
Yifan Bian
90a6407df6 ublk: add an rpc method to get current ublk devices
Signed-off-by: Yifan Bian <yifan.bian@intel.com>
Co-authored-by: Xiaodong Liu <xiaodong.liu@intel.com>
Change-Id: I3fdf9795b90d7a30478ba81d8144bbf2f1cbdd2a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15987
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2023-01-20 07:48:25 +00:00
Yifan Bian
e8a94a7122 ublk: add ublk to export block device
ublk could export a backend device as ublk block device (/dev/ublkb*).
A rpc method is used to add ublk device and it should be done
after creating ublk target. Corresponding, ublk_del_dev is
used to delete the specified ublk device.

Signed-off-by: Yifan Bian <yifan.bian@intel.com>
Co-authored-by: Xiaodong Liu <xiaodong.liu@intel.com>
Change-Id: I3a4ba8d8dc5f5ad241511ccbc9d3336b582a6dc5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15976
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2023-01-20 07:48:25 +00:00
Yifan Bian
a1944e0170 ublk: add ublk target creation and destruction
Add rpc methond for ublk target creation and destruction. Before to
add ublk device, need to initialize ublk target to create ublk
threads, corresponding an rpc methond to destroy ublk target is
also added. It will deinitialize ublk target and release all ublk
devices.

Signed-off-by: Yifan Bian <yifan.bian@intel.com>
Co-authored-by: Xiaodong Liu <xiaodong.liu@intel.com>
Change-Id: I5db0cf9cc68745440df999169aa1c61111010e02
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15962
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
2023-01-20 07:48:25 +00:00
Yifan Bian
ed2b53f389 ublk: add configure and event/subsystem
ublk backend could support ublk driver with kernel. Specify
configuration parameter to start it up.

Signed-off-by: Yifan Bian <yifan.bian@intel.com>
Co-authored-by: Xiaodong Liu <xiaodong.liu@intel.com>
Change-Id: I55e7d757e04315b25e9bfab5fdcbb6621be3e29e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15680
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-20 07:48:25 +00:00
Alexey Marchuk
a1dfa7ec92 module/accel: Add mlx5 accel module
The mlx5 accel module supports crypto operations.
Data buffer is split into `block_size` chunks and each
chunk is enrypted individually.
mlx5 library contains some utility functions that will
later be used by other libraries, this lib will be
exntended later.

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: Iacdd8caaade477277d5a95cfd53e9910e280a73b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15420
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-19 22:00:58 +00:00
Alexey Marchuk
13f97e6737 bdev/crypto: Use accel framework
All DPDK related code is removed, handling of
RESET command was sligthly updated.
Handling of -ENOMEM was updated for cases when
accel API returns -ENOMEM

Crypto tests in blockdev.sh were extended with more
crypto_bdevs to verify NOMEM cases - that failed
with original vbdev_crypto implementation

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: If1feba2449bee852c6c4daca4b3406414db6fded
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14860
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-19 22:00:58 +00:00
Richael Zhuang
567d6b535b bdev: rename values of enum spdk_bdev_reset_stat_mode
Add the prefix "SPDK_" to values of enum spdk_bdev_reset_stat_mode
for it's public.

Change-Id: If0e2a84849048ca03b5945f6155b9719f00254b4
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16343
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-19 17:33:24 +00:00
Jim Harris
8e002e770b ftl: fix FTL_LOG_COMMON to avoid annoying gcc-12 warnings
gcc-12 is really tricky.  It detected that in
ftl_nv_cache_load_state() that when we do an FTL_NOTICELOG,
the dev and nv_cache values are associated with each
other by the SPDK_CONTAINEROF() operation.

So then in FTL_LOG_COMMON, it checks if dev is NULL.  If it
is, it doesn't print the dev->conf.name, but still prints
the varargs which include nv_cache members.  But if dev is
NULL then these nv_cache members wouldn't be valid either,
and that's what gcc-12 is complaining about, in a very
unclear way.

So now we just have FTL_LOG_COMMON contain a single line, with
a tertiary operator to print either dev->conf.name or "N/A"
depending on whether dev is NULL or not.  I suspect this
fixes it because we've replaced the if statement with
a tertiary operator that is independent from the VA_ARGS.

Fixes issue #2829 (partially).

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia56e2c7fb7966e7a5ceff35b36b0346b556ce7e7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16342
Reviewed-by: <sebastian.brzezinka@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-19 11:22:46 +00:00
paul luse
91f3063b14 lib/accel: add output_size to decompress API
We had it for compress but simply didn't think of a use case for
decompress.  During the develpoment of the compressdev accel_fw
module it was discovered that compressdev does indeed provide the
uncompressed length on completion of decompress and the reducelib
uses it.  So, add it here.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I2f6a8bbbe3ef8ebe0b50d6434845f405afa7d37d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16035
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-19 11:16:01 +00:00
paul luse
976f8b0992 module/accel: Add compressDev accel_module
This is the port of the vbdev compress logic into the accel
framework.  It includes just one enhancement, to only fill each
mbuf in either src or dst array with max "window size" param to
avoid QAT errors. Note that DPDK ISAL PMD was not ported as we
have native ISAL compression in accel now.

Note: ISAL w/DPDK is still built w/this patch, that can't be
removed until the vbdev module moves to accel fw as it still
depends on DPDK ISAL PMD.

Follow-on patches will include addition C API for PMD selection,
this patch just gets equivalent functionality going.  Upcoming
patches will also convert the vbdev compress module to use the
accel framework instead of talking directly to compressdev.

More patches will also address comments on vbdev common code
that addressed here would make the review challenging.

This patch also fixes a bug in the ported code that needs to
be fixed here to pass CI.  Capability discovery was incorrect
causing all devices to appear to not support chained mbufs,
with the mbuf splitting code this is important to get right.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I7f526404819b145ef26e40877122ba80a02fcf51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15178
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-19 11:16:01 +00:00
Krystyna Szybalska
537929e1d4 virtio_blk: added virtio_blk_get_transports RPC
This patch adds virtio_blk_get_transports matching the
virtio_blk_create_transport RPC. Allowing for querying
existing virtio_blk transports and displaying their options.

Signed-off-by: Krystyna Szybalska <krystyna.szybalska@gmail.com>
Change-Id: I0ec49c5f2ad11962feb5087dd376407ad125c349
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16303
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-19 10:30:26 +00:00
Richael Zhuang
8ddc102a31 bdev: add public APIs for IO statictics processing
Export functions bdev_reset_io_stat(), bdev_add_io_stat() and
bdev_dump_io_stat_json() as public APIs.

Change-Id: Ibd0bcf44f2967d79d1ceb9e183c08579410061db
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16065
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-01-19 01:57:11 +00:00
Sebastian Brzezinka
a8d21b9b55 lib/ioat: initialized ‘status’ variable before use
This patch fix `gcc-12` warning: #2829

```
warning: ‘status’ may be used uninitialized [-Wmaybe-uninitialized]
ioat.c:360:18: note: ‘status’ was declared here
  360 |         uint64_t status;
      |                  ^~~~~~
```

Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I8d010256e51cf6f1b9047d054773cb85d435ccf9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16339
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-18 21:27:01 +00:00
GangCao
687d5a8766 lib/part: check the return of spdk_bdev_register
Change-Id: I855a68dfcf6da565a97e33e4389eee5ed6141f74
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16079
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-18 15:15:02 +00:00
Shuhei Matsumoto
bcd987ea2d nvme_rdma: Support SRQ for I/O qpairs
Support SRQ in RDMA transport of NVMe-oF initiator.

Add a new spdk_nvme_transport_opts structure and add rdma_srq_size
to the spdk_nvme_transport_opts structure.

For the user of the NVMe driver, provide two public APIs,
spdk_nvme_transport_get_opts() and spdk_nvme_transport_set_opts().

In the NVMe driver, the instance of spdk_nvme_transport_opts,
g_spdk_nvme_transport_opts, is accessible throughtout.

From an issue that async event handling caused conflicts between
initiator and target, the NVMe-oF RDMA initiator does not handle
the LAST_WQE_REACHED event. Hence, it may geta WC for a already
destroyed QP. To clarify this, add a comment in the source code.

The following is a result of a small performance evaluation using
SPDK NVMe perf tool. Even for queue_depth=1, overhead was less than 1%.
Eventually, we may be able to enable SRQ by default for NVMe-oF
initiator.

1.1 randwrite, qd=1, srq=enabled
./build/examples/perf -q 1 -s 1024 -w randwrite -t 30 -c 0XF -o 4096 -r
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  162411.97     634.42       6.14       5.42     284.07
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  163095.87     637.09       6.12       5.41     423.95
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  164725.30     643.46       6.06       5.32     165.60
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  162548.57     634.96       6.14       5.39     227.24
========================================================
Total                                                                     :  652781.70    2549.93       6.12

1.2 randwrite, qd=1, srq=disabled
./build/examples/perf -q 1 -s 1024 -w randwrite -t 30 -c 0XF -o 4096 -r
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  163398.03     638.27       6.11       5.33     240.76
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  164632.47     643.10       6.06       5.29     125.22
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  164694.40     643.34       6.06       5.31     408.43
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  164007.13     640.65       6.08       5.33     170.10
========================================================
Total                                                                     :  656732.03    2565.36       6.08       5.29     408.43

2.1 randread, qd=1, srq=enabled
./build/examples/perf -q 1 -s 1024 -w randread -t 30 -c 0xF -o 4096 -r '
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  153514.40     599.67       6.50       5.97     277.22
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  153567.57     599.87       6.50       5.95     408.06
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  153590.33     599.96       6.50       5.88     134.74
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  153357.40     599.05       6.51       5.97     229.03
========================================================
Total                                                                     :  614029.70    2398.55       6.50       5.88     408.06

2.2 randread, qd=1, srq=disabled
./build/examples/perf -q 1 -s 1024 -w randread -t 30 -c 0XF -o 4096 -r '
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  154452.40     603.33       6.46       5.94     233.15
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  154711.67     604.34       6.45       5.91      25.55
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  154717.70     604.37       6.45       5.88     130.92
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  154713.77     604.35       6.45       5.91     128.19
========================================================
Total                                                                     :  618595.53    2416.39       6.45       5.88     233.15

3.1 randwrite, qd=32, srq=enabled
./build/examples/perf -q 32 -s 1024 -w randwrite -t 30 -c 0XF -o 4096 -r 'trtype:RDMA adrfam:IPv4 traddr:1.1.18.1 trsvcid:4420'
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  672608.17    2627.38      47.56      11.33     326.96
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  672386.20    2626.51      47.58      11.03     221.88
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  673343.70    2630.25      47.51       9.11     387.54
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  672799.10    2628.12      47.55      10.48     552.80
========================================================
Total                                                                     : 2691137.17   10512.25      47.55       9.11     552.80

3.2 randwrite, qd=32, srq=disabled
./build/examples/perf -q 32 -s 1024 -w randwrite -t 30 -c 0XF -o 4096 -r 'trtype:RDMA adrfam:IPv4 traddr:1.1.18.1 trsvcid:4420'
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  672647.53    2627.53      47.56      11.13     389.95
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  672756.50    2627.96      47.55       9.53     394.83
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  672464.63    2626.81      47.57       9.48     528.07
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  673250.73    2629.89      47.52       9.43     389.83
========================================================
Total                                                                     : 2691119.40   10512.19      47.55       9.43     528.07

4.1 randread, qd=32, srq=enabled
./build/examples/perf -q 32 -s 1024 -w randread -t 30 -c 0xF -o 4096 -r
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  677286.30    2645.65      47.23      12.29     335.90
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  677554.97    2646.70      47.22      20.39     196.21
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  677086.07    2644.87      47.25      19.17     386.26
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  677654.93    2647.09      47.21      18.92     181.05
========================================================
Total                                                                     : 2709582.27   10584.31      47.23      12.29     386.26

4.2 randread, qd=32, srq=disabled
./build/examples/perf -q 32 -s 1024 -w randread -t 30 -c 0XF -o 4096 -r
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  677432.60    2646.22      47.22      13.05     435.91
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  677450.43    2646.29      47.22      16.26     178.60
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  677647.10    2647.06      47.21      17.82     177.83
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  677047.33    2644.72      47.25      15.62     308.21
========================================================
Total                                                                     : 2709577.47   10584.29      47.23      13.05     435.91

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I843a5eda14e872bf6e2010e9f63b8e46d5bba691
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14174
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-17 23:53:01 +00:00
Shuhei Matsumoto
4999a9850c nvme_rdma: Move responses from rdma_qpair into a separate object
Move parallel arrays of response buffers and response SGLs from
qpair to a new responses object.

Use options to create the responses object.

Use spdk_zmalloc() to allocate the responses object because qpair
is also allocated by spdk_zmalloc().

The purpose is to share the code and the data structure between
SRQ is enabled and disabled.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ia23fe7328ae1f2f551fed5863fd1414f8567d602
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14172
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-17 23:53:01 +00:00
Konrad Sztyber
9cdbd9e4f3 accel: support appending encrypt/decrypt operations
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I7bbe90936ff11b50a7cca7b15eade2025daac83b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16292
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
3de19b0b55 accel: allow modules to report memory domain support
Accel modules can now implement the get_memory_domains() callback to
indicate the types of memory domains they support.  If unimplemented, a
module is assumed not to support memory domains and accel will take care
of pulling/pushing data to local buffers prior to passing a task to be
executed by a module.

For now, similarly to the bdev layer, we only check if a module supports
memory domains, but we don't verify the types of the domains.  That
could be easily added in the future, if necessary.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia513f4f31124672b705b6dd33a2624f0ae94d3ce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16027
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
a6fef9b194 accel: store in-use modules in an extra structure
It allows accel to store private data per each opcode/module without
having to change externally visible structures or allocate anything when
a module is registered. Since a single module can service multiple
opcodes at the same time, so some of these values might be duplicated.
However, there are only a handful of opcodes, so it shouldn't be a
problem.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I609a6ccc2d241cb9b8273cc2c6d1933d2bc25e0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
81fe7ef0af accel: push data if dstbuf is in remote memory domain
If the destination buffer is in remote memory domain, we'll now push the
temporary bounce buffer to that buffer after a task is executed.

This means that users can now build and execute sequence of operations
using buffers described by memory domains.  For now, it's assumed that
none of the accel modules support memory domains, so the code in the
generic accel layer will always allocate temporary bounce buffers and
pull/push the data before handing a task to a module.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia6edf266fe174eee4d28df0ca570c4d825436e60
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15948
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
316f9ea3f5 accel: pull data if srcbuf is in remote memory domain
If the source buffer is from a remote memory domain, we will now pull it
to the temporary bounce buffer before a task is executed.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I476684a4359410c69dd69a2b425b9e61d4c55a7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15947
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
78186df493 accel: put completed tasks on the completed queue in process_sequence
The first task on a sequence's task queue is the one that we're
currently executing.  By moving the place where we remove it from that
queue and place it on the completed queue to process_sequence(), we'll
be able to perform some extra steps (e.g. memory domain push) after a
task has been completed.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia98f491eb52be0156954372461e05c198c070e3b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15946
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
fb67fa548f accel: sequence state machine
Processing a sequence consists of multiple steps and we call
accel_process_sequence() mutliple times, so we need to check various
things to verify if some of those steps have already been done.  Having
a state machine allows us to reduce the number of such checks and makes
it easier to add additional steps.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I254819fee0893866de395193041b319cbad228ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15945
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
9a562043a7 accel: alloc buffers for data in remote memory domains
If a task has buffers in a remote memory domains, we'll now allocate a
buffer from local memory and replace the original buffer with it.  This
is the first step in supporting buffers in remote memory domains. To
fully support it, we'll also need to pull/push the data before/after
executing a task.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I3c86bbb6dbe6a31cb2cae8ce7d73e272ddc2734c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15944
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
957076108f accel: remove nbytes from spdk_accel_task
All operations are using iovecs to describe their buffers and only
encrypt/decrypt additionally used nbytes to store the total size of a
src buffer.  We don't really need this value in the generic accel code,
so we can let modules calculate it, if necessary.  That way, we won't
waste cycles calculating it if a module doesn't use it and it makes the
code a bit easier, as we won't have to deal with the fact that nbytes is
only valid for certain operations.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I29252be34a9af9fd40f4c7fec9d0a0c1139c562d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16306
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-17 23:34:43 +00:00
Konrad Sztyber
1866faffe2 accel: use iovecs for compress operations
Also, since this was the last operation using dst and nbytes, these
fields were removed from spdk_accel_task.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0d6b090e101c016d1bdcbe7a3bee7d6f691f1c9e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15943
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-17 23:34:43 +00:00