Patch 55f947933 ("bdev: remove spdk_bdev_ext_io_opts from spdk_bdev_io")
changed the way bdev_nvme submits IO to the NVMe driver causing
performance degradation for requests with iovcnt = 1, as they also had
to go through the path that executes the reset_sgl/next_sge callbacks.
This patch reverts those changes back to the original code checking
iovcnt and using the non-SGL functions if possible.
Suggested-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5e7c6620d38b7690ff862d8cd0075afacc578217
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16961
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This is the beginning of support for external snapshots. An external
snapshot is a read-only blobstore device (struct spdk_bs_dev) that can
be used as a blob's back device. Normally a blob will have no back
device (a normal blob), a zeroes back device (a thin provisioned blob),
or a blob back device (a clone blob). When a blob has an external
snapshot ("esnap") as its back device, it is called an esnap clone.
With this patch, esnap clones can be created but they are not yet
useful. Subsequent patches in the series will plumb the IO path, enable
various features, and allow lvol bdevs to be esnap clones.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I29206b628a2b03b6386a88532565e228df988e0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14969
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Perform all tests on devices that do and do not support
spdk_bs_dev::copy.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ic4c13ade9f45709b34a57f9fb7456d6c6a790a85
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16691
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This reworks how blob_snapshot_rw() tracks the number of bytes read and
written. It has no functional change: it simply makes the patch that
follows less complex.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ieeb738b6a814e7939931fecdfaf14b9f162d8431
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16861
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If the bs_dev was opened read-write, continue to take a
read-many-write-one claim. If it was opened read-only, take a
read-many-write-none claim.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I25d977c6961f962423899fb891ec912cd847930a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16282
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
External snapshots, which will be introduced in a later commit, will
need read-only blob_bdev instances. This support is partially needed to
support underlying devices that are naturally read-only and partially to
provide an extra layer of protection against accidental writes.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ibcb28d00ad644a6053aa5f4de15471c2cd8e348a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14968
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch removes references to deprecated PMEM from accel library.
The code that was executed when ACCEL_FLAG_PERSISTENT flag is set,
is no longer needed and is removed.
_sw_accel_copy() function is removed and replaced with memcpy(), as
after PMEM removal its functionality is the same as memcpy().
_sw_accel_dualcast() is no longer needed, replaced with direct calls
to memcpy()
Removed 'flags' parameter - it is no longer needed
accel_ut.c: removed references to PMDK
deprecation.md updated
ACCEL_FLAG_PERSISTENT flag will be removed in next patch.
Change-Id: I86130466fe7a5f6ee547df1517b803035ff41a7a
Signed-off-by: Marcin Spiewak <marcin.spiewak@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16899
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If a driver is registered and selected, it'll now be used to execute
sequences of accel operations. The driver has priority over accel
modules, so the modules will only be used to execute operations that the
driver cannot perform.
Once driver completes a task (or a number of tasks), it notifies accel
using standard spdk_accel_task_complete(). To let accel continue
processing a sequence, driver can call spdk_accel_sequence_continue().
This can be done when the driver executes all tasks (1), an error occurs
(2), or the driver doesn't know how to execute a given opcode (3). In
case of (3), that operation will be executed using appropriate accel
module and, while the rest of the sequence will be sent back to the
driver.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If414c02073ffc731454e03d25c7ee02bef58463b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16548
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Also test a case when opening an lvol fails due to insufficient
resources.
Change-Id: I8b1b7a9c4d67e93691f89541374c7ef09a7d3f18
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16944
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Fix for number of dwords which is 0 based as per spec.
Use bitwise operators instead of division and modulus.
Change-Id: Ib315bf9394ef599317f41429742e7b8054069549
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16814
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Remove libuuid usage on FreeBSD and add dedicated implementation of
spdk_uuid API using functions from the standard library.
Fixes: #2878
Change-Id: Ie49ccb2842acad6064bffd789e4f64b7365b6e5c
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16558
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
An example of async operation which can be handled on specific
transport layer could be creation of spdk thread followed by
a poller registration.
This change also aligns with transport destroy which is already
async operation.
Current transport create function is marked deprecated and is meant
for transports supporting sync create only to maintain backward
compatibility. Async version supports both create operations.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I1f5a477819e58f30983d26f81a1416bed1279ecf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16463
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The spdk_bdev_ext_io_opts structure is used to pass extra options when
submitting a bdev IO request, without having to modify/add functions to
handle new options. Additionally, the structure has a size field to
allow adding new fields without breaking the ABI (and thus having to
bump up the major version of a library).
It is also a part of spdk_bdev_io and there are several reasons for
removing it from that structure:
1. The size field only makes sense in structures that are passed
through pointers. And spdk_bdev_ext_io_opts is indeed passed as a
pointer to spdk_bdev_{readv,writev}_blocks_ext(), however it is
also embedded in spdk_bdev_io (internal.ext_opts_copy), which is
also part of the API. It means that each time a new field is added
to spdk_bdev_ext_io_opts, the size of spdk_bdev_io will also
change, so we will need to bump the major version of libspdk_bdev
anyway, thus making spdk_bdev_ext_io_opts.size useless.
2. The size field also makes internal.ext_opts cumbersome to use, as
each time one of its fields is accessed, we need to check the size.
Currently the code doesn't do that, because all of the existing
spdk_bdev_ext_io_opts fields were present when this structure was
initially introduced, but we'd need to do check the size before
accessing any new fields.
3. spdk_bdev_ext_io_opts has a metadata field, while spdk_bdev_io
already has u.bdev.md_buf, which means that we store the same thing
in several different places in spdk_bdev_io (u.bdev.md_buf,
u.bdev.ext_opts->metadata, internal.ext_opts->metadata).
Therefore, this patch removes all references to spdk_bdev_ext_io_opts
from spdk_bdev_io and replaces them with fields (memory_domain,
memory_domain_ctx) that were missing in spdk_bdev_io. Unfortunately,
this change breaks the API and requires changes in bdev modules that
supported spdk_bdev_io.u.bdev.ext_opts.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I49b7524eb84d1d4d7f12b7ab025fec36da1ee01f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16773
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Format LBA size (FLBAS) is updated to have:
Bit 3:0 as least significant 4 bits for format index
Bit 6:5 as most significant 2 bits for format index
NVMe format command fields are updated accordingly.
Add a new helper function to fetch the correct format index.
Update examples and unit test files accordingly.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Change-Id: I2d6d9045b9d65ae91cb18843ca75b59cc27ed2f2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16515
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Fix compile errors on older gcc versions (reproduced
on gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0).
The error is: initializer element is not constant.
Signed-off-by: Slawomir Ptak <slawomir.ptak@intel.com>
Change-Id: I3f56304649b141b6422d84257cdc386c5cb14cc4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16718
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
That way, we are sure that each test case starts with the same, clean
state of the g_seq_operations array and we don't need to manually zero
out each individual value.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6a45499a87480b0803f3af52c9e22b3bb68e9996
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16547
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It makes it possible to check the number of times a task was submitted
to be executed by a module, even if we defined a submit() function for
that opcode.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Id6b592b0461c722bf22ab04d5bad1a7542bb17e7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16546
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Starting in SPDK 23.01, calling spdk_bdev_register() and
spdk_bdev_examine() from a thread other than the app thread was
deprecated. This commit removes the deprecation and as such calling
these functions from a thread other than the app thread is an error.
As a side effect of this commit, all bdev module examine_config() and
examine_disk() callbacks will be called on the app thread.
Change-Id: Idaae06608101e2a513d9312ac5544ffe94effe4a
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15826
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
With the introduction bdev module claims v2, existing consumers should
transition off of v1 claims. This transitions blob bdevs from v1
exclusive writer claims to v2 read write once claims.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I1884585a540fa17ee341430e03de3c4f5d35322b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16168
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add a unit test to ensure spdk_bs_bdev_claim() takes an exclusive write
claim and it is released when the destroy callback is called.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ia5185545b148a8a83315c688a9c99a16b199063a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16230
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This creates a minimal test for module/blob/bdev/blob_bdev.c.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I04698863f3228a27f73a90d50f0d5fbde30c0870
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16229
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
If multiple claims exist on a bdev, examine_disk() is called for each of
them.
Change-Id: I0a6dc3e4bd1da20bbcbddf97a16e04c62c82354c
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15290
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This implements the v2 claims API. Compared to the original v1 claims,
v2 claims:
- Support read-write-once, read-write-many, and read-only-many claims.
- Are claimed with spdk_bdev_module_claim_desc().
- Are associated with a bdev descriptor that is passed to
spdk_bdev_module_claim_bdev_desc().
- Are released upon close of the bdev descriptor used to obain the
claim.
- Cannot be taken when a descriptor other than the one passed to
spdk_bdev_module_claim_bdev_desc() has write access.
Later commits in this series are needed to fully integrate them with the
bdev subsystem.
Change-Id: I39a356f5893aa45ac346623ec9ce0ec659b38975
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15288
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The next patch will improve media mgmt notifications but it will be
almost same as _resize_notify() and _remove_notify().
On the other hand, there are a few differences between _resize_notify()
and _remove_notify(). _remove_notify() will be better.
To avoid duplication, unify _resize_notify() and _remove_notify() by
adding abstraction event_notify() and _event_notify().
Add unit tests for the complex race conditions.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ibe2478479c61459c0da0db8d28c7273f05275e0f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16577
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
These routines can only handle a single buffer; double check that is the
case, and fail if not.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I136482c27c73655887c49405f747b8ed073f7b69
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16198
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Use req->iov as needed, to make it easier to remove req->data later.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I3084254ec44cfc4e11f8beccc61c895232daf272
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16197
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Use req->iov as needed, to make it easier to remove req->data later.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ie625f374e846f7e6afd6a5d143a5174d27d419b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16256
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Add a new API for incremental copying in or out of an iovec, and replace
current code to use the new API.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I088b784aef821310699478989e61411952066c18
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16193
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We should always called the unregister callback on
the same thread that spdk_bdev_unregister() was
originally called. So save the thread pointer and
use an spdk_thread_send_msg() to make sure it gets
called on the correct thread when the unregister
finishes.
Also add unit test that reproduces the original
issue.
Fixes issue #2883.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib3d89368aa358bc7a8db46a8a8cb6339340469d9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16554
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently we do not have a way to dump opts
for virtio_blk transports. This patch introduces
necessary changes to let us save and load those
via JOSN config.
Change-Id: I7ee4f31062f3d4a264f322e66a67ba3d075f1d75
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15248
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The addition of mbuf splitting requires a few more unit tests.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I24309911d40aec14e7f7c504be276c7f79e3ef1d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16094
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Quite a few changes from the vbdev compress unit tests
mainly due to plumbing and structural changes from the
code under test now being an accel_fw module instead of
a bdev module. Coverage of critical functions matches
what it was for the common code.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia40c7a0ed72a427e71c00607d93e215e0265fcb1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16076
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
The new compressdev unit tests will re-use quite a bit of code
from the old compress vbdev module so start this that so that the
next patch will be easier to review what's changed for as the
accel compressdev unit tests.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id91bb8630213e6046a5b38f31227476a33eb0675
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16063
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
directly
This patch removes hardcoded compressdev code from the
vbdev module and instead uses the accel_fw. The port required
a few changes based on how things are plumbed and accessed,
nothing that isn't be too obscure. CI tests were updated to
run ISAL accel_fw module as well as DPDK compressdev with QAT.
Unit tests for the new module will follow in a separate patch.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I769cbc888658fb846d89f6f0bfeeb1a2a820767e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13610
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
All DPDK related code is removed, handling of
RESET command was sligthly updated.
Handling of -ENOMEM was updated for cases when
accel API returns -ENOMEM
Crypto tests in blockdev.sh were extended with more
crypto_bdevs to verify NOMEM cases - that failed
with original vbdev_crypto implementation
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: If1feba2449bee852c6c4daca4b3406414db6fded
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14860
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Attempt to send any queued writes even if the socket is in a group. This
will not work if there is a send task already outstanding.
Change-Id: Icc0b5884e3d247042194ad26b30340ceb824886c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15212
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Make vbdev_crypto_destruct() return 1 to signal that program
execution should wait for spdk_bdev_destruct_done() function,
which is added inside _device_unregister_cb().
This change is related to _vdev_dev_get() not being able
to find the devices, when called from _cryptodev_sym_session_free(),
as it uses device driver name, which might already be freed.
This occurs only during bdev module finish, when crypto bdevs
are being unregistered and vbdev_crypto_finish() proceeds to
call bdev name deletion without waiting for the unregister
callbacks to complete, which ultimately results in reading
freed pointers.
This only happens when code execution takes path for DPDK 22.11+.
Change-Id: Id9a43d07c90aef7a82867383fd77354ac521a3e7
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16290
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch is a combination of commits which update vdev_crypto:
110d8411e bdev/crypto: do not create mempool for session private data
495055b05 bdev/crypto: update rte_cryptodev usage for DPDK 22.11
02caed6b5 bdev/crypto: remove mempool usage matching < DPDK 19.02
5887eb321 bdev/crypto: do not track type of crypto session
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I30c4f76e4e7b4865a7daa638d357888bb5e02071
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16039
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The io paths' stat will get lost when they are destroyed. Record
the stat in the nvme_ns structure.
Change-Id: I12fc0b04fac0d59e7465fe543ee733f2822a9cdb
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14744
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently we have stat per bdev I/O channel, but for NVMe bdev
multipath, we don't have stat per I/O path. Especially for
active-active mode, we may want to observe each path's statistics.
This patch support IO stat for nvme_io_path. Record each nvme_io_path
stat using structure spdk_bdev_io_stat.
The following is the comparison of bdevperf test.
Test on Arm server with the following basic configuration.
1 Null bdev: block size: 4K, num_blocks:16k
run bdevperf with io size=4k, qdepth=1/32/128, rw type=randwrite/mixed with 70% read/randread
Each time run 30 seconds, each item run for 16 times and get the average.
The result is as follows.
qdepth type IOPS(default) IOPS(this patch) diff
1 randwrite 7795157.27 7859909.78 0.83%
1 mix(70% r) 7418607.08 7404026.54 -0.20%
1 randread 8053560.83 8046315.44 -0.09%
32 randwrite 15409191.3 15327642.11 -0.53%
32 mix(70% r) 13760145.97 13714666.28 -0.33%
32 randread 16136922.98 16038855.39 -0.61%
128 randwrite 14815647.56 14944902.74 0.87%
128 mix(70% r) 13414858.59 13412317.46 -0.02%
128 randread 15508642.43 15521752.41 0.08%
Change-Id: I4eb5673f49d65d3ff9b930361d2f31ab0ccfa021
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14743
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Support to specify rr_min_io for multipath round-robin policy,
which makes I/O switches to another io path after rr_min_io I/Os are
rounted to current io path.
Change-Id: I09f0d8d24271c0178ff816fa63ce8576b6e8ae47
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15445
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Support selecting io path according to number of outstanding io of
each path in a channel. It's optional, and can be set by calling
RPC "bdev_nvme_set_multipath_policy -s queue_depth".
Change-Id: I82cdfbd69b3e105c973844c4f34dc98f0dca2faf
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14734
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When we submit more tasks than supported by qp,
extra tasks are queued on io_channel. Later completion
poller tries to resubmit these tasks one by one. That
is not efficient since every enqueu_burst may cause
doorbell updates in HW.
Instead add a check for qpir capacity and submit
appropriate number of requests. If qpair is full,
tasks are queued in dedicated list. This approach
should remove or minimize the need to resubmit
individual crypto operations.
This also handles a case where there are no entries
in global pools (crypto_ops or rte_mbuf)
Fixes issue #2756
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: Iab50e623e7a82a4f5bef7a1e4434e593240ab633
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15769
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Previously vbdev_crypto used DPDK directly and
the restriction on max IO size was propagated to
generic bdev layer which split big IO requests.
Now, when DPDK code is a standalone accel module,
this restriction on max IO size is not visible to
the user and we should get rid of it.
To remove this limitation, allow to submit crypto
operations for part of logical blocks in big IO,
the rest blocks will be processed when all submitted
crypto ops are completed.
To verify this patch, add a functional test which
submits big IO verify mode
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I0ee89e98195a5c744f3fb2bfc752b578965c3bc5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15768
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Move parallel arrays of response buffers and response SGLs from
qpair to a new responses object.
Use options to create the responses object.
Use spdk_zmalloc() to allocate the responses object because qpair
is also allocated by spdk_zmalloc().
The purpose is to share the code and the data structure between
SRQ is enabled and disabled.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ia23fe7328ae1f2f551fed5863fd1414f8567d602
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14172
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Unit tests are already linked with isa-l-crypto if CONFIG_ISAL_CRYPTO is
set, so there's no need to stub them. And by not stubbing them, we can
do tests involving actual encryption/decryption.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I162a2cd26112cc5adb8eeed7336f4280aa4bdb6b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16291
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Accel modules can now implement the get_memory_domains() callback to
indicate the types of memory domains they support. If unimplemented, a
module is assumed not to support memory domains and accel will take care
of pulling/pushing data to local buffers prior to passing a task to be
executed by a module.
For now, similarly to the bdev layer, we only check if a module supports
memory domains, but we don't verify the types of the domains. That
could be easily added in the future, if necessary.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia513f4f31124672b705b6dd33a2624f0ae94d3ce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16027
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It allows accel to store private data per each opcode/module without
having to change externally visible structures or allocate anything when
a module is registered. Since a single module can service multiple
opcodes at the same time, so some of these values might be duplicated.
However, there are only a handful of opcodes, so it shouldn't be a
problem.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I609a6ccc2d241cb9b8273cc2c6d1933d2bc25e0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If the destination buffer is in remote memory domain, we'll now push the
temporary bounce buffer to that buffer after a task is executed.
This means that users can now build and execute sequence of operations
using buffers described by memory domains. For now, it's assumed that
none of the accel modules support memory domains, so the code in the
generic accel layer will always allocate temporary bounce buffers and
pull/push the data before handing a task to a module.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia6edf266fe174eee4d28df0ca570c4d825436e60
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15948
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If the source buffer is from a remote memory domain, we will now pull it
to the temporary bounce buffer before a task is executed.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I476684a4359410c69dd69a2b425b9e61d4c55a7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15947
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
All operations are using iovecs to describe their buffers and only
encrypt/decrypt additionally used nbytes to store the total size of a
src buffer. We don't really need this value in the generic accel code,
so we can let modules calculate it, if necessary. That way, we won't
waste cycles calculating it if a module doesn't use it and it makes the
code a bit easier, as we won't have to deal with the fact that nbytes is
only valid for certain operations.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I29252be34a9af9fd40f4c7fec9d0a0c1139c562d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16306
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Also, since this was the last operation using dst and nbytes, these
fields were removed from spdk_accel_task.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0d6b090e101c016d1bdcbe7a3bee7d6f691f1c9e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15943
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Also, since this was the last operation using src, remove this field
from spdk_accel_task.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I55fd98697ef4f92a13dd0563b4adf9ccb0af171b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15942
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Implement the resize function for RAID0. raid0_resize() calculate the
new raid_bdev's block count and if it is different from the old block
count, call spdk_bdev_notify_blockcnt_change() with the new block count.
A raid0 bdev always opens all base bdevs. Hence, if the size of base
bdevs are reduced, resize fails now. This limitation will be removed
later.
Add a simple functional test for this feature. The test is to create
a raid0 bdev with two null bdevs, resize one null bdev, check if the
raid0 bdev is not resize, resize another null bdev, check if the raid0
bdev is resized.
test/iscsi_tgt/resize/resize.sh was used a reference to write the test.
Using jq rather than grep&sed is better and hence replace grep&sed by jq
of test/iscsi_tgt/resize/resize.sh together in this patch.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I07136648c4189b970843fc6da51ff40355423144
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16261
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Also, make it possible to remove copy operations following a fill
operation if they're using the same buffers.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I7da195ce80650a02c5db99d9400ee692f797b1f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15940
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Some of the copy operations can be elided, so they're not the best for
this kind of test. So, use another operation, decompress, that can be
appended to an accel sequence.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic59e7678436bdf1d5ab6eb103de4cc0c0c347b9f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16018
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Also, replace src2 with an iovec + iovcnt and rename it to s2 to
keep the naming consistent with the source buffer (s).
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I44787128377addd514818ec5aaec084b1a31f0c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15939
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Also, replace dst2 with an iovec + iovcnt and rename it to d2 to
keep the naming consistent with the destination buffer (d).
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib394c127eeb5890451535ff485f96f7edd2897a4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15938
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This patch is first in the series of patches aimed to make all accel
operations describe their buffers with iovecs. The intention is to make
it easier to handle tasks in a generic way.
It doesn't mean that we change the API - all function signatures are
preserved. If a function doesn't use iovecs, we use the aux_iovs array.
However, this does mean that each accel module that provides support for
a given operation will need to be adjusted to use iovecs.
Additionally, update the unit test checking copy elision to verify the
buffers of the copy operation that is left.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9e6d8d1be3b8b9706cb4a6222dad30e8c373d8fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15937
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Users can now specify buffers allocated through `spdk_accel_get_buf()`
when appending operations to a sequence. When an operation in a
sequence is executed, we check it if it uses buffers from accel domain,
allocate data buffers and update all operations within a sequence that
were also using those buffers.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I430206158f6a4289e15f04ddb18f0d1a2137f0b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15748
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
It's common to set up an iovec around a single buffer; add a helper for
this.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ic4183e29d78549ec102045c6af0b5ff448cb5c59
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
And use it in a couple of places.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I4b86cef0e9489c1435c0206dd6c5cda4ffe4d33a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16191
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
When a buffer is get, it does not need to reserve the space
for tailq header.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I0aa2d77739fbb86a6e2df1c00a772aff1cb7c6e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16181
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In order to connect to a zoned SPDK NVMe-oF target the ZNS specific
identify functions must be implemented and the supported ZNS opcodes
must be set accordingly.
Implementing ZNS specific identify functions to return the 'I/O Command
Set specific Identify Namespace data structure (CNS 05h)'
(`spdk_nvmf_ns_identify_iocs_specific`) and 'I/O Command Set specific
Identify Controller data structure (CNS 06h)'
(`spdk_nvmf_ctrlr_identify_iocs_specific`).
Those functions return a null filled data structure for any I/O Command
Set other than ZNS.
Signed-off-by: Dennis Maisenbacher <dennis.maisenbacher@wdc.com>
Change-Id: I6b9529ce0a86400afb01d4e09cbdb3e5c3a68514
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16044
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When an I/O gets an I/O error, the I/O path to which the I/O was
submitted may be still available. In this case, the I/O should be
retried to the same I/O path. However, a new I/O path was always
selected for an I/O retry.
For the active/passive policy, the same I/O path was selected naturally.
However, for the active/active policy, it was very likely that a
different I/O path was selected.
To use the same I/O path for an I/O retry, add a helper function
bdev_nvme_retry_io() into bdev_nvme_retry_ios() and replace
bdev_nvme_submit_request() by bdev_nvme_retry_io(). bdev_nvme_retry_io()
checks if nbdev_io->io_path is not NULL and is available. Then, call
_bdev_nvme_submit_request() if true, or call bdev_nvme_submit_request()
otherwise. For I/O path error, clear nbdev_io->io_path for
clarification. Add unit test to verify this change.
Linux kernel native NVMe multipath already takes this approach. Hence,
this change will be reasonable.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I7022aafd8b1cdd5830c4f743d64b080aa970cf8d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16015
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Richael <richael.zhuang@arm.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The following patches will change I/O retry to use the same io_path if
it is still available. However, bdev_nvme_submit_request() always calls
bdev_nvme_find_io_path() first. For I/O retry, if possible, we want to
skip calling bdev_nvme_find_io_path() and use nbdev_io->io_path instead.
To reuse the code as much as possible and not to touch the fast code
path, factor out request submit functions from
bdev_nvme_submit_request() into _bdev_nvme_submit_request().
While developing this patch, a bug/mismatch was found such that
bdev_io->internal.ch was different from ch of
bdev_nvme_submit_request(). Fix it together in this patch.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id003e033ecde218d1902bca5706c772edef5d5e5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16013
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The new module replaces functionality in vbdev_crypto.
This module is bdev agnostic, so some inernal parts
were reworked.
io_channel: contains a qp of every configured DPDK PMD
crypto key: for mlx5_pci we register a key on each available
device since keys are bound to Protection Domain.
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: If1845cb87eadacbb921c593ba82207a97f2209a3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14859
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This patch is just a copy of vbdev_crypto.c and the
corresponding UT file. It makes it easier to review
the next patch which adds accel operations
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: Ib88b45d573b011b1acb35da9bf4dab922d8fb183
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16182
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
* generic metadata support for raid modules
* raid is not created when metadata formats for base bdevs differ
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Change-Id: Ifaf9cfc4f2472c3820da1070deda758c5334edb2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13549
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Error counters for NVMe error was added in the generic bdev layer but
we want to know more detailed information for some use cases.
Add NVMe error counters per type and per code as module specific
statistics.
For status codes, the first idea was to have different named member
for each status code value. However, it was bad and too hard to test,
review, and maintain.
Instead, we have just two dimensional uint32_t arrays, and increment
one of these uint32_t values based on the status code type and status
code. Then, when dump the JSON, we use spdk_nvme_cpl_get_status_string()
and spdk_nvme_cpl_get_status_type_string().
This idea has one potential downside. This idea consumes 4 (types) *
256 (codes) * 4 (counter) = 4KB per NVMe bdev. We can make this smarter
if memory allocation is a problem. Hence we add an option
nvme_error_stat to enable this feature only if the user requests.
Additionally, the string returned by spdk_nvme_cpl_get_status_string()
or spdk_nvme_cpl_get_status_type_string() has uppercases, spaces, and
hyphens. These should not be included in JSON strings. Hence, convert
these via spdk_strcpy_replace().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I07b07621e777bdf6556b95054abbbb65e5f9ea3e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15370
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
spdk_nvme_cpl_get_status_string() returns a string which contains upper
cases, spaces, and hyphens. To use the returned string for JSON RPC, we
have to convert it to a string which contains only lowercases and
underscores.
For our convenience, add a new API spdk_strcpy_replace() to replace
all occurrences of the search string with the replacement string.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3ca9774d0bfb2d0bb7bd7412bc671e6f69104b7d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16054
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Added num_outstanding_reqs in struct spdk_nvme_qpair to record outstanding
req number in each qpair. This can be used by multipath to select I/O
path.
Increment num_outstaning_reqs when req is removed from free_req queue and
decrement it when req is put back in free_req queue.
Change-Id: I31148fc7d0a9a85bec4c56d1f6e3047b021c2f48
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15875
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
bdev_crypto uses memset() to zero secrets passed
by the user (cleanup/error path) which is not safe -
compiler may detect that the buffer being zeroed
is not accessed any more and may "optimize" (drop)
zerofying.
C11 standard introduces memset_s which guarantess to
change the buffer content, but this function is optional,
gcc may not support it. As alternative, add not optimal
from performance point of view default implementation.
Add unit test to math_ut.c to avoid creating new .c file
for 1 simple test
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I11c7d15610df02e4a3761a88c85f6f8c54fb4b0a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16038
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Attribute base_bdevs_max_degraded of raid_bdev_module struct is
replaced with more generic structure allowing implementation of
raid levels for which constraint is by number of operational
drives instead of maximum number of failed drives.
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Change-Id: Ie7079993d27d32118b865c3aabd92252a2807b94
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14411
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
It will be used for allocating buffers from accel domain and
allocating bounce buffers to push/pull the data from memory domains for
modules that don't support memory domains.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Idbe4d2129d0aff87d9e517214e9f81e8470c5088
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15745
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This domain is meant to represent data being transformed by accel
engine. Users will be able to allocate buffers from that memory domain
and use them when appending operations to an accel sequence.
Since these buffers are only meant to be used as placeholders for actual
buffers, none of the push/pull/translate callbacks are implemented. To
access the data after it was transformed by accel, users should make
sure that the final command's destination buffer isn't allocated from
accel memory domain.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia031c7b205e98792d0a93f01513101b86afa9faa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15744
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reversing a sequence means that the order of its operations is reversed,
i.e. the first operation becomes last and vice versa. It's especially
useful in read paths, as it makes it possible to build the sequence
during submission, then, once the data is read from storage, reverse the
sequence and execute it.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I93d617c1e6d251f8c59b94c50dc4300e51908096
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15636
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Operation sequence should always be treated as a whole, meaning that
users cannot rely on the contents of any intermediate buffers and should
only care about the buffer that's the destination of the whole
operation. This allows us to remove some of those copy operations by
changing source / destination buffer of a preceding / following
operation.
If a sequence is using buffers from non-local memory domain, users can
append a copy operation to a sequence to specify a local destination
buffer. If the module executing the operations is aware of memory
domains, this can avoid doing an extra spdk_memory_domain_pull_data().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I93b94d46ee32700819e9e6f1c55350692db8a67a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15530
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This patch introduces the concept of chaining multiple accel operations
and executing them all at once in a single step. This means that it
will be possible to schedule accel operations at different layers of the
stack (e.g. copy in NVMe-oF transport, crypto in bdev_crypto), but
execute them all in a single place. Thanks to this, we can take
advantage of hardware accelerators that supports executing multiple
operations as a single operation (e.g. copy + crypto).
This operation group is called spdk_accel_sequence and operations can be
appended to that object via one of the spdk_accel_append_* functions.
New operations are always added at the end of a sequence. Users can
specify a callback to be notified when a particular operation in a
sequence is completed, but they don't receive the status of whether it
was successful or not. This is by design, as they shouldn't care about
the status of an individual operation and should rely on other means to
receive the status of the whole sequence. It's also important to note
that any intermediate steps within a sequence may not produce observable
results. For instance, appending a copy from A to B and then a copy
from B to C, it's indeterminate whether A's data will be in B after a
sequence is executed. It is only guaranteed that A's data will be in C.
A sequence can also be reversed using spdk_accel_sequence_reverse(),
meaning that the first operation becomes last and vice versa. It's
especially useful in read paths, as it makes it possible to build the
sequence during submission, then, once the data is read from storage,
reverse the sequence and execute it.
Finally, there are two ways to terminate a sequence: aborting or
executing. It can be aborted via spdk_accel_sequence_abort() which will
execute individual operations' callbacks and free any allocated
resources. To execute it, one must use spdk_accel_sequence_finish().
For now, each operation is executed one by one and is submitted to the
appropriate accel module. Executing multiple operations as a single one
will be added in the future.
Also, currently, only fill and copy operations can be appended to a
sequence. Support for more operations will be added in subsequent
patches.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Id35d093e14feb59b996f780ef77e000e10bfcd20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15529
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This code has a similar potential problem as the identify
and log page commands did: stop using req->data in favour of IOVs.
We also need to fix the unit tests to initialize the iovs.
We don't change the existing "set" behaviour of requiring a single IOV
here.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I257567a7abd5fc3ed9ee21b432c7da7d70fbbde0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16122
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add a define for the Identify command buffer instead of using a raw
value.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I9073ff84e2fa2ef9268051b898fe1027d8e97baa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16119
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
In preparation for supporting additional claim types, create a claim
type that represents the current claim type. Everything that sticks to
the public APIs should continue to work as before.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I0d02e4b3f4bbf4eb5a7391028aa31e999f9da915
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15286
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In preparation for an updated claims API, refactor
bdev->internal.claim_module into a union that will eventually hold
different information based on the the type of claim.
Change-Id: I7ade6f03128bdb0f8375a95ae953cb63d6aa686d
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15285
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot