It's common to set up an iovec around a single buffer; add a helper for
this.
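As a rough illustration of the pattern, a helper of this kind could look like the sketch below. The name `toy_iov_one` and its signature are assumptions made for this example, not the helper added by this patch.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: point a one-element iovec at a single buffer and
 * report the element count, so callers need not fill the fields by hand.
 */
static inline void
toy_iov_one(struct iovec *iov, int *iovcnt, void *buf, size_t buflen)
{
	iov->iov_base = buf;
	iov->iov_len = buflen;
	*iovcnt = 1;
}
```

A caller would declare a `struct iovec` and an `int` count on the stack and pass both by address together with the buffer.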
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ic4183e29d78549ec102045c6af0b5ff448cb5c59
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
And use it in a couple of places.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I4b86cef0e9489c1435c0206dd6c5cda4ffe4d33a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16191
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
When a buffer is obtained via a get, it does not need to reserve space
for the tailq header.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I0aa2d77739fbb86a6e2df1c00a772aff1cb7c6e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16181
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In order to connect to a zoned SPDK NVMe-oF target the ZNS specific
identify functions must be implemented and the supported ZNS opcodes
must be set accordingly.
Implement the ZNS specific identify functions that return the 'I/O
Command Set specific Identify Namespace data structure (CNS 05h)'
(`spdk_nvmf_ns_identify_iocs_specific`) and the 'I/O Command Set specific
Identify Controller data structure (CNS 06h)'
(`spdk_nvmf_ctrlr_identify_iocs_specific`).
These functions return a zero-filled data structure for any I/O Command
Set other than ZNS.
Signed-off-by: Dennis Maisenbacher <dennis.maisenbacher@wdc.com>
Change-Id: I6b9529ce0a86400afb01d4e09cbdb3e5c3a68514
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16044
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_bdev_module_init() must only be called if the module sets
async_init to true. This patch fixes the doc string to match the
implementation and adds an assert() to catch API usage errors early.
Change-Id: I677345de028c8f7597ecf81ff9b9b855867bbf01
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16133
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
spdk_nvme_cpl_get_status_string() returns a string that contains
uppercase letters, spaces, and hyphens. To use the returned string for
JSON RPC, we have to convert it to a string that contains only lowercase
letters and underscores.
For our convenience, add a new API spdk_strcpy_replace() to replace
all occurrences of the search string with the replacement string.
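The replacement logic described above can be sketched as follows. This is a minimal stand-in written for illustration; the name `toy_strcpy_replace`, its argument order, and its error convention are assumptions, not the signature of spdk_strcpy_replace().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: copy src into dst (capacity size, including the
 * terminating NUL), replacing every occurrence of search with replace.
 * Returns 0 on success or -EINVAL on bad arguments or a too-small dst.
 */
static int
toy_strcpy_replace(char *dst, size_t size, const char *src,
		   const char *search, const char *replace)
{
	size_t search_len, replace_len, tail, out = 0;
	const char *hit;

	if (dst == NULL || src == NULL || search == NULL || replace == NULL) {
		return -EINVAL;
	}
	search_len = strlen(search);
	replace_len = strlen(replace);
	if (search_len == 0 || size == 0) {
		return -EINVAL;
	}

	while ((hit = strstr(src, search)) != NULL) {
		size_t prefix = (size_t)(hit - src);

		/* Keep at least one byte free for the terminating NUL. */
		if (out + prefix + replace_len >= size) {
			return -EINVAL;
		}
		memcpy(dst + out, src, prefix);
		out += prefix;
		memcpy(dst + out, replace, replace_len);
		out += replace_len;
		src = hit + search_len;
	}

	tail = strlen(src);
	if (out + tail >= size) {
		return -EINVAL;
	}
	memcpy(dst + out, src, tail + 1);
	return 0;
}
```

Turning a status string into a JSON-friendly key would take one call per character class (hyphen to underscore, space to underscore) plus a separate lowercase pass, which this sketch does not perform.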
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3ca9774d0bfb2d0bb7bd7412bc671e6f69104b7d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16054
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Added spdk_nvme_qpair_get_num_outstanding_reqs to get the number
of outstanding reqs for a specific qpair.
Change-Id: I55d75a7363ac63bd26db76594e70e8b17b3e5830
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15916
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
bdev_crypto uses memset() to zero secrets passed
by the user (cleanup/error path), which is not safe:
the compiler may detect that the buffer being zeroed
is not accessed any more and may "optimize" away (drop)
the zeroing.
The C11 standard introduces memset_s, which guarantees to
change the buffer content, but this function is optional and
gcc may not support it. As an alternative, add a default
implementation that is not optimal from a performance point of view.
Add a unit test to math_ut.c to avoid creating a new .c file
for one simple test.
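One common way to write such a fallback is to store through a volatile pointer, which the compiler has to treat as observable. The sketch below illustrates that technique under the assumption that byte-by-byte stores are acceptable; `toy_memset_s` is a hypothetical name and this is not necessarily the implementation added by this patch.

```c
#include <stddef.h>

/*
 * Hypothetical fallback for platforms without memset_s(): every store
 * goes through a volatile lvalue, so the compiler may not drop it even
 * if the buffer is never read again. Slower than memset(), since the
 * stores are issued one byte at a time.
 */
static void
toy_memset_s(void *buf, int value, size_t len)
{
	volatile unsigned char *p = buf;

	while (len-- > 0) {
		*p++ = (unsigned char)value;
	}
}
```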
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I11c7d15610df02e4a3761a88c85f6f8c54fb4b0a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16038
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It's the same as spdk_iovcpy(), but the dst/src buffers can overlap.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6daa0a846d7d1deac2c01d1a1be09171fa8bf796
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15747
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The data buffers backed by these accel buffers aren't allocated
immediately, but only when they're necessary to execute a given
operation. It allows users to append operations to a sequence, without
actually reserving large space for the data. That way, if some of these
buffers aren't needed to execute a sequence, they won't be allocated.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ieeea8a011b40c7f2f33e9a6f03fe34264e9316f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15746
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This domain is meant to represent data being transformed by accel
engine. Users will be able to allocate buffers from that memory domain
and use them when appending operations to an accel sequence.
Since these buffers are only meant to be used as placeholders for actual
buffers, none of the push/pull/translate callbacks are implemented. To
access the data after it was transformed by accel, users should make
sure that the final command's destination buffer isn't allocated from
accel memory domain.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia031c7b205e98792d0a93f01513101b86afa9faa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15744
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch introduces the concept of chaining multiple accel operations
and executing them all at once in a single step. This means that it
will be possible to schedule accel operations at different layers of the
stack (e.g. copy in NVMe-oF transport, crypto in bdev_crypto), but
execute them all in a single place. Thanks to this, we can take
advantage of hardware accelerators that support executing multiple
operations as a single operation (e.g. copy + crypto).
This operation group is called spdk_accel_sequence and operations can be
appended to that object via one of the spdk_accel_append_* functions.
New operations are always added at the end of a sequence. Users can
specify a callback to be notified when a particular operation in a
sequence is completed, but they don't receive the status of whether it
was successful or not. This is by design, as they shouldn't care about
the status of an individual operation and should rely on other means to
receive the status of the whole sequence. It's also important to note
that any intermediate steps within a sequence may not produce observable
results. For instance, when appending a copy from A to B and then a copy
from B to C, it's indeterminate whether A's data will be in B after the
sequence is executed. It is only guaranteed that A's data will be in C.
A sequence can also be reversed using spdk_accel_sequence_reverse(),
meaning that the first operation becomes last and vice versa. It's
especially useful in read paths, as it makes it possible to build the
sequence during submission, then, once the data is read from storage,
reverse the sequence and execute it.
Finally, there are two ways to terminate a sequence: aborting or
executing. It can be aborted via spdk_accel_sequence_abort() which will
execute individual operations' callbacks and free any allocated
resources. To execute it, one must use spdk_accel_sequence_finish().
For now, each operation is executed one by one and is submitted to the
appropriate accel module. Executing multiple operations as a single one
will be added in the future.
Also, currently, only fill and copy operations can be appended to a
sequence. Support for more operations will be added in subsequent
patches.
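The append, reverse, and terminate semantics described above can be modelled with a small, self-contained toy. Everything below (the `toy_*` names, the fixed-size array, the synchronous execution) is an assumption made for illustration; the real spdk_accel_sequence API is asynchronous and its signatures are different.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SEQ_MAX_OPS 8

enum toy_op_type { TOY_OP_FILL, TOY_OP_COPY };

typedef void (*toy_step_cb)(void *cb_arg);

struct toy_op {
	enum toy_op_type type;
	void *dst;
	const void *src;	/* copy only */
	unsigned char pattern;	/* fill only */
	size_t len;
	toy_step_cb cb;		/* optional; carries no status, by design */
	void *cb_arg;
};

struct toy_sequence {
	struct toy_op ops[TOY_SEQ_MAX_OPS];
	size_t count;
};

/* New operations always go at the end of the sequence. */
static int
toy_append(struct toy_sequence *seq, struct toy_op op)
{
	if (seq->count == TOY_SEQ_MAX_OPS) {
		return -1;
	}
	seq->ops[seq->count++] = op;
	return 0;
}

/* The first operation becomes the last and vice versa. */
static void
toy_reverse(struct toy_sequence *seq)
{
	for (size_t i = 0; i < seq->count / 2; i++) {
		struct toy_op tmp = seq->ops[i];

		seq->ops[i] = seq->ops[seq->count - 1 - i];
		seq->ops[seq->count - 1 - i] = tmp;
	}
}

/*
 * Terminate the sequence: execute != 0 runs each operation one by one,
 * execute == 0 aborts. Either way the per-operation callbacks are
 * invoked and the sequence is emptied.
 */
static void
toy_terminate(struct toy_sequence *seq, int execute)
{
	for (size_t i = 0; i < seq->count; i++) {
		struct toy_op *op = &seq->ops[i];

		if (execute && op->type == TOY_OP_FILL) {
			memset(op->dst, op->pattern, op->len);
		} else if (execute) {
			memcpy(op->dst, op->src, op->len);
		}
		if (op->cb != NULL) {
			op->cb(op->cb_arg);
		}
	}
	seq->count = 0;
}

/* Example step callback: counts how many operations were notified. */
static void
toy_count_cb(void *cb_arg)
{
	(*(int *)cb_arg)++;
}
```

Because this toy runs every step in order, the intermediate buffer B happens to hold A's data after execution; per the text above, callers of the real API should not rely on that.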
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Id35d093e14feb59b996f780ef77e000e10bfcd20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15529
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Add a define for the Identify command buffer instead of using a raw
value.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I9073ff84e2fa2ef9268051b898fe1027d8e97baa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16119
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
In preparation for supporting additional claim types, create a claim
type that represents the current claim type. Everything that sticks to
the public APIs should continue to work as before.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I0d02e4b3f4bbf4eb5a7391028aa31e999f9da915
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15286
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In preparation for an updated claims API, refactor
bdev->internal.claim_module into a union that will eventually hold
different information based on the type of claim.
Change-Id: I7ade6f03128bdb0f8375a95ae953cb63d6aa686d
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15285
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
This closes races between concurrent spdk_bdev_module_claim_bdev()
and/or spdk_bdev_module_release_bdev() calls affecting the same bdev by
holding bdev->internal.spinlock while claiming and releasing a bdev. It
also closes a potential TOCTOU bug in
bdev_finish_unregister_bdevs_iter() that optimizing compilers probably
already eliminate, and documents that bdev->internal.claim_module is
protected by bdev->internal.spinlock.
This can be removed when the bdev_register_examine_thread deprecation
is removed.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ib48552df065d5172139a61bbc00b391f36552c0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15282
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Since bdev_examine() can happen on any thread and it happens without any
other lock being held on the spdk_bdev_module, it is possible for
multiple threads to try to simultaneously increment
module->internal.action_in_progress. Decrements may also race.
This commit adds bdev_module->internal.spinlock and holds it while
modifying module->internal.action_in_progress.
This can be removed when the bdev_register_examine_thread deprecation
is removed.
Change-Id: I9c401eeb3c7c97c484e16fa9cfd82668b32e508b
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15281
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
This introduces a deprecation for calling spdk_bdev_register() and
spdk_bdev_examine() on a thread other than the app thread. The
deprecation period starts in SPDK 23.01 and removal is expected in SPDK
23.05.
The intent of this deprecation is to ensure that bdev modules'
examine_config() and examine_disk() callbacks are only ever called on
the app thread. This is largely a formalization of what has long happened
due to the RPC poller running on the first thread started by
spdk_app_start().
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ic9d7b87b6522be20357d2eab2d0c77cd5753452f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15690
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
The spdk_nvme_ctrlr_opts now supports a transport_tos option
that allows setting of the 'type of service' value in the IPv4 header.
This is needed to support lossless RoCE setups.
Note: Only RDMA is supported at this point.
Change-Id: I21825fc197c60f539a7d2d651a970ea380d8b56d
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add spdk_nvme_cpl_get_status_type_string() to return ASCII
string for the type of an error.
Append a dummy entry to return "RESERVED" for unknown types.
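The lookup-with-fallback idea can be sketched as a table whose last entry is the dummy. The table contents, the strings other than "RESERVED", and the `toy_*` names below are assumptions for illustration and may not match the strings SPDK actually returns.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_sct_string {
	uint16_t type;
	const char *str;
};

/* The final entry is the dummy that catches every unknown type. */
static const struct toy_sct_string toy_sct_strings[] = {
	{ 0x0, "GENERIC" },
	{ 0x1, "COMMAND SPECIFIC" },
	{ 0x2, "MEDIA ERROR" },
	{ 0x3, "PATH" },
	{ 0x7, "VENDOR SPECIFIC" },
	{ 0xFFFF, "RESERVED" },
};

static const char *
toy_status_type_string(uint16_t type)
{
	size_t last = sizeof(toy_sct_strings) / sizeof(toy_sct_strings[0]) - 1;

	/* Search the real entries only; fall through to the dummy. */
	for (size_t i = 0; i < last; i++) {
		if (toy_sct_strings[i].type == type) {
			return toy_sct_strings[i].str;
		}
	}
	return toy_sct_strings[last].str;
}
```

Keeping the fallback inside the table means the function never returns NULL, so callers can print the result without a check.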
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibc07132ee067f146ac149884c6344f313bfcbfff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15835
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Based on NVMe-2.0c, add newly added status codes to the corresponding
enums.
Status codes of 0x80 to 0xBF are different between I/O commands and
fabrics commands. 0x80 to 0xBF of enum spdk_nvme_command_specific_status_code
has been used for I/O commands. Hence, add status codes for I/O commands
for consistency.
Command specific status codes for fabrics commands will be considered
later.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I8f549e76420ee72dcaf412c5941d74d8359761c9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15833
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
However, when querying or resetting module specific statistics,
the generic bdev layer has to access them.
For this purpose, add function pointers.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ie86d0a4a406cec7e0f1e9a62de5982cd3d877eae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14839
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Define struct spdk_bdev_io_error_stat privately in lib/bdev/bdev.c.
Add a pointer to struct spdk_bdev_io_error_stat to struct
spdk_bdev_io_stat.
Allocate spdk_bdev_io_error_stat for bdev and RPC, but do not allocate
spdk_bdev_io_error_stat for I/O channel.
Dump the contents of spdk_bdev_io_error_stat only if its total is
non-zero.
As a result of these, only spdk_bdev_get_device_stat() can query
spdk_bdev_io_error_stat for the bdev_get_iostat RPC. This will be
acceptable.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Idae868afe65347a96529eedc3dcc692101de4a29
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14826
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We can allocate an array for error status dynamically via negating
SPDK_MIN_BDEV_IO_STATUS.
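Since the error statuses are negative, negating the smallest one gives the number of counters needed. A minimal sketch of that sizing and indexing trick follows; the enum values and the `toy_*` names are hypothetical stand-ins, not the real spdk_bdev_io_status definitions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical status enum: errors are negative, MIN aliases the lowest. */
enum toy_io_status {
	TOY_IO_STATUS_ABORTED = -3,
	TOY_IO_STATUS_NOMEM = -2,
	TOY_IO_STATUS_FAILED = -1,
	TOY_IO_STATUS_PENDING = 0,
	TOY_IO_STATUS_SUCCESS = 1,
	TOY_MIN_IO_STATUS = TOY_IO_STATUS_ABORTED,
};

/* One zeroed counter per error status: -TOY_MIN_IO_STATUS entries. */
static uint32_t *
toy_error_stat_alloc(void)
{
	return calloc(-TOY_MIN_IO_STATUS, sizeof(uint32_t));
}

/* Map status -1 to index 0, -2 to index 1, and so on; ignore non-errors. */
static void
toy_error_stat_record(uint32_t *error_count, enum toy_io_status status)
{
	if (status < 0 && status >= TOY_MIN_IO_STATUS) {
		error_count[-status - 1]++;
	}
}
```

If a new error status is later added below the current minimum, only the MIN alias needs updating for the array to grow with it.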
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id36a92bfaa906b445715c03b69a0fd9a154a49e0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15898
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When merging data from one spdk_histogram_data to
another, the merging is only valid if the bucket_shift
for each structure is the same. Otherwise we are
combining data points that cover different ranges
of values.
So check that the bucket_shifts are the same before
merging. Change the return type to int to
return -EINVAL if an attempt is made to merge
structures with different bucket_shifts.
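The check can be illustrated with a simplified histogram. The struct layout, the bucket count, and the `toy_*` names below are assumptions for this sketch and do not mirror spdk_histogram_data.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_BUCKETS 8

/* Simplified stand-in: bucket_shift determines the range of each bucket. */
struct toy_histogram {
	uint32_t bucket_shift;
	uint64_t bucket[TOY_NUM_BUCKETS];
};

/* Add src's counts into dst, but only if both use the same bucket ranges. */
static int
toy_histogram_merge(struct toy_histogram *dst, const struct toy_histogram *src)
{
	if (dst->bucket_shift != src->bucket_shift) {
		return -EINVAL;
	}
	for (size_t i = 0; i < TOY_NUM_BUCKETS; i++) {
		dst->bucket[i] += src->bucket[i];
	}
	return 0;
}
```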
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If98e2d03384d85f478965956da2a42cfcff4713d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15813
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Basic IO completion counting can be done at the common
layer, to enable some level of stat tracking even for
transports that don't have transport-specific tracking
yet.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If04f854b97440089b8ad149b64cb59173c73975c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15912
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
The internal mempools were replaced with the newly added iobuf
interface.
To make sure we respect spdk_bdev_opts's (small|large)_buf_pool_size, we
call spdk_iobuf_set_opts() from spdk_bdev_set_opts(). These two options
are now deprecated and users should switch to spdk_iobuf_set_opts().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib1424dc5446796230d103104e272100fac649b42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15328
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Users can now specify a number of small/large buffers to be cached on
each iobuf channel. Previously, we relied on the cache of the
underlying spdk_mempool, which has per-core caches. However, since iobuf
channels are tied to a module and an SPDK thread, each module and each
thread is now guaranteed to have a number of buffers available, so it
won't be starved by other modules/threads.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1e29fe29f78a13de371ab21d3e40bf55fbc9c639
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15634
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The idea behind "iobuf" is to have a single place for allocating data
buffers across different libraries. That way, each library won't need
to allocate its own mempools, therefore decreasing the memory footprint
of the whole application.
There are two reasons for putting these kinds of functions in the thread
library. Firstly, the code is pretty small, so it doesn't make sense to
create a new library. Secondly, it relies on the IO channel abstraction,
so users will need to pull in the thread library anyway.
It's very much inspired by the way bdev layer handles data buffers (much
of the code was directly copied over). There are two global mempools,
one for small and one for large buffers, and per-thread queues that hold
requests waiting for a buffer. The main difference is that we also need
to track which module requested a buffer in order to allow users to
iterate over its pending requests.
The usage is fairly simple:
```
/* Embed spdk_iobuf_channel into an existing IO channel */
struct foo_channel {
...
struct spdk_iobuf_channel iobuf;
};
/* Embed spdk_iobuf_entry into objects that will request buffers */
struct foo_object {
...
struct spdk_iobuf_entry entry;
};
/* Register the module as iobuf user */
spdk_iobuf_register_module("foo");
/* Initialize iobuf channel in foo_channel's create cb */
spdk_iobuf_channel_init(&foo_channel->iobuf, "foo", 0, 0);
/* Finally, request a buffer... */
buf = spdk_iobuf_get(&foo_channel->iobuf, length,
&foo_object.entry, buf_get_cb);
...
/* ...and release it */
spdk_iobuf_put(&foo_channel->iobuf, buf, length);
```
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ifaa6934c03ed6587ddba972198e606921bd85008
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15326
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The kernel vfio_pci driver module introduced a vf_token checking
mechanism in kernel version 5.7, and it is supported by
DPDK. So add support for it to handle the VF scenario.
Signed-off-by: Jun Zeng <jun1.zeng@intel.com>
Change-Id: Ie9700fa395327da4e847c6213167284c148a64e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14424
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Found with misspell-fixer.
Signed-off-by: Michal Berger <michal.berger@intel.com>
Change-Id: If062df0189d92e4fb2da3f055fb981909780dc04
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15207
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Copy-on-write happens when a cluster is written for the first time on a
thin provisioned volume. Currently it is implemented as two separate
requests to the underlying bdev: a read of the whole cluster into a
bounce buffer and then a write of this buffer to the new location on the
same underlying bdev.
This patch improves the copy-on-write flow by utilizing the copy command
of the underlying bdev if it is supported. In this case we have just one
request to the bdev and don't need the bounce buffer.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I92552e0f18f7a41820d589e7bb1e86160c69183f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14351
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The new `translate_lba` operation translates a blob LBA to an LBA on
the underlying bdev. It recurses down the whole chain of bs_devs. The
operation may fail to do the translation when the blob LBA is not backed
by a real bdev, for example, when we eventually hit the zeroes device in
the chain.
This operation is used in the next commit to get the source LBA for the
copy operation.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I89c2d03d1982d66b9137a3a3653a98c361984fab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14528
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Add max/min_read/write/unmap_latency_ticks to struct
spdk_bdev_io_stat.
When initializing or resetting an instance of struct
spdk_bdev_io_stat, initialize max to 0 and min to UINT64_MAX.
Then update max if a new value is larger than the current max,
and update min if a new value is smaller than the current min.
The bdev_get_iostat RPC prints max as is, and prints min if it is not
UINT64_MAX, or 0 if it is UINT64_MAX.
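The reset, update, and report rules can be sketched for a single I/O type as follows. The struct and the `toy_*` function names are hypothetical; only the UINT64_MAX sentinel logic is taken from the description above.

```c
#include <stdint.h>

/* Hypothetical single-type view of the max/min latency fields. */
struct toy_latency_stat {
	uint64_t max_latency_ticks;
	uint64_t min_latency_ticks;
};

/* UINT64_MAX acts as a "no sample yet" sentinel for min. */
static void
toy_latency_reset(struct toy_latency_stat *stat)
{
	stat->max_latency_ticks = 0;
	stat->min_latency_ticks = UINT64_MAX;
}

static void
toy_latency_update(struct toy_latency_stat *stat, uint64_t ticks)
{
	if (ticks > stat->max_latency_ticks) {
		stat->max_latency_ticks = ticks;
	}
	if (ticks < stat->min_latency_ticks) {
		stat->min_latency_ticks = ticks;
	}
}

/* Value to print for min: hide the sentinel by reporting 0 instead. */
static uint64_t
toy_latency_reported_min(const struct toy_latency_stat *stat)
{
	return stat->min_latency_ticks == UINT64_MAX ? 0 : stat->min_latency_ticks;
}
```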
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I1b30b3825c15e37e9f0cf20104b866186de788a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14825
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The following patches will extend I/O statistics to include error
counters and module specific counters to output these via the
bdev_get_iostat RPC.
In this case, the size of the struct spdk_bdev_io_stat will be variable.
As a preparation, allocate spdk_bdev_io_stat dynamically.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I1979a9d867859d5cb5d05717bfcc677f07fa03f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15479
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
When use of deprecated features is encountered, SPDK now calls
SPDK_LOG_DEPRECATED(). This logs the use of deprecated functionality in
a consistent way, making it easy to add further instrumentation to catch
code paths that trigger deprecated behavior.
Change-Id: Idfd33ade171307e5e8235a7aa0d969dc5d93e33d
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15689
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
This introduces an enhanced spinlock that adds safeguards compared to
the default pthread_spinlock_t. In particular:
- A pthread_spinlock_t is still used, but additional error checking is
performed to ensure there is no undefined behavior on relock,
unlocking when not the owner, or destroying a locked lock.
- The SPDK concurrency model allows an SPDK thread to be migrated
between pthreads. Releasing a pthread spinlock on a different thread
from where it is taken is undefined behavior. If an SPDK spinlock is
held at the time a poller or message returns control to
thread_poll(), the program will abort.
- SPDK spinlocks can only be obtained from an SPDK thread.
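The ownership checks from the first bullet can be illustrated with a toy lock. This sketch is built on C11 atomics rather than pthread_spinlock_t, identifies threads by a caller-supplied integer, and returns error codes instead of aborting; all of those choices and every `toy_*` name are assumptions for illustration only.

```c
#include <errno.h>
#include <stdatomic.h>

#define TOY_NO_OWNER 0

struct toy_spinlock {
	atomic_flag locked;
	atomic_int owner;	/* id of the holder, or TOY_NO_OWNER */
};

static void
toy_spin_init(struct toy_spinlock *lock)
{
	atomic_flag_clear(&lock->locked);
	atomic_init(&lock->owner, TOY_NO_OWNER);
}

/* self is a nonzero id for the calling thread. */
static int
toy_spin_lock(struct toy_spinlock *lock, int self)
{
	if (atomic_load(&lock->owner) == self) {
		return -EDEADLK;	/* relock by the current owner */
	}
	while (atomic_flag_test_and_set_explicit(&lock->locked, memory_order_acquire)) {
		/* spin until the holder releases the lock */
	}
	atomic_store(&lock->owner, self);
	return 0;
}

static int
toy_spin_unlock(struct toy_spinlock *lock, int self)
{
	if (atomic_load(&lock->owner) != self) {
		return -EPERM;		/* unlock by a non-owner */
	}
	atomic_store(&lock->owner, TOY_NO_OWNER);
	atomic_flag_clear_explicit(&lock->locked, memory_order_release);
	return 0;
}

static int
toy_spin_destroy(struct toy_spinlock *lock)
{
	/* Refuse to destroy a lock that is still held. */
	return atomic_load(&lock->owner) != TOY_NO_OWNER ? -EBUSY : 0;
}
```

Tracking the owner next to the raw lock is what turns each misuse into a detectable condition instead of undefined behavior.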
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I6dd6493ab5f5532ae69e20654546405a507eb594
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15277
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
This is unsafe, because we touch need_buf_* queues, which aren't
thread-safe. Also, documented this requirement in
spdk_bdev_io_get_buf()'s description.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iabc141e051c543fdd51f079ae212f69e980d8148
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15668
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
We make decisions on how to pick a poll group for a new
qpair by looking at each poll group's current_io_qpairs
count. But this count isn't always accurate since it
doesn't get updated until after the CONNECT has
been received.
This means that if we accept a bunch of connections
all at once, they may all get assigned the same poll
group, because the target poll group's counter doesn't
get immediately incremented.
So add a new counter, current_unassociated_qpairs,
to account for these qpairs. We protect this counter
with a lock, since the accept thread will increment
the counter, and the poll group thread will
decrement it when the qpair receives the CONNECT
allowing us to associate it with a subsystem/controller.
If the qpair gets destroyed before the CONNECT is
received, we can use the qpair->connect_received
flag to decrement current_unassociated_qpairs.
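The effect of the extra counter on poll group selection can be sketched as below. The selection rule (lowest sum of both counters), the `toy_*` names, and the omission of the lock are assumptions made to keep the example short; they are not taken from the nvmf implementation.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_poll_group {
	uint32_t current_io_qpairs;		/* CONNECT already received */
	uint32_t current_unassociated_qpairs;	/* accepted, CONNECT pending */
};

/*
 * Accept path: pick the group with the fewest qpairs of either kind and
 * account for the new qpair right away, before any CONNECT arrives.
 * Locking is omitted in this single-threaded sketch.
 */
static size_t
toy_accept_qpair(struct toy_poll_group *groups, size_t num_groups)
{
	size_t best = 0;

	for (size_t i = 1; i < num_groups; i++) {
		uint32_t load = groups[i].current_io_qpairs +
				groups[i].current_unassociated_qpairs;
		uint32_t best_load = groups[best].current_io_qpairs +
				     groups[best].current_unassociated_qpairs;

		if (load < best_load) {
			best = i;
		}
	}
	groups[best].current_unassociated_qpairs++;
	return best;
}

/* Poll group path: CONNECT moves the qpair from one counter to the other. */
static void
toy_connect_received(struct toy_poll_group *group)
{
	group->current_unassociated_qpairs--;
	group->current_io_qpairs++;
}
```

With only the I/O qpair counter, three back-to-back accepts would all land on group 0; with both counters they alternate between the groups.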
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8bba8da2abfe225b3b9f981cd71b6f49e2b87391
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15693
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Currently we use qpair->ctrlr at qpair destroy
time to decide if we need to decrement the
qpair's poll group's qpair count. But this is
not correct - these counters get incremented
when the CONNECT is received, but qpair->ctrlr
doesn't get set until later.
So add a new connect_received bool to the spdk_nvmf_qpair.
Use this instead to determine when we should decrement
the poll group qpair counters.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I174a0fda36c4558171953bf58f2f5117bc074f76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15692
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
The NVMf target reports copy command support if all bdevs in the
subsystem support the copy IO type. The maximum copy size is reported
for each namespace independently in the namespace identify data. For now
we support just one source range.
Note that command support in the controller is initialized once on
controller creation. If another namespace which doesn't support the copy
command is added to the subsystem later, it will not be reflected in
the controller data structure and will not be communicated to the
initiator. An attempt to execute the copy command on such a namespace
will fail. This issue is not specific to the copy command and also
applies to write zeroes and unmap (dataset management) commands.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I5f06564eb43d66d2852bf7eeda8b17830c53c9bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14350
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Added new API 'spdk_bdev_histogram_get_channel' to get histogram of
a specified channel for a bdev. A callback function is passed to it
to process the histogram.
Change-Id: If5d56cbb5fe6c39cda7882f887dcc9c6afa769ac
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15539
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
SPDK threads generally run on dedicated cores and locks should be rarely
contended. Thus, putting a thread to sleep while waiting on a mutex does
not free up CPU cycles for other pthreads or processes. Even when
running in interrupt mode, lock contention should be low enough that
spinlocks are a net win by avoiding context switches.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I6e2e78b2835bbadb56bbec34918d998d75280dfd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>