Commit Graph

2123 Commits

Author SHA1 Message Date
Mike Gerdts
7241a075be bdev: hold spinlock while changing claim_module
This closes races between concurrent spdk_bdev_module_claim_bdev()
and/or spdk_bdev_module_release_bdev() calls affecting the same bdev by
holding bdev->internal.spinlock while claiming and releasing a bdev. It
also closes a potential TOCTOU bug in that optimizing compilers probably
already eliminate in bdev_finish_unregister_bdevs_iter() and documents
that bdev->internal.claim_module is protected by
bdev->internal.spinlock.

This can be removed when the bdev_register_examine_thread deprecation
is removed.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ib48552df065d5172139a61bbc00b391f36552c0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15282
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
2023-01-05 23:28:32 +00:00
Mike Gerdts
b5075dcc5b bdev: action_in_progress counting is racy
Since bdev_examine() can happen on any thread and it happens without any
other lock being held on the spdk_bdev_module, it is possible for
multiple threads to try to simultaneously increment
module->internal.action_in_progress. Decrements may also race.

This commit adds bdev_module->internal.spinlock and holds it while
modifying module->internal.action_in_progress.

This can be removed when the bdev_register_examine_thread deprecation
is removed.

Change-Id: I9c401eeb3c7c97c484e16fa9cfd82668b32e508b
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15281
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
2023-01-05 23:28:32 +00:00
Mike Gerdts
a6e58cc44c bdev: examine and register on app thread
This introduces a deprecation for calling spdk_bdev_register() and
spdk_bdev_examine() on a thread other than the app thread. The
deprecation period starts in SPDK 23.01 and removal is expected in SPDK
23.05.

The intent of this deprecation is to ensure that bdev modules'
examine_config() and examine_disk() callbacks are only ever called on
the app thread. This largely a formalization of what has long happened
due to the RPC poller running on the first thread started by
spdk_app_start().

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ic9d7b87b6522be20357d2eab2d0c77cd5753452f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15690
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
2023-01-05 23:28:32 +00:00
Kefu Chai
6ac1082864 nvme: fix a typo in doxygen comment
s/HVMe/NVMe/

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Change-Id: Ie1ae542a0d1a98f84646460ce94b5c1cb70773e5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16139
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-05 23:26:07 +00:00
Michael Haeuptle
7706450f2a nvme_rdma: Support TOS for RDMA initiator
The spdk_nvme_ctrlr_opts now supports a transport_tos option
that allows setting of the 'type of service' value in the IPv4 header.

This is needed to support lossless RoCE setups.

Note: Only RDMA is supported at this point.

Change-Id: I21825fc197c60f539a7d2d651a970ea380d8b56d
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-05 19:54:53 +00:00
Shuhei Matsumoto
ce92d919d7 nvme: Add a helper function to return status type string
Add spdk_nvme_cpl_get_status_type_string() to return ASCII
string for the type of an error.

Append a dummy entry to return "RESERVED" for unknown types.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibc07132ee067f146ac149884c6344f313bfcbfff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15835
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-04 08:22:31 +00:00
Shuhei Matsumoto
5ce17cb89b nvme: Add new values to nvme_status_code enums based on NVMe 2.0c
Based on NVMe-2.0c, add newly added status codes to the corresponding
enums.

Status codes of 0x80 to 0xBF are different between I/O commands and
fabrics commands. 0x80 to 0xBF of enum spdk_nvme_command_specific_status_code
has been used for I/O commands. Hence, add status codes for I/O commands
for consistency.

Command specific status codes for fabrics commands will be considered
later.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I8f549e76420ee72dcaf412c5941d74d8359761c9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15833
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-04 08:22:31 +00:00
Shuhei Matsumoto
8c439a6799 bdev: Add function pointers to display and reset module specific I/O statistics
However, when querying or resetting module specific statistics,
the generic bdev layer have to access it.

For this purpose, add functions pointers.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ie86d0a4a406cec7e0f1e9a62de5982cd3d877eae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14839
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-04 08:22:31 +00:00
Shuhei Matsumoto
53a9a8c4d1 bdev: Add counts per I/O error status into I/O statistics
Define struct spdk_bdev_io_error_stat privately in lib/bdev/bdev.c.

Add a pointer to struct spdk_bdev_io_error_stat to struct
spdk_bdev_io_stat.

Allocate spdk_bdev_io_error_stat for bdev and RPC, but do not allocate
spdk_bdev_io_error_stat for I/O channel.

Dump the contents of spdk_bdev_io_error_stat only if its total is
non-zero.

As a result of these, only spdk_bdev_get_device_stat() can query
spdk_bdev_io_error_stat for the bdev_get_iostat RPC. This will be
acceptable.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Idae868afe65347a96529eedc3dcc692101de4a29
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14826
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2023-01-04 08:22:31 +00:00
Shuhei Matsumoto
99de60a36f bdev: Add SPDK_MIN_BDEV_IO_STATUS to allocate array for error status
We can allocate an array for error status dynamically via negating
SPDK_MIN_BDEV_IO_STATUS.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id36a92bfaa906b445715c03b69a0fd9a154a49e0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15898
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-04 08:22:31 +00:00
Jim Harris
3327bb4391 histogram_data: check bucket_shift when merging
When merging data from one spdk_histogram_data to
another, the merging is only valid if the bucket_shift
for each structure is the same.  Otherwise we are
combining data points that cover different ranges
of values.

So check that the bucket_shifts are the same before
merging. Change the return type to int to
return -EINVAL if structures with different
bucket_shifts are attempted to be merged.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If98e2d03384d85f478965956da2a42cfcff4713d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15813
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-12-21 09:32:40 +00:00
Jim Harris
e39512ec18 nvmf: add completed_nvme_io to nvmf_poll_group_stat
Basic IO completion counting can be done at the common
layer, to enable some level of stat tracking even for
transports that don't have transport-specific tracking
yet.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If04f854b97440089b8ad149b64cb59173c73975c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15912
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
2022-12-16 09:27:50 +00:00
Konrad Sztyber
5a3e64efe4 bdev: replace internal buffer pools with iobuf
The internal mempools were replaced with the newly added iobuf
interface.

To make sure we respect spdk_bdev_opts's (small|large)_buf_pool_size, we
call spdk_iobuf_set_opts() from spdk_bdev_set_opts().  These two options
are now deprecated and users should switch to spdk_iobuf_set_opts().

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib1424dc5446796230d103104e272100fac649b42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15328
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-12-16 09:06:07 +00:00
Konrad Sztyber
36df38c059 thread: cache a number of iobuf buffers on each channel
Users can now specify a number of small/large buffers to be cached on
each iobuf channel.  Previously, we relied on the cache of the
underlying spdk_mempool, which has per-core caches. However, since iobuf
channels are tied to a module and an SPDK thread, each module and each
thread is now guaranteed to have a number of buffers available, so it
won't be starved by other modules/threads.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1e29fe29f78a13de371ab21d3e40bf55fbc9c639
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15634
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2022-12-16 09:06:07 +00:00
Konrad Sztyber
3aceb2da6c thread: introduce iobuf buffer pools
The idea behind "iobuf" is to have a single place for allocating data
buffers across different libraries.  That way, each library won't need
to allocate its own mempools, therefore decreasing the memory footprint
of the whole application.

There are two reasons for putting these kind of functions in the thread
library.  Firstly, the code is pretty small, so it doesn't make sense to
create a new library. Secondly, it relies on the IO channel abstraction,
so users will need to pull in the thread library anyway.

It's very much inspired by the way bdev layer handles data buffers (much
of the code was directly copied over).  There are two global mempools,
one for small and one for large buffers, and per-thread queues that hold
requests waiting for a buffer.  The main difference is that we also need
to track which module requested a buffer in order to allow users to
iterate over its pending requests.

The usage is fairly simple:

```
/* Embed spdk_iobuf_channel into an existing IO channel */
struct foo_channel {
	...
	struct spdk_iobuf_channel iobuf;
};

/* Embed spdk_iobuf_entry into objects that will request buffers */
struct foo_object {
	...
	struct spdk_iobuf_entry entry;
};

/* Register the module as iobuf user */
spdk_iobuf_register_module("foo");

/* Initialize iobuf channel in foo_channel's create cb */
spdk_iobuf_channel_init(&foo_channel->iobuf, "foo", 0, 0);

/* Finally, request a buffer... */
buf = spdk_iobuf_get(&foo_channel->iobuf, length,
		     &foo_objet.entry, buf_get_cb);

...

/* ...and release it */
spdk_iobuf_put(&foo_channel->iobuf, buf, length);

```

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ifaa6934c03ed6587ddba972198e606921bd85008
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15326
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-12-16 09:06:07 +00:00
Richael Zhuang
36f8f8da27 bdev: remove bdev parameter
Remove bdev parameter from spdk_bdev_channel_get_histogram since
it's not used.

Change-Id: I89f0b142cc6f80ecf39811976995f738e4cfecdb
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15837
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
2022-12-12 09:42:03 +00:00
Jun Zeng
86431df168 lib/env_dpdk: Add support for vfio-vf-token parameter
The kernel vfio_pci driver module introduced vf_token checking
mechanism since kernel version 5.7, and has been supported by
DPDK. So add support for it to deal with the scenario of VF.

Signed-off-by: Jun Zeng <jun1.zeng@intel.com>
Change-Id: Ie9700fa395327da4e847c6213167284c148a64e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14424
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-12-12 09:41:25 +00:00
Michal Berger
3f912cf0e9 misc: Fix spelling mistakes
Found with misspell-fixer.

Signed-off-by: Michal Berger <michal.berger@intel.com>
Change-Id: If062df0189d92e4fb2da3f055fb981909780dc04
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15207
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-12-09 08:16:18 +00:00
Evgeniy Kochetov
b7bfa50468 blob: Use bdev copy command in CoW flow if supported
Copy-on-write happens when cluster is written for the first time for
thin provisioned volume. Currently it is implemented as two separate
requests to underlying bdev: read of the whole cluster to bounce
buffer and then write of this buffer to the new location on the same
underlying bdev.

This patch improves copy-on-write flow by utilizing copy command of
underlying bdev if it is supported. In this case we have just one
request to bdev and don't need the bounce buffer.

Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I92552e0f18f7a41820d589e7bb1e86160c69183f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14351
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:54 +00:00
Evgeniy Kochetov
9e843fdbd1 blob: Add translate_lba operation
New `translate_lba` operation allows to translate blob lba to lba on
the underlying bdev. It recurses down the whole chain of bs_dev's. The
operation may fail to do the translation when blob lba is not backed
by the real bdev. For example, when we eventually hit zeroes device in
the chain.

This operation is used in the next commit to get source LBA for copy
operation.

Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I89c2d03d1982d66b9137a3a3653a98c361984fab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14528
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:54 +00:00
Shuhei Matsumoto
15040628ec bdev: Add min/max_latency_read/write/unmap_ticks into I/O statistics
Add max/min_read/write/unmap_latency_ticks into the struct
spdk_bdev_io_stat.

When initializing or resetting the instance of the struct
spdk_bdev_io_stat, initialize max to 0 and min to UINT64_MAX.

Then update max if a new value is larger than the current max,
and update min if a new value is smaller than the current min.

For the bdev_get_iostat RPC, it prints max and prints min if min is not
UINT64_MAX or 0 if min is UINT64_MAX.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I1b30b3825c15e37e9f0cf20104b866186de788a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14825
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2022-12-08 12:54:23 +00:00
Shuhei Matsumoto
571638b9b9 bdev: Alloc spdk_bdev_io_stat dynamically for spdk_bdev
The following patches will extend I/O statistics to include error
counters and module specific counters to output these via the
bdev_get_iostat RPC.

In this case, the size of the struct spdk_bdev_iostat will be variable.

As a preparation, allocate spdk_bdev_io_stat dynamically.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I1979a9d867859d5cb5d05717bfcc677f07fa03f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15479
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
2022-12-08 12:54:23 +00:00
Mike Gerdts
5b50d3e8b7 log: add deprecated tracking API
When use of deprecated featues is encountered, SPDK now calls
SPDK_LOG_DEPRECATED(). This logs the use of deprecated functionality in
a consistent way, making it easy to add further instrumentation to catch
code paths that trigger deprecated behavior.

Change-Id: Idfd33ade171307e5e8235a7aa0d969dc5d93e33d
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15689
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
2022-12-07 17:45:53 +00:00
Mike Gerdts
0dc6aac101 bdev: use SPDK spinlocks
Transition from pthread spinlocks to SPDK spinlocks for improved error
checking.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I7877c3a4601d7d5cf03e632df493974f97782272
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15439
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-12-06 21:20:17 +00:00
Mike Gerdts
cd2bcf1061 thread: SPDK spinlocks
This introduces an enhanced spinlock that adds safeguards compared to
the default pthread_spinlock_t. In particular:

- A pthread_spinlock_t is still used, but additional error checking is
  performed to ensure there is no undefined behavior on relock,
  unlocking when not the owner, or destoying a locked lock.
- The SPDK concurrency model allows an SPDK thread to be migrated
  between pthreads. Releasing a pthread spinlock on a different thread
  from where it is taken is undefined behavior. If an SPDK spinlock is
  held at a time that a time when a poller or message returns control to
  thread_poll(), the program will abort.
- SPDK spinlocks can only be obtained from an SPDK thread.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I6dd6493ab5f5532ae69e20654546405a507eb594
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15277
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
2022-12-06 21:20:17 +00:00
Konrad Sztyber
9e647c1f46 bdev: disallow get_buf() calls from other threads
This is unsafe, because we touch need_buf_* queues, which aren't
thread-safe.  Also, documented this requirement in
spdk_bdev_io_get_buf()'s description.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iabc141e051c543fdd51f079ae212f69e980d8148
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15668
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-05 09:51:26 +00:00
Changpeng Liu
296bca08c1 lib/vfu_target: add missed comments to vfu_target.h
Change-Id: Ie43b53e8b84b14ec92622a32052272d310e7145f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15029
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-12-02 10:44:51 +00:00
Jim Harris
bb926e803d nvmf: make poll groups count unassociated qpairs
We make decisions on how to pick a poll group for a new
qpair by looking at each poll group's current_io_qpairs
count.  But this count isn't always accurate since it
doesn't get updated until after the CONNECT has
been received.

This means that if we accept a bunch of connections
all at once, they may all get assigned the same poll
group, because the target poll groups counter doesn't
get immediately incremented.

So add a new counter, current_unassociated_qpairs,
to account for these qpairs.  We protect this counter
with a lock, since the accept thread will increment
the counter, and the poll group thread will
decrement it when the qpair receives the CONNECT
allowing us to associated with a subsystem/controller..

If the qpair gets destroyed before the CONNECT is
received, we can use the qpair->connect_received
flag to decrement current_unassociated_qpairs.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8bba8da2abfe225b3b9f981cd71b6f49e2b87391
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15693
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-12-01 10:57:29 +00:00
Jim Harris
f3e197ff18 nvmf: add qpair->connect_received
Currently we use qpair->ctrlr at qpair destroy
time to decide if we need to decrement the
qpair's poll group's qpair count.  But this is
not correct - these counters get incremented
when the CONNECT is received, but qpair->ctrlr
doesn't get set until later.

So add a new connect_received bool to the spdk_nvmf_qpair.
Use this instead to determine when we should decrement
the poll group qpair counters.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I174a0fda36c4558171953bf58f2f5117bc074f76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15692
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-12-01 10:57:29 +00:00
Evgeniy Kochetov
8305e49b07 nvmf: Add copy command support
NVMf target reports copy command support if all bdevs in the subsystem
support copy IO type. Maximum copy size is reported for each namespace
independently in namespace identify data. For now we support just one
source range.

Note, that command support in the controller is initialized once on
controller create. If another namespace which doesn't support copy
command is added to the subsystem later, it will not be reflected in
the controller data structure and will not be communicated to the
initiator. Attempt to execute copy command on such namespace will
fail. This issue is not specific to copy command and applies also to
write zeroes and unmap (dataset management) commands.

Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I5f06564eb43d66d2852bf7eeda8b17830c53c9bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14350
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-11-30 08:50:06 +00:00
Richael Zhuang
f192c11bbf bdev: support to get histogram per channel
Added new API 'spdk_bdev_histogram_get_channel' to get histogram of
a specified channel for a bdev. A callback function is passed to it
to process the histogram.

Change-Id: If5d56cbb5fe6c39cda7882f887dcc9c6afa769ac
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15539
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-11-29 08:28:57 +00:00
Mike Gerdts
8dbaca1300 bdev: use spinlock instead of mutex
SPDK threads generally run on dedicated cores and locks should be rarely
contended. Thus, putting a thread to sleep while waiting on a mutex does
not free up CPU cycles for other pthreads or processes. Even when
running in interrupt mode, lock contention should be low enough that
spinlocks are a net win by avoiding context switches.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I6e2e78b2835bbadb56bbec34918d998d75280dfd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-11-24 10:08:17 +00:00
Jim Harris
8203e68e24 thread: add spdk_thread_is_running()
This function can be useful to query if a thread
had spdk_thread_exit() called on it yet.  Internally
we have both EXITING and EXITED state - so
!spdk_thread_is_running() can be used to detect a
thread that is in either of those states.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2f6fb024a6b1bc895fdc5132c722abc10f5d30f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15512
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-11-23 08:22:04 +00:00
Jim Harris
090b8af12b thread: add spdk_thread_get_app_thread
The "app thread" will always be the first
thread created using spdk_thread_create().  There
are many operations throughout SPDK that implicitly
expect to happen in the context of this app thread,
so by formalizing it we can start to make assertions
on this to help clarify and simplify locking and
synchronization through the code base.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7133b58c311710f1d132ee5f09500ffeb4168b15
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15497
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-11-23 08:22:04 +00:00
Changpeng Liu
b45556e2b2 include/bdev_module.h: add SPDK_ prefix to macros
`BDEV_IO_NUM_CHILD_IOV` and `BDEV_RESET_IO_DRAIN_RECOMMENDED_VALUE`
are public macro definitions without `SPDK_` prefix, so we add the
`SPDK_` prefix to them.

Change-Id: I4be86459f0b6ba3a4636a2c8130b2f12757ea2da
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15425
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-11-22 10:03:57 +00:00
Thanos Makatos
70f185ea51 json: add spdk_json_write_named_double
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Suggested-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2439cd739240fb2d95c5cdaccc557ba9a8f6501b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15490
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-11-22 10:01:43 +00:00
Thanos Makatos
7f23638550 util: add function spdk_fd_group_get_epoll_event
This patch introduces function spdk_fd_group_get_epoll_event, which
returns the epoll(7) event that caused the file descriptor group
callback function to execute. Rather than changing the signature of
spdk_fd_fn in order to pass the struct epoll_event, which would result
in a gigantic patch where there vast majority of users would simply have
to ignore the new argument, we introduce this new API that allows to
return the epoll_event only when really needed.

Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Suggested-by: John Levon <john.levon@nutanix.com>
Change-Id: I3debe1382d1c2bfec6ae4fea274ee38ed0b135fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14935
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
2022-11-22 10:01:43 +00:00
Konrad Sztyber
86ba16c39c build: compile API functions with missing deps
We should always build all function that are part of the API, even if
some of the libraries they depend on are missing.  In that case, they
can return an error instead.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I72b450b3a1d62e222bd843e45be547d926414775
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15414
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-11-18 08:40:05 +00:00
Krzysztof Karas
b5bdbbb959 bdev: enable bdevs based on physical device to generate UUID
Add an option "--generate-uuids" to bdev_nvme_set_options
RPC to enable generation of UUIDs for NVMes devices that
do not provide this value themselves. The identifier is
based on a serial number of the device, so a bdev
using this NVMe will always be assigned the same UUID.

Part of enhancement from #2516.

Change-Id: I86d76274e5702d14ace89d83d1e9129573f543e2
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15151
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
2022-11-18 08:38:13 +00:00
Michael Piszczek
1473d3b8c2 env_dpdk: fix check for AMD iommu
Update code for read the virtual address width to use glob to locate the
Intel and AMD iommu capability registers. This code should work for all
AMD numa configurations.

Fixes issue 2730

Signed-off-by: Michael Piszczek <mpiszczek@ddn.com>
Change-Id: Ibf5789087b7e372d892b53101e4c0231809053f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14961
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
2022-11-15 08:31:13 +00:00
John Levon
0d0de8e7d9 lib/rpc: add RPC allow list
Add an optional allowlist for RPC methods: if the method is not listed,
it is not allowed to be called or visible. This can be used to restrict
accidental mis-configurations, and generally helps locking down the
configuration surface.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ied78fc4b14b60cb94ed0852b92deb6df545cbec4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15275
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-11-15 08:31:02 +00:00
John Levon
1139cb1415 lib/util: add strarray utility functions
Add some basic utilities for handling arrays of strings.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I2333f3e4605175b1717a7f289847ff2d48745e8d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15274
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-11-15 08:31:02 +00:00
paul luse
a6dbe3721e update Intel copyright notices
per Intel policy to include file commit date using git cmd
below.  The policy does not apply to non-Intel (C) notices.

git log --follow -C90% --format=%ad --date default <file> | tail -1

and then pull just the 4 digit year from the result.

Intel copyrights were not added to files where Intel either had
no contribution ot the contribution lacked substance (ie license
header updates, formatting changes, etc).  Contribution date used
"--follow -C95%" to get the most accurate date.

Note that several files in this patch didn't end the license/(c)
block with a blank comment line so these were added as the vast
majority of files do have this last blank line.  Simply there for
consistency.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id5b7ce4f658fe87132f14139ead58d6e285c04d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
2022-11-10 08:28:53 +00:00
Richael Zhuang
cabbb25d5d bdev: add API to get submit tsc of a bdev I/O
Add API spdk_bdev_io_get_submit_tsc to get submit tsc of a bdev I/O,
which can be used in bdev modules to avoid calling expensive
spdk_get_ticks().

Change-Id: Ifbcecb1bc663344997c5e73b72a1dfb5d0422946
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14989
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-11-04 10:15:46 +00:00
Thanos Makatos
e51dde79b8 thread: use existing EPOLL defines
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: Ic8ed31c7c2a4b915ec969a0757fca73ebbfb4488
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14934
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-11-04 10:10:33 +00:00
Thanos Makatos
bad452d25e nvmf/vfio-user: calculate doorbells based on number of queue pairs
It doesn't make sense to have the size of the doorbells fixed and then
calculate the maximum number of queue pairs based on it, do it the other
way round. Also, add some sanity checks based on the spec.

Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I17e3509fb0a011128ca089ce78b7a296262e6f8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-11-04 10:10:33 +00:00
Shuhei Matsumoto
d683d7b792 bdev/part: Modify spdk_bdev_part_submit_request() to use custom completion callback
In the following patches, we will add a feature to inject data
corruption to the error bdev module. For read I/O, we will have
to inject data corruption at completion. However, if we use
spdk_bdev_part_submit_request(), it will not be possible because we
cannot add any custom operation into the completion callback.
To fix the issue, modify spdk_+bdev_part_submit_request() and
rename it to spdk_bdev_part_submit_request_ext().
Fortunately, we can use stored_user_cb in struct spdk_bdev_io.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I46d3c40ea88a3fedd8a8fef6b68ee417c814a7a1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15002
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-11-03 14:54:28 +00:00
Changpeng Liu
e6d258a64b include/vhost.h: fix comments
Change-Id: I0c4f52d8c30a325c9f4aa0626812bf9040292984
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15039
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-11-03 14:53:55 +00:00
Evgeniy Kochetov
8c3590a983 bdev: Add copy IO statistics
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Id51ac80bce33a27a8ccea273c076f39019b98339
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14348
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
2022-11-02 10:33:00 +00:00
Evgeniy Kochetov
d14afd5000 bdev: Add copy IO type
Copy operation is defined by source and destination LBAs and LBA count
to copy. For destiantion LBA and LBA count we reuse exiting fields
`offset_blocks` and `num_blocks` in `struct spdk_bdev_io`. For source
LBA new field `src_offset_blocks` was added.

`spdk_bdev_get_max_copy()` function can be used to retrieve maximum
possible unsplit copy size. Zero values means unlimited. It is allowed
to submit larger copy size but it will be split into several bdev IOs.

Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I2ad56294b6c062595c026ffcf9b435f0100d3d7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14344
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
2022-11-02 10:33:00 +00:00