Many parts of the blobstore.c seem to have gone with the assumption that
blob creation, deletion, etc. all happen on the md thread. This
assumption would allow modification of the bs->used_md_pages and
bs->used_clusters bit arrays without holding a lock. Placing
"assert(spdk_get_thread() == bs->md_thread)" in bs_claim_md_page() and
bs_claim_cluster() show that each of these functions are called on other
threads due writes to thin provisioned volumes.
This problem was first seen in the wild with this failed assertion:
bs_claim_md_page: Assertion
`spdk_bit_array_get(bs->used_md_pages, page) == false' failed.
This commit adds "assert(spdk_spin_held(&bs->used_lock))" in those
places where bs->used_md_pages and bs->used_lock are modified, then
holds bs->used_lock in the places needed to satisfy these assertions.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I0523dd343ec490d994352932b2a73379a80e36f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15953
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
A future commit will add to the complexity when returning with a
non-zero value. Rather than further complicating the several error
return locations, all affected error returns are handled after the error
label.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I56e8e338b0560f849399c085d0bb07efb7df26fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15983
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
A future commit may need to release a lock before returning. This
refactors blob_resize() to always return at end of the function using an
out label and goto.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I671fbdbe0e3b766c264c45589dad3a864ba1f192
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15982
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
Convert bs->used_lock to a spinlock. This is being done to help with
the debugging and fixing of a race that has led to a failed assertion in
bs_claim_md_page.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I11b80096de022f79a217c65d787ee57ca54240f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15952
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
The bs->used_clusters_mutex protects used_md_pages, used_clusters, and
num_free_clusters. A more generic name is appropraite. The next patch in
this series will convert it from a mutex to a spinlock and having
"mutex" or "spin" in the name is of little help to maintainers, so a
more generic name is used.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I5ce7b85b84fdec2a0c5d2ac959e0109e1d80c7f5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15981
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
Basic IO completion counting can be done at the common
layer, to enable some level of stat tracking even for
transports that don't have transport-specific tracking
yet.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If04f854b97440089b8ad149b64cb59173c73975c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15912
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
For validation of upcoming DPDK releases, pci_dpdk needs
to initialize and work.
This patch adds support for testing DPDK main branch,
with appropriate notice given when that DPDK version is used.
Change-Id: I5257beac3a3926bd432d9c00e50858facd21e6f5
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15891
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Since SPDK holds copies of local DPDK headers for DPDK PCI API,
the same headers will now be used as includes.
It was already the case for DPDK 22.11, but not for DPDK 22.07.
Change-Id: I5859a630d1fb20b4ebf8628adb962f5e46c23788
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15969
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Shortly after DPDK 22.11 release it was amended with single
patch, which bumped the minor version.
No changes have occurred to the DPDK PCI API.
Change-Id: I94dadb23b3ad79cfbb21e848d718d909493137d1
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15890
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
As spdk_lvs_init() validates arguments, it uses o->cluster_sz in a
comparison but misleadingly prints opts.cluster_sz in the error message.
This changes the error message to print cluster_sz from the proper
structure.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I810bf9ad4a24ed7cc844c2835e0edda988cb2cbe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15970
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
The internal mempools were replaced with the newly added iobuf
interface.
To make sure we respect spdk_bdev_opts's (small|large)_buf_pool_size, we
call spdk_iobuf_set_opts() from spdk_bdev_set_opts(). These two options
are now deprecated and users should switch to spdk_iobuf_set_opts().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib1424dc5446796230d103104e272100fac649b42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15328
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This is done in a couple of places, so it makes sense to extract it to a
separate function.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Id34b2545d9912c2b7b65b1277711e9683db92658
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Users can now specify a number of small/large buffers to be cached on
each iobuf channel. Previously, we relied on the cache of the
underlying spdk_mempool, which has per-core caches. However, since iobuf
channels are tied to a module and an SPDK thread, each module and each
thread is now guaranteed to have a number of buffers available, so it
won't be starved by other modules/threads.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1e29fe29f78a13de371ab21d3e40bf55fbc9c639
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15634
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The idea behind "iobuf" is to have a single place for allocating data
buffers across different libraries. That way, each library won't need
to allocate its own mempools, therefore decreasing the memory footprint
of the whole application.
There are two reasons for putting these kind of functions in the thread
library. Firstly, the code is pretty small, so it doesn't make sense to
create a new library. Secondly, it relies on the IO channel abstraction,
so users will need to pull in the thread library anyway.
It's very much inspired by the way bdev layer handles data buffers (much
of the code was directly copied over). There are two global mempools,
one for small and one for large buffers, and per-thread queues that hold
requests waiting for a buffer. The main difference is that we also need
to track which module requested a buffer in order to allow users to
iterate over its pending requests.
The usage is fairly simple:
```
/* Embed spdk_iobuf_channel into an existing IO channel */
struct foo_channel {
...
struct spdk_iobuf_channel iobuf;
};
/* Embed spdk_iobuf_entry into objects that will request buffers */
struct foo_object {
...
struct spdk_iobuf_entry entry;
};
/* Register the module as iobuf user */
spdk_iobuf_register_module("foo");
/* Initialize iobuf channel in foo_channel's create cb */
spdk_iobuf_channel_init(&foo_channel->iobuf, "foo", 0, 0);
/* Finally, request a buffer... */
buf = spdk_iobuf_get(&foo_channel->iobuf, length,
&foo_objet.entry, buf_get_cb);
...
/* ...and release it */
spdk_iobuf_put(&foo_channel->iobuf, buf, length);
```
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ifaa6934c03ed6587ddba972198e606921bd85008
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15326
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Recently while updating the copyright notices
throughout SPDK the headers, env_dpdk copies of
DPDK headers were modified too.
This patch brings them to the exact version as
in DPDK upstream.
Change-Id: If30b8556386a539d81d2fc1a5e42293522ed91f5
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15856
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Copies of headers for DPDK PCI API were created before the
actual DPDK 22.11 release. The rte_bus_pci.h was
modified slightly with addition of rte_compat.h
include.
Please see relevant DPDK patch:
(1094dd9)cleanup compat header inclusions
This patch only makes the two align.
Change-Id: Ieb0397c6cf2d9027cf600bd0e064863b3782b846
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15855
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch adds "--msg-mempool-size" option for spdk app to allow
reactors' msg_mempool_size being configurable via commond line.
We tested the rbd_bdev performance for Ceph CTX sharing with high RBD volume count via bdevperf.
When testing with 256 volumes and limited Ceph CTX (e.g., 2 Ceph ctx for 256 volumes,
which are created though bdev_rbd_register_cluster),
error message "the *ERROR*: msg could not be allocated error message"
keeps showing and the bdev_perf program hangs.
We found the issue from the limited msg_mempool_size size, which is hardcoded
by SPDK_DEFAULT_MSG_MEMPOOL_SIZE in thread.h.
Therefore, we enable the "--msg-mempool-size" option to allow configurable msg_mempool_size.
Signed-off-by: Yue-Zhu <yue.zhu@ibm.com>
Change-Id: I54db7fd46247b2f18112bb994ecce6f4b7e5bf9c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15552
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The description was swapped with removal release, causing the logs to
look like this:
foo_bar: deprecation 'v23.05' scheduled for removal in foo.bar hit 1 times
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I422a35c5ec20c8a817bed0dd5d565dfc53ef6dc9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
The kernel vfio_pci driver module introduced vf_token checking
mechanism since kernel version 5.7, and has been supported by
DPDK. So add support for it to deal with the scenario of VF.
Signed-off-by: Jun Zeng <jun1.zeng@intel.com>
Change-Id: Ie9700fa395327da4e847c6213167284c148a64e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14424
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In _free_ctrlr(), ->endpoint can never be NULL, and the code was
self-contradictory; assume it's not NULL.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I81a449123ca05f64460380dc3a8ad8af2143d166
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15831
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Use standard "sqid" naming for a log message.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Icca8415cd17272ca7bd82667721c4131dd1df7f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15828
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This fixes the behavior of spdk_vhost_(set|get)_coalescing() on
non-vhost-user devices.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia17cd4c0ed4bad262090e05f83727c1516c21f92
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15772
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The current code for setting/getting coalescing setting only works with
vhost-user devices, while users can create virtio-blk devices with
non-vhost-user transport. Calling spdk_vhost_(set|get)coalescing() on
such device results in a segfault.
So, spdk_vhost_dev_backend interface is extended with methods to
set / get coalescing parameters. In the following patch, the virtio_blk
interface will be also extended with similar callbacks allowing us to
pipe coalescing settings to the appropriate transport.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ide5d5f633b17dcdbedb4b7804d5e45bf41373eca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15771
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Both the length of a request and the number of ranges to copy are
controlled by the user, so we should check them and return an error
instead of asserting that they're correct.
This fixes the `test/nvmf/target/fabrics_fuzz.sh` test.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I3481c4bb1f2c7676df81f41dfc95ef063924222e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15805
Reviewed-by: Pawel Piatek <pawelx.piatek@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Found with misspell-fixer.
Signed-off-by: Michal Berger <michal.berger@intel.com>
Change-Id: If062df0189d92e4fb2da3f055fb981909780dc04
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15207
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use the deprecation API to annotate and log the deprecation of
spdk_nvme_ctrlr_prepare_for_reset() using the tag
"nvme_ctrlr_prepare_for_reset".
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I98fd840aa9acc028a49bb47daf4ab7e88f1eb818
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15756
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
srandomdev is *only* used to emulate arc4random, so only
bother defining it on Linux when it's needed. This avoids
unused errors on newer distros packaging glibc versions
that now defined arc4random.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6e64a697d9633709cedd0198f75cf094d514562d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15814
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This patch fixes issue # 2809, by changing the max
completions processed per poll. A new parameter called
IDXD_MAX_COMPLETIONS is used to set maximum completions
processed per poll to 128 because we observed performance
degradation on a system with 16 NVMe SSDs at a queue depth
of 64 per SSD. When using DSA to compute the data digest,
the target application can issue upto 1024(16x64) request
to compute data digest concurrently to DSA. Limiting the
maximum completions processed per poll to 32 using
DESC_PER_BATCH cause up to 43% IOPS degradation.
Use IDXD_MAX_COMPLETIONS to control the number of
completions proccessed per poll in spdk_idxd_process_event
based on your workload. For example, if your application
is issuing 1000s of concurrent request to DSA you might
want to set IDXD_MAX_COMPLETIONS to a value higher than
128.
Change-Id: I2a1db993283a83a20266f40dac851728d63e6127
Signed-off-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15801
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is in prep for adding a new compressDev accel_fw
module that will contain all of the DPDK compressDev specifics
on it, the vbdev will make calls to the accel_fw instead.
As the accel_fw has SW based compression, we want the configure
option to apply to building the vbdev module but not the accel_sw
software implementation or the upcoming compressdev module.
Renamed to "compress" as reduce is a term specific to the vbdev
implementation of the compression to be provided by the accel_fw
and thus the same reason why we leave the test flag called REDUCE
because it's controlling tests for the reduce library as well as
the vbdev module that is using reduce. The flag does not apply
to the SW implementation of compression.
This does not affect upcoming accel_fw compressdev module, that
will have its own configure option.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: If8ed3e48e1e3dabcaad1cd161289e78122cd9d58
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15179
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Per specification.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic93349c7d3ed50fa6e502e39db0347141804d4c4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15673
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
While lvs->destruct is set in a few places, it is never read. Since it
is not used, it is removed.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Iee21e92c9049d143fca13930b4b5f328f9ec38f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15716
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Copy-on-write happens when cluster is written for the first time for
thin provisioned volume. Currently it is implemented as two separate
requests to underlying bdev: read of the whole cluster to bounce
buffer and then write of this buffer to the new location on the same
underlying bdev.
This patch improves copy-on-write flow by utilizing copy command of
underlying bdev if it is supported. In this case we have just one
request to bdev and don't need the bounce buffer.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I92552e0f18f7a41820d589e7bb1e86160c69183f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14351
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
New `translate_lba` operation allows to translate blob lba to lba on
the underlying bdev. It recurses down the whole chain of bs_dev's. The
operation may fail to do the translation when blob lba is not backed
by the real bdev. For example, when we eventually hit zeroes device in
the chain.
This operation is used in the next commit to get source LBA for copy
operation.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I89c2d03d1982d66b9137a3a3653a98c361984fab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14528
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
In the following patches, nvme_rdma_poll_group_set_cq() will
touch not only CQ but also SRQ and receive WR objects.
All these resources are of a poller.
Hence for clarification, rename nvme_rdma_poll_group_set_cq()
by nvme_rdma_qpair_set_poller().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic59ba5a45833e39b1b2647c000c8b953f1031d6b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14910
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Factor out reset failed recvs operation into a helper function
nvme_rdma_reset_failed_recvs(). This will make the following
patches simpler.
For send operation, this change is not required yet, but in future
we may support something like shared SQ. Hence, we do this change
for send operation too.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ib44acebe63e97e5a60ea6fa701b49278c7f44b45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14171
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
In the following patches, poll group will have rsps objects and to share
the code between poll group and qpair, option for creation will be used.
As a preparation, merge nvme_rdma_alloc_rsps() and
nvme_rdma_register_rsps() into nvme_rdma_create_rsps(). For consistency,
merge nvme_rdma_alloc_reqs() and nvme_rdma_register_reqs() into
nvme_rdma_create_reqs().
Update unit tests accordingly.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I92ec9e642043da601b38b890089eaa96c3ad870a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14170
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
When SRQ is supported, recv objects will be allocated by poll group
and qpair will associated and use them. In this case, we do not want
qpair to allocate and free recv objects. When connection is established,
it will be decided if SRQ is used or not. Hence, defer recv objects
allocation until connection is established.
Send objects are not affected directly by SRQ, but
nvme_rdma_register_reqs() no longer does any registration and deferring
send objects allocation makes the code more consistent. Hence, defer
send objects allocation until connection is established too.
Even after this patch, we rely on nvme_rdma_ctrlr_delete_io_qpair()
to free resources completely.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ic151fad01009d92a7fc809a730e6e9dff1a365f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14169
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Response objects will be in poll group when SRQ is enabled. But we want
to share the code to allocate and register response objects between SRQ
is enabled or disabled. To do it cleanly, move
nvme_rdma_qpair_submit_recvs() from nvme_rdma_register_rsps() to
nvme_rdma_connect_established(). A few clean up of error handling are
done in this patch. Unregistration will be done when qpair is
disconnected.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I38dc5a6cb84a6bf56c01d5fb7f2cf3d3b63918e0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14168
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This will make the following patches simpler.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id3d7c025525b35c1c2b96027430789a8d8f2697b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14422
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Inline nvme_rdma_post_recv() into the callers.
We do not have any similar helper function for posting send WR.
This will make the following patches simpler and will be reasonable.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ia95a4b350942d20bdb65e84f7575c2dcf67c149b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14421
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Extract and inline the conditional nvme_rdma_qpair_submit_sends()
and nvme_rdma_qpair_submit_recvs() calls.
This will cralify the logic and make the following patches simpler.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibe217c6f4fb2880af1add8c0429f92b4de107da8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14420
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
When SRQ is supported, rsp array will be in either qpair or poller.
To make this difference transparent, rdma_req caches rdma_rsp and
rdma_rsp caches recv_wr directly instead of caching indecies.
Additionally, do a very small clean up together.
spdk_rdma_get_translation() gets a translation for a single entry
of a rsps array. It is more intuitive to use rsp.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I61c9d6981227dc69d3e306cf51e08ea1318fac4b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13602
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>