Commit Graph

1748 Commits

Author SHA1 Message Date
Damiano Cipriani
c34d4d490d lvol/blob: add shallow copy over a given device
A shallow copy copies to the destination device only the clusters
allocated to the blob/lvol, discarding those that belong to the
blob/lvol's parent snapshot. The blob/lvol must be read-only.

Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-06-06 15:58:36 +02:00
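The shallow copy described in commit c34d4d490d can be illustrated with a minimal sketch. It assumes a toy blob whose clusters carry an "allocated" flag; `demo_blob`, `demo_shallow_copy`, and the cluster sizes are hypothetical names for illustration and are not the blobstore API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_CLUSTERS     4
#define DEMO_CLUSTER_SIZE 8

/* Hypothetical blob: a cluster is either allocated to this blob or
 * inherited from the parent snapshot (allocated[i] == false). */
struct demo_blob {
	bool allocated[DEMO_CLUSTERS];
	unsigned char data[DEMO_CLUSTERS][DEMO_CLUSTER_SIZE];
	bool read_only;
};

/* Copy only the clusters owned by the blob itself to the destination.
 * Clusters backed by the parent snapshot are skipped, so the matching
 * destination regions are left untouched. Returns the number copied. */
static size_t
demo_shallow_copy(const struct demo_blob *blob,
		  unsigned char dst[DEMO_CLUSTERS][DEMO_CLUSTER_SIZE])
{
	size_t i, copied = 0;

	/* The commit message requires the source to be read-only. */
	assert(blob->read_only);

	for (i = 0; i < DEMO_CLUSTERS; i++) {
		if (blob->allocated[i]) {
			memcpy(dst[i], blob->data[i], DEMO_CLUSTER_SIZE);
			copied++;
		}
	}
	return copied;
}
```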
Damiano Cipriani
980f535d38 module/raid: fix rebase with master
This fixes a rebase of the branch created from
Gerrit patch 16167.

Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-05-10 19:59:21 +08:00
Damiano Cipriani
6322b4ce40 module/raid: increase raid name size in superblock
The size of the raid name inside the raid superblock structure is
increased to 64 chars to allow creating raid bdevs with longer names.

Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-05-09 17:58:11 +08:00
Damiano Cipriani
01e9afd39e raid1: allow creation with a single base bdev
The minimum number of raid1 base bdevs is set to 1, so a raid1 bdev
can be created with only one base bdev.

Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
e325fbafec module/raid: allow assembly of a degraded raid
Add num_base_bdevs_operational to raid_bdev and use it to determine the
required number of base bdevs.

Change-Id: I31b39cc8ea708b6cdce748f015949e4c9fdeb3cd
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
ea4b2f6d75 module/raid: update superblock on base bdev removal
Change-Id: I713053a4928139fdf8aa43ebf47a743bec3d5054
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
432d1a99cc module/raid: show base bdev details in json
Change-Id: I0da3e91e7736bc651e284f68238ace864def87b2
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
e61c1c51be module/raid: check for existing superblock on a base_bdev
When adding a new base bdev to a raid bdev (currently only when creating
a new raid bdev) make sure that there is no existing superblock
stored on the base bdev. This prevents accidentally overwriting a base
bdev belonging to a different raid array.

Change-Id: Id5f6c7e3ed7223f6a8fc7455f75831fbbcac7e43
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
957acd43b9 module/raid: assemble raid bdev from superblock
Change the bdev_raid examine procedure to read the superblock from the
examined base bdev. If a valid superblock is found, re-create the
raid_bdev from it.

Change-Id: I4bd589647a207a216ecf0dec9baf11c5d691f5d5
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
668c5a769e module/raid: write initial superblock
When creating the raid_bdev with enabled superblock option, write the
superblock to the base bdevs before bringing the array online.

Change-Id: I24659202ef3bbe6c87ca8603d514bd81660c9b41
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
69946f9eac module/raid: use raid_bdev_free() for cleanup in raid_bdev_create()
Change-Id: I73e283d4f5d98f2bb95d0dbab7e3abbb007134e4
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
ff9d703946 module/passthru: add uuid option for creating passthru bdev
Change-Id: I1a298161018553feea00248568f2ea786a08ff64
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
882ecb55a8 util/uuid: add API to test/set null uuid
Refactor the code to use these new functions.

Change-Id: I21ee7e9a96f30fbd60106add5e8b071e86bf93c9
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Krzysztof Smolinski
4d783e7255 raid1: read balancing
Reads for raid1 bdevs are balanced. The algorithm tries to distribute
load evenly by sending read I/O to all base bdevs in round-robin order,
but skipping the base bdev that has processed the most data so far.

Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Change-Id: I7d85411a6421bd7352031efb562ee95f2c612011
2023-05-09 17:58:11 +08:00
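The read-balancing policy described in commit 4d783e7255 can be sketched roughly as follows. This is an illustration under assumptions, not the raid1 module's code: `rb_state`, `rb_pick`, the fixed array size, and the tie-breaking rule (lowest index counts as busiest) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RB_MAX_BASE 8

/* Hypothetical per-array state; names are illustrative only. */
struct rb_state {
	size_t   num_base;                 /* number of base bdevs */
	size_t   next;                     /* round-robin cursor */
	uint64_t bytes_read[RB_MAX_BASE];  /* data processed per base bdev */
};

/* Pick the base bdev that should serve the next read of `len` bytes:
 * plain round-robin, except that the base bdev which has processed the
 * most data so far is skipped when the cursor lands on it. */
static size_t
rb_pick(struct rb_state *s, uint64_t len)
{
	size_t busiest = 0, i, pick;

	assert(s->num_base > 0 && s->num_base <= RB_MAX_BASE);

	for (i = 1; i < s->num_base; i++) {
		if (s->bytes_read[i] > s->bytes_read[busiest]) {
			busiest = i;
		}
	}

	pick = s->next % s->num_base;
	if (s->num_base > 1 && pick == busiest) {
		pick = (pick + 1) % s->num_base;
	}

	s->next = (pick + 1) % s->num_base;
	s->bytes_read[pick] += len;
	return pick;
}
```

With a single base bdev there is nothing to skip, so the sketch always returns index 0 in that case.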
Artur Paszkiewicz
951d766289 raid5f: degraded io support
Change-Id: I1af2ffc3fe1f41b798e15b5194ab5695923737ef
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Krzysztof Smolinski
6680fdf818 module/raid: bdev_raid_remove_base_bdev rpc
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Change-Id: I4829f6cd0c10bfcd2c6893cf9412fc974c4b338c
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
cdf0959bc4 raid1: degraded mode support
Properly handle IO when one or more base bdevs are missing.

Change-Id: I51161b01a625c20da5156d7db1c5e5d9b62ce298
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
65f2af1dcb module/raid: continue operation after base bdev removal
Don't stop the raid bdev if the minimum number of base bdevs are
available.

When removing a base bdev, first suspend the raid bdev and then perform
the actual removal/cleanup. Finally, resume the raid bdev.

Change-Id: Ie010d3760c32b0dad455a5a2a0ab7adcc602edf9
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
f71c255827 module/raid: add raid_bdev ptr to raid_base_bdev_info
This allows simplifying some code where both raid_bdev and base_info
are needed.

Change-Id: I40395204fdcdd0487bdecec1cd47efb347f1310a
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
79eccac059 module/raid: suspend/resume IO
Add functions to suspend and resume IO on all channels. This will be
used to safely change the device state in case of e.g. removing a base
bdev.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Change-Id: I203c1899bde15101e0c2bc8da7a1066a2fee6dd2
2023-05-09 17:58:11 +08:00
Krzysztof Smolinski
8c591e2d4f module/raid: data offset and data size implementation
When a raid bdev is created with the superblock parameter, all data on
each base bdev should be shifted by some offset. The space at the
beginning of the bdev will be used to store on-disk raid metadata.

Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Change-Id: I2545a2b00a651ef5332ca1757da0110a63914a43
2023-05-09 17:58:11 +08:00
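The data offset and data size idea from commit 8c591e2d4f amounts to simple address arithmetic. The sketch below assumes block units and a reserved region at the start of each base bdev; `base_layout` and both helper names are hypothetical and not taken from the raid module.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of one base bdev, in blocks. */
struct base_layout {
	uint64_t total_blocks;   /* size of the base bdev */
	uint64_t data_offset;    /* blocks reserved at the start for metadata */
};

/* Usable data size: everything after the reserved metadata region. */
static uint64_t
base_data_size(const struct base_layout *l)
{
	assert(l->data_offset <= l->total_blocks);
	return l->total_blocks - l->data_offset;
}

/* Translate a block number within the data region into a block number
 * on the base bdev by shifting it past the metadata region. */
static uint64_t
base_physical_block(const struct base_layout *l, uint64_t data_block)
{
	assert(data_block < base_data_size(l));
	return l->data_offset + data_block;
}
```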
Krzysztof Smolinski
19c79b0d01 rpc/bdev_raid_create: added superblock parameter
Introduction of superblock parameter for bdev_raid_create rpc. This
parameter determines whether raid bdev should be created with support
for on-disk metadata (support for raid on-disk metadata is going to be
implemented in the future).

Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Change-Id: Ie8c64f837dd7eb3ba788b7c5d7bc98e8f1368ba7
2023-05-09 17:58:11 +08:00
Artur Paszkiewicz
ba752093db raid5f: convert to use accel framework for xor
Change-Id: Id8fb521549342564bcf4288d74337fb4dd41fa03
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
2023-05-09 17:58:11 +08:00
Amir Haroush
1103ce1d71 bdev/ocf: fix possible memory leak in ctx_data_alloc
Signed-off-by: Amir Haroush <amir.haroush@huawei.com>
Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com>
Change-Id: I8b33e62bd6e0f297e6fc325942c501100855fd6c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17939
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
2023-05-09 17:58:11 +08:00
Shuhei Matsumoto
ae8eebd680 bdev/nvme: Change if->else to if->return for failover_trid()
This refactoring will reduce the size of the next patch significantly.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I2eb7ec62e6c559d9e69334e73de49e8bf97a35dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17652
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-05-09 17:58:11 +08:00
Shuhei Matsumoto
610265c9fa bdev/nvme: Reset I/O disables retry when destroying I/O qpairs
As the RBD bdev module does, the upper layer wants the reset command
to abort or complete all I/Os submitted before the reset command.

To satisfy this requirement, return all aborted I/Os by deleting I/O
qpairs to the upper layer without retry. To return all aborted I/Os
by deleting I/O qpairs, enable DNR for I/O qpairs. These I/O qpairs
are deleted and recreated. Hence, we do not have to disable DNR.

No more I/O comes at a reset I/O because the generic bdev layer already
blocks I/O submission. However, some I/Os may be queued for retry even
after deleting I/O qpairs. Hence, abort all queued I/Os for the bdev
before completing the reset I/O.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I9830026ef5f2b9c28aee92e6ce4018ed8541c808
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16836
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-05-09 17:58:11 +08:00
Shuhei Matsumoto
7ea8a5aae5 bdev/nvme: Reset I/O cancels reconnect timer and starts reconnection
Previously, if a reconnect timer was registered when a reset request
came, the reset request failed with -EBUSY. However, this means the
reset request was queued for a long time until the reconnect timer
expired.

When a reconnect timer is registered, reset is not actually in progress.
Hence, a new reset request can cancel the reconnect timer and can start
reconnection safely.

Add a unit test case to verify this change.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ied8dd0ad822d2fd6829d88cd56cb36bd4fad13f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16823
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-05-09 17:58:11 +08:00
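The behaviour change in commit 7ea8a5aae5 can be pictured with a small state sketch. It assumes a controller with two flags; `demo_ctrlr`, `reconnect_timer_armed`, and `demo_ctrlr_reset` are hypothetical names, and the real bdev_nvme code tracks considerably more state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical controller state. */
struct demo_ctrlr {
	bool resetting;              /* a reset is actually in progress */
	bool reconnect_timer_armed;  /* waiting for a delayed reconnect */
};

/* Handle a new reset request. A pending reconnect timer no longer
 * causes -EBUSY: it is cancelled and reconnection starts immediately.
 * Only a reset that is really in progress is reported as busy. */
static int
demo_ctrlr_reset(struct demo_ctrlr *c)
{
	assert(c != NULL);

	if (c->resetting) {
		return -EBUSY;
	}
	if (c->reconnect_timer_armed) {
		c->reconnect_timer_armed = false;  /* cancel the timer */
	}
	c->resetting = true;                       /* start reconnecting now */
	return 0;
}
```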
Amir Haroush
ec2abc81a2 bdev/ocf: add bdev_ocf_reset_stats RPC
Signed-off-by: Amir Haroush <amir.haroush@huawei.com>
Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com>
Change-Id: Ife91df62099e14d328a767b1bbb3ddd3ded57264
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17916
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
c9b802ca49 lvol: add spdk_lvol_is_degraded
This is mostly a wrapper around spdk_blob_is_degraded(), but it also
performs a NULL check on lvol->blob. Since an lvol without a blob cannot
perform IO, this condition returns true.

The two callers of spdk_blob_is_degraded() in vbdev_lvol.c have been
updated to use spdk_lvol_is_degraded().

Change-Id: I11dc682a26d971c8854aeab280c8199fced358c3
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17896
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-05-09 17:58:11 +08:00
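The wrapper described in commit c9b802ca49 can be sketched with stand-in types. The `demo_` structs and functions below are hypothetical substitutes for the lvol and blob types, written only to show the NULL check in front of the blob-level query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the blob and lvol types. */
struct demo_blob { bool degraded; };
struct demo_lvol { struct demo_blob *blob; };

/* Blob-level check: assumes a valid blob, like the function it models. */
static bool
demo_blob_is_degraded(const struct demo_blob *blob)
{
	assert(blob != NULL);
	return blob->degraded;
}

/* Lvol-level wrapper: an lvol without a blob cannot perform IO, so it
 * is reported as degraded instead of dereferencing a NULL pointer. */
static bool
demo_lvol_is_degraded(const struct demo_lvol *lvol)
{
	if (lvol->blob == NULL) {
		return true;
	}
	return demo_blob_is_degraded(lvol->blob);
}
```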
Mike Gerdts
b240c2b103 lvol: lvol destruction race leads to null deref
As an lvolstore is being destroyed, _vbdev_lvs_remove() starts an
iteration through the lvols to delete each one, ultimately leading to
the destruction of the lvolstore with a call to lvs_free(). The callback
passed to vbdev_lvs_destruct() is always called asynchronously via
spdk_io_device_unregister() in bs_free().

When the lvolstore resides on bdevs that perform async IO (i.e. most
bdevs other than malloc), this gives a small window when the lvol bdev
is not registered but a lookup with spdk_lvol_get_by_uuid() or
spdk_lvol_get_by_names() will succeed. If rpc_bdev_lvol_delete() runs
during this window, it can get a reference to an lvol that has just been
unregistered and lvol->blob may be NULL. This lvol is then passed to
vbdev_lvol_destroy().

Before this fix, vbdev_lvol_destroy() would call:

   spdk_blob_is_degraded(lvol->blob);

Which would then lead to a NULL pointer dereference, as
spdk_blob_is_degraded() assumes a valid blob is passed. While a NULL
check would avoid this particular problem, a NULL blob is not
necessarily caused by the condition described above. It would be better to
flag the lvstore's destruction before returning from
vbdev_lvs_destruct() and use that flag to prevent operations on the
lvolstore that is being deleted. Such a flag already exists in the form
of 'lvs_bdev->req != NULL', but that is set too late to close this race.

This fix introduces lvs_bdev->removal_in_progress which is set prior to
returning from vbdev_lvs_unload() and vbdev_lvs_destruct(). It is
checked by vbdev_lvol_destroy() before trying to destroy the lvol.  Now,
any lvol destruction initiated by something other than
vbdev_lvs_destruct() while an lvolstore unload or destroy is in progress
will fail with -ENODEV.

Fixes issue: #2998

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I4d861879097703b0d8e3180e6de7ad6898f340fd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17891
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
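The guard introduced by commit b240c2b103 can be sketched as a flag that is set before the asynchronous teardown begins and checked by every externally initiated destroy. The struct and function names are hypothetical; only the `removal_in_progress` flag name and the -ENODEV result come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lvolstore bdev with a counter standing in for its lvols. */
struct demo_lvs_bdev {
	bool removal_in_progress;
	int  num_lvols;
};

/* Called at the start of unload/destruct, before returning to the
 * caller, so the window in which lookups still succeed is covered. */
static void
demo_lvs_begin_removal(struct demo_lvs_bdev *lvs)
{
	lvs->removal_in_progress = true;
}

/* Destroy request coming from outside the lvolstore teardown path. */
static int
demo_lvol_destroy(struct demo_lvs_bdev *lvs)
{
	if (lvs->removal_in_progress) {
		return -ENODEV;   /* the lvolstore is going away */
	}
	assert(lvs->num_lvols > 0);
	lvs->num_lvols--;
	return 0;
}
```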
Amir Haroush
db09c7265b Revert "deprecation: remove Open CAS Framework"
This reverts commit 32908cbfc8.

The OCF deprecation notice has been removed because
Huawei is picking up support for the OCF project.

Signed-off-by: Amir Haroush <amir.haroush@huawei.com>
Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com>
Change-Id: I007e80bc74dc50cfa9b8cde97fc6fdc9608d7ebd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17894
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Amir Haroush
90e1d2b02c Revert "ocf: clarify deprecation notice"
This reverts commit c5224a96ae.

The OCF deprecation notice has been removed because
Huawei is picking up support for the OCF project.

Signed-off-by: Amir Haroush <amir.haroush@huawei.com>
Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com>
Change-Id: I80ebfe75eaa1a9b96249ed578fcaff6e9576928f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17893
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2023-05-09 17:58:11 +08:00
Mike Gerdts
57fd506c57 bdev_gpt: use unique partition GUID as bdev UUID
In releases of SPDK prior to v23.01, GPT bdevs had a random UUID. This
ended with commit a1c7ae2d3f, which is OK
because a non-persistent UUID is not all that useful.

Per Table 5.6 in Section 5.3.3 of UEFI Spec 2.3, each partition has a
16-byte UniquePartitionGUID:

  GUID that is unique for every partition entry. Every partition ever
  created will have a unique GUID. This GUID must be assigned when the
  GPT Partition Entry is created.  The GPT Partition Entry is created
  whenever the NumberOfPartitionEntries in the GPT Header is increased
  to include a larger range of addresses.

With this change, GPT bdevs use this unique partition GUID as the bdev's
UUID.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Id8e8aa9e7903d31f199e8cfdb487e45ce1524d7b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17351
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
2023-05-09 17:58:11 +08:00
Alexey Marchuk
51d7df517c accel/dpdk_cryptodev: Fix use of uninitialized variable
rc might not be initialized, so it was incorrect to
use it in this place.

Fixes 6b7cca1542 accel/dpdk_cryptodev: Handle OP_STATUS_SUCCESS

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: Ifd2b3032afd6830bd851adb61f68ae4fa9621d33
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17656
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Alexey Marchuk
fdec622361 bdev/crypto: Put accel buffer when write completes
The accel buffer is released when the encrypt operation
completes. However, that does not mean the base bdev
has finished writing the encrypted data. As a result,
the accel buffer might be reused by another IO, which
leads to data corruption.

Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I1acf7c30da2f92989ecc44e96b00f7609058ec5a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17655
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
2023-05-09 17:58:11 +08:00
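The ordering problem fixed by commit fdec622361 can be shown with a tiny lifecycle sketch. It assumes one buffer per write request; `demo_write` and both callbacks are hypothetical and do not mirror the accel or bdev APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical write request that owns one intermediate buffer. */
struct demo_write {
	bool buf_held;    /* buffer still reserved for this I/O */
	bool encrypted;   /* ciphertext has been produced in the buffer */
	bool written;     /* base bdev has finished writing the buffer */
};

/* Encryption finished: the ciphertext sits in the buffer, but the base
 * bdev has not written it yet, so the buffer must stay reserved. */
static void
demo_encrypt_done(struct demo_write *w)
{
	assert(w->buf_held);
	w->encrypted = true;
}

/* Base bdev write finished: only now is the buffer safe to hand back. */
static void
demo_write_done(struct demo_write *w)
{
	assert(w->encrypted && w->buf_held);
	w->written = true;
	w->buf_held = false;
}
```

Releasing the buffer in the first callback instead would let another request overwrite the ciphertext while the base bdev is still reading from it.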
Mike Gerdts
dcd012e8d0 vbdev_lvol: esnap memdomain support
Return the total number of memory domains supported by the blobstore and
any external snapshot bdev.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I2f8afba6b31e689b8f942e2cf36906a0a30f38c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16430
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-09 17:58:11 +08:00
Ziye Yang
b7525d2332 bdev/rbd: Do not submit IOs through thread sending.
Currently, we send IOs to the main_td thread.
It is not needed, because all the read/write functions
provided by librbd are thread safe, so we can eliminate the
thread send messaging policy for read/write related functions.

With this patch, users can also observe how I/O load is balanced
across the CPU cores owned by SPDK applications by using the
spdk_top tool.

In this patch, we did the following work:

1. Call rbd_open when creating the bdev, since the bdev is created only once.
2. Simplify the channel management.
3. Do not use thread send messaging to do the read/write I/Os.

According to our experiment results shown in
https://github.com/spdk/spdk/issues/2204
there is a performance improvement of more than 15% in IOPS
for different write I/O patterns, and the patch also addresses
the I/O load balance issues.

Fixes issue: #2204

Change-Id: I9d2851c3d772261c131f9678f4b1bf722328aabb
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17644
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
0dc9169a44 vbdev_lvol: allow degraded lvols to be deleted
An esnap clone is now deletable when its external snapshot is missing.
Likewise, the tree of degraded lvols rooted at a degraded esnap clone
can also be deleted, subject to the normal restrictions.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I711ae25d57f5625a955d1f4cdb2839dd0a6cb095
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17549
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
a0790ea1af vbdev_lvol: load esnaps via examine_config
This introduces an examine_config callback that triggers hotplug of
missing esnap devices.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I5ced2ff26bfd393d2df4fd4718700be30eb48063
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16626
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
2e50d2bc46 include: add libgen.h to stdinc.h
A subsequent patch will need to use dirname(3), declared in libgen.h.
Because libgen.h is a POSIX header, the SPDK build requires that it is
defined in spdk/stdinc.h, not in the file that needs it.

libgen.h also declares basename() which has a conflicting declaration in
string.h. A small change is required in bdev_uring_read_sysfs_attr() to
accommodate this.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ib4ded2097881668aabdfd9f1683f933ce418db2e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17557
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
0fa543f209 vbdev_lvol: degraded open of esnap clones
If an esnap clone is missing its snapshot the lvol should still open in
degraded mode. A degraded lvol will not have a bdev registered and as
such cannot perform any IO.

Change-Id: I736194650dfcf1eb78214c8896c31acc7a946b54
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16425
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
ebb9085755 vbdev_lvol: early return in _vbdev_lvs_remove
This replaces nested if statements with equivalent logic that uses
early returns. Now the code fits in 100 columns and will allow the next
patch in this series to avoid adding a fifth level of indentation.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ief74d9fd166b2fe1042c78e12fe79d5f325aa502
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17548
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
a4862f5a56 vbdev_lvol: add bdev_lvol_get_lvols RPC
This provides information about logical volumes without providing
information about the bdevs. It is useful for listing the lvols
associated with specific lvol stores and for listing lvols that are
degraded and have no associated bdev.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I795161ac88d9707831d9fcd2079635c7e46ecc42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17547
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
68cde3b770 vbdev_lvol: external snapshot rpc interface
Add RPC interfaces for creation of esnap clone lvols. This also
exercises esnap clone creation and various operations involving
snapshots and clones of esnap clones to ensure that bdev_get_bdevs
reports state correctly.

Change-Id: Ib87d01026ef6e45203c4d9451759885a7be02d87
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14978
Reviewed-by: Michal Berger <michal.berger@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
be54ccc4a9 vbdev_lvol: allow creation of esnap clones
This adds the ability to create esnap clone lvol bdevs.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ifeef983430153d84d896d282fe914c6671283762
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16590
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
f5e42c6c3f vbdev_lvol: create esnap blobstore device
Register an spdk_bs_esnap_dev_create callback when initializing or
loading an lvstore. This is the first of several commits required to
enable lvol bdevs to support external snapshots and esnap
clones.

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I35c4e61fdbe5b93d65b9374e0ad91cb7fb94d1f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16589
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Mike Gerdts
6d71e476ec lvol: keep track of missing external snapshots
If an lvol is opened in degraded mode, keep track of the missing esnap
IDs and which lvols need them. A future commit will make use of this
information to bring lvols out of degraded mode when their external
snapshot device appears.

Change-Id: I55c16ad042a73e46e225369bfff2631958a2ed46
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16427
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
Denis Barakhtanov
52e51a29df bdev/daos: using SPDK_CONTAINEROF instead of container_of
The DAOS bdev implicitly expected `container_of` to be provided by
daos_event.h. With the upcoming DAOS release, the location of
`container_of` has changed, so `SPDK_CONTAINEROF` is now used in the module.

Signed-off-by: Denis Barakhtanov <denis.barahtanov@croit.io>
Change-Id: Ia88365322fef378af6b1708b8704827bca1b828d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17719
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-05-09 17:58:11 +08:00
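For readers unfamiliar with the pattern behind commit 52e51a29df, a container-of macro recovers a pointer to an enclosing struct from a pointer to one of its members. The sketch below defines its own `DEMO_CONTAINEROF` as a stand-in rather than quoting the SPDK or DAOS definitions; the event and task structs are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in macro: subtract the member's offset from the member's
 * address to get the address of the enclosing struct. */
#define DEMO_CONTAINEROF(ptr, type, member) \
	((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* Hypothetical event embedded inside a larger task object. */
struct demo_event { int status; };

struct demo_task {
	int id;
	struct demo_event ev;
};

/* A completion callback typically receives only the event pointer and
 * needs to get back to the task that owns it. */
static struct demo_task *
demo_task_from_event(struct demo_event *ev)
{
	assert(ev != NULL);
	return DEMO_CONTAINEROF(ev, struct demo_task, ev);
}
```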
Richael Zhuang
2a0b0ba782 bdev_nvme: fix heap-use-after-free when detaching controller
There is a heap-use-after-free error when detaching a controller
while the "io_path_stat" option is set to true.
(If SPDK is built without ASan/UBSan, the error is "free(): corrupted
unsorted chunks".)

It's because io_path is accessed in bdev_nvme_io_complete_nvme_status
after the io_path is freed.

io_path is freed when the controller is detached, in the function
_bdev_nvme_delete_io_path, which executes steps 1 and 2 below.
Step 3, which accesses io_path, may run before step 4 is executed.

1. spdk_put_io_channel() is called. bdev_nvme_destroy_ctrlr_channel_cb
   has not been called yet.
2. free(io_path->stat); free(io_path);
3. bdev_nvme_poll runs; nbdev_io1 succeeds; bdev_nvme_io_complete_nvme_status()
   accesses nbdev_io1->io_path.
4. bdev_nvme_destroy_ctrlr_channel_cb disconnects the qpair and aborts nbdev_io1.

This patch fixes this by moving step 2 after step 4. io_path is no longer
freed in _bdev_nvme_delete_io_path; it is only removed from nbdev_ch->io_path_list.

Steps to reproduce the error:
target: run nvmf_tgt
initiator: (build spdk with asan,ubsan enabled)
sudo ./build/examples/bdevperf --json bdevperf-multipath-rdma-active-active.json  -r tmp.sock -q 128 -o 4096  -w randrw -M 50 -t 120
sudo ./scripts/rpc.py -s tmp.sock  bdev_nvme_detach_controller -t rdma -a 10.10.10.10 -f IPv4 -s 4420 -n nqn.2016-06.io.spdk:cnode1 NVMe0

========
bdevperf-multipath-rdma-active-active.json

{
  "subsystems": [
    {
      "subsystem": "bdev",
      "config": [
        {
          "method": "bdev_nvme_attach_controller",
          "params": {
            "name": "NVMe0",
            "trtype": "tcp",
            "traddr": "10.169.204.201",
            "trsvcid": "4420",
            "subnqn": "nqn.2016-06.io.spdk:cnode1",
            "hostnqn": "nqn.2016-06.io.spdk:init",
            "adrfam": "IPv4"
          }
        },
        {
          "method": "bdev_nvme_attach_controller",
          "params": {
            "name": "NVMe0",
            "trtype": "rdma",
            "traddr": "10.10.10.10",
            "trsvcid": "4420",
            "subnqn": "nqn.2016-06.io.spdk:cnode1",
            "hostnqn": "nqn.2016-06.io.spdk:init",
            "adrfam": "IPv4",
            "multipath": "multipath"
          }
        },
        {
          "method": "bdev_nvme_set_multipath_policy",
          "params": {
            "name": "NVMe0n1",
            "policy": "active_active"
          }
        },
        {
          "method": "bdev_nvme_set_options",
          "params": {
            "io_path_stat": true
          }
        }
      ]
    }
  ]
}
======

Change-Id: I8f4f9dc7195f49992a5ba9798613b64d44266e5e
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17581
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2023-05-09 17:58:11 +08:00
Ben Walker
fe1ea7cebf sock/posix: Fix sendmsg_idx rollover for zcopy
If the idx gets to UINT32_MAX we need to ensure it doesn't wrap around
before we check if we're done iterating.

Fixes #2892

Change-Id: I2c57ed2a6f6eda16e2d1faa63e587dca0b380a17
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17687
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-05-09 17:58:11 +08:00
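The rollover hazard named in commit fe1ea7cebf is a general property of inclusive loops over unsigned counters: with a condition such as `idx <= last` and `last == UINT32_MAX`, the increment wraps to 0 and the loop never ends. A minimal sketch of a wrap-safe loop follows; `visit_inclusive` is a hypothetical helper, it assumes `first <= last`, and it is not the posix sock code.

```c
#include <assert.h>
#include <stdint.h>

/* Visit every index in the inclusive range [first, last] and return how
 * many were visited. The end-of-range test runs before the increment,
 * so idx never wraps past UINT32_MAX before the loop decides to stop. */
static uint64_t
visit_inclusive(uint32_t first, uint32_t last)
{
	uint64_t visited = 0;
	uint32_t idx;

	assert(first <= last);   /* assumption of this sketch */

	for (idx = first; ; idx++) {
		visited++;       /* stand-in for the per-index work */
		if (idx == last) {
			break;   /* done; do not increment past last */
		}
	}
	return visited;
}
```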