Get a fragmap for a specific segment of a logical volume using the provided offset and size.
A fragmap is a bitmap that records the allocation status of clusters. A value of "1" indicates
that a cluster is allocated, whereas "0" signifies that a cluster is unallocated.
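
The bitmap semantics above can be sketched as follows. This is a minimal illustration, not the SPDK implementation: fragmap_build(), its parameters, and the bit ordering (least significant bit first) are assumptions made for the example, as is the requirement that offset and size are cluster-aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not an SPDK API): fill `fragmap` with one bit per
 * cluster in the byte range [offset, offset + size). Bit i describes the
 * i-th cluster of the segment: 1 = allocated, 0 = unallocated.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int
fragmap_build(const bool *allocated, uint64_t num_clusters, uint64_t cluster_size,
	      uint64_t offset, uint64_t size, uint8_t *fragmap, size_t fragmap_len)
{
	uint64_t first, count, i;

	assert(allocated != NULL && fragmap != NULL);
	/* Assumption: the segment must be aligned to cluster boundaries. */
	if (cluster_size == 0 || offset % cluster_size != 0 || size % cluster_size != 0) {
		return -1;
	}
	first = offset / cluster_size;
	count = size / cluster_size;
	if (first > num_clusters || count > num_clusters - first ||
	    (uint64_t)fragmap_len * 8 < count) {
		return -1;
	}

	memset(fragmap, 0, fragmap_len);
	for (i = 0; i < count; i++) {
		if (allocated[first + i]) {
			fragmap[i / 8] |= (uint8_t)(1u << (i % 8));
		}
	}
	return 0;
}
```

For a volume whose clusters 2 and 3 are allocated, a segment covering clusters 1 through 4 yields the bits 0, 1, 1, 0.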
Longhorn 6138
Signed-off-by: Derek Su <derek.su@suse.com>
Implement a unit test for the raid grow operation, which adds a new base
bdev to an existing raid bdev.
Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
Function raid_bdev_resume now also takes a function pointer and a
context as input parameters. The function is called with that context
at the end of the resume operation. If no callback is needed, NULL can
be passed for both parameters.
Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
A shallow copy copies to the destination device only the clusters
allocated to the blob/lvol itself, skipping those that belong to the
blob/lvol's parent snapshot. The blob/lvol must be read-only.
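
The copy rule above can be sketched as follows. This is a toy model, not the blobstore code: shallow_copy(), the owned[] array, and the tiny CLUSTER_SIZE are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CLUSTER_SIZE 4u /* unrealistically small, for illustration only */

/*
 * Hypothetical model: owned[i] is true when cluster i is allocated to the
 * blob itself and false when it belongs to the parent snapshot.
 * Returns the number of clusters copied.
 */
static size_t
shallow_copy(const uint8_t *src, const bool *owned, size_t num_clusters, uint8_t *dst)
{
	size_t i, copied = 0;

	assert(src != NULL && owned != NULL && dst != NULL);
	for (i = 0; i < num_clusters; i++) {
		if (!owned[i]) {
			continue; /* parent snapshot's cluster: leave dst untouched */
		}
		memcpy(dst + i * CLUSTER_SIZE, src + i * CLUSTER_SIZE, CLUSTER_SIZE);
		copied++;
	}
	return copied;
}
```

The read-only requirement is not modeled here; the sketch simply assumes the source does not change while it is being copied.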
Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com>
Add num_base_bdevs_operational to raid_bdev and use it to determine the
required number of base bdevs.
Change-Id: I31b39cc8ea708b6cdce748f015949e4c9fdeb3cd
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
When adding a new base bdev to a raid bdev (currently only when creating
a new raid bdev) make sure that there is no existing superblock
stored on the base bdev. This prevents accidentally overwriting a base
bdev belonging to a different raid array.
Change-Id: Id5f6c7e3ed7223f6a8fc7455f75831fbbcac7e43
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Change the bdev_raid examine procedure to read the superblock from the
examined base bdev. If a valid superblock is found, re-create the
raid_bdev from it.
Change-Id: I4bd589647a207a216ecf0dec9baf11c5d691f5d5
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
When creating the raid_bdev with the superblock option enabled, write the
superblock to the base bdevs before bringing the array online.
Change-Id: I24659202ef3bbe6c87ca8603d514bd81660c9b41
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
There is a trap set up to kill the spdk process in case of error, so
there is no need to delete any bdevs before that.
Change-Id: Ic80e2a48453f718fbc42cabe88d86eefa35c95db
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Refactor the code to use these new functions.
Change-Id: I21ee7e9a96f30fbd60106add5e8b071e86bf93c9
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reads for raid1 bdevs are now balanced. The algorithm tries to distribute
load evenly by sending read I/O to the base bdevs in round-robin order,
skipping the base bdev that has processed the most data so far.
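
One way to read this policy is sketched below. It is not the raid1 module's code: struct read_balancer, read_balancer_pick(), and the tie handling (no skip while all base bdevs have processed the same amount) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BASE_BDEVS 8u /* hypothetical limit for this sketch */

struct read_balancer {
	uint64_t bytes_read[MAX_BASE_BDEVS]; /* data processed per base bdev */
	uint32_t num_base_bdevs;
	uint32_t next;                       /* round-robin cursor */
};

/* Pick the base bdev for a read of io_bytes and account for it. */
static uint32_t
read_balancer_pick(struct read_balancer *rb, uint64_t io_bytes)
{
	uint32_t i, idx, busiest = 0;
	uint64_t min;

	assert(rb->num_base_bdevs > 0 && rb->num_base_bdevs <= MAX_BASE_BDEVS);
	min = rb->bytes_read[0];
	for (i = 1; i < rb->num_base_bdevs; i++) {
		if (rb->bytes_read[i] > rb->bytes_read[busiest]) {
			busiest = i;
		}
		if (rb->bytes_read[i] < min) {
			min = rb->bytes_read[i];
		}
	}

	idx = rb->next % rb->num_base_bdevs;
	/* Skip the busiest base bdev, but only if it is actually ahead. */
	if (idx == busiest && rb->bytes_read[busiest] > min) {
		idx = (idx + 1) % rb->num_base_bdevs;
	}

	rb->bytes_read[idx] += io_bytes;
	rb->next = (idx + 1) % rb->num_base_bdevs;
	return idx;
}
```

With three base bdevs where the first has already processed far more data than the others, successive reads in this model go to the second, the third, and then the second again.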
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Change-Id: I7d85411a6421bd7352031efb562ee95f2c612011
Support for multi-stripe requests is not used anymore so remove it.
Change-Id: I8f28817763452674c8a183c640800f3a4b4b3653
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Remove struct test_request_conf and instead do some basic reads on each
chunk. Also remove the io_info splitting because it is not used now.
Change-Id: I4b945b40598670f6ab84fb8066278877fee7fb75
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
This allows simplifying some code where both raid_bdev and base_info
are needed.
Change-Id: I40395204fdcdd0487bdecec1cd47efb347f1310a
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Add functions to suspend and resume IO on all channels. This will be
used to safely change the device state in case of e.g. removing a base
bdev.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Change-Id: I203c1899bde15101e0c2bc8da7a1066a2fee6dd2
When a raid bdev is created with the superblock parameter, all data on
this bdev should be shifted by some offset. The space reserved at the
beginning of the bdev will be used to store on-disk raid metadata.
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Change-Id: I2545a2b00a651ef5332ca1757da0110a63914a43
Introduce a superblock parameter for the bdev_raid_create RPC. This
parameter determines whether the raid bdev should be created with support
for on-disk metadata (support for raid on-disk metadata is going to be
implemented in the future).
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Change-Id: Ie8c64f837dd7eb3ba788b7c5d7bc98e8f1368ba7
The test already checked ENOMEM handling, but it only used bdevs that
support chaining (crypto, malloc), so bdev layer didn't need to execute
any accel operations. So, to force bdev layer to do that, a passthru
bdev was added, as it doesn't support chaining.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I322a65ccebb0f144c759692fff285cfd44bbab4b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17766
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Note that the prepare_for_reset flag in spdk_nvme_ctrlr is
still needed - it's just set now in the nvme_ctrlr_disconnect
path instead of this deprecated and now removed API.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0a6aa1c72767eb67a84b8928a986e06cbac88240
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17936
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
As the RBD bdev module does, the upper layer wants the reset command
to abort or complete all I/Os submitted before the reset command.
To satisfy this requirement, return all I/Os aborted by deleting I/O
qpairs to the upper layer without retry. To do so, enable DNR for the
I/O qpairs. These I/O qpairs are deleted and recreated. Hence, we do not
have to disable DNR again.
No new I/O arrives during a reset I/O because the generic bdev layer
already blocks I/O submission. However, some I/Os may be queued for retry even
after deleting I/O qpairs. Hence, abort all queued I/Os for the bdev
before completing the reset I/O.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I9830026ef5f2b9c28aee92e6ce4018ed8541c808
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16836
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
When I/O error resiliency is supported, most DNR parameters for internal
APIs were cleared. However, for some cases, especially for the reset I/O
command, the upper layer wants the NVMe driver to return I/O errors
immediately without retry even if the upper layer enables I/O error retry.
To satisfy this requirement, add an abort_dnr variable to the spdk_nvme_qpair
structure and make the internal abort APIs use it. A public API,
spdk_nvme_qpair_set_abort_dnr(), can change the abort_dnr variable dynamically.
The public spdk_nvme_transport_ops structure is not changed to avoid
premature changes.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I486a1b3ad8411f9fa261a2bf3a45aea9da292e9c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17099
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously, if a reconnect timer was registered when a reset request
came, the reset request failed with -EBUSY. However, this means the
reset request was queued for a long time until the reconnect timer
expired.
When a reconnect timer is registered, reset is not actually in progress.
Hence, a new reset request can cancel the reconnect timer and can start
reconnection safely.
Add a unit test case to verify this change.
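
The new behavior can be modeled with two flags. This is a simplified sketch, not the bdev_nvme code: struct ctrlr_state and ctrlr_request_reset() are hypothetical names, and the real controller tracks more state than this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced controller state. */
struct ctrlr_state {
	bool resetting;             /* a reset is actually in progress */
	bool reconnect_timer_armed; /* a delayed reconnect is scheduled */
};

static int
ctrlr_request_reset(struct ctrlr_state *c)
{
	assert(c != NULL);
	if (c->resetting) {
		return -EBUSY; /* a reset really is running */
	}
	if (c->reconnect_timer_armed) {
		/* No reset is in progress, so cancel the timer and start now. */
		c->reconnect_timer_armed = false;
	}
	c->resetting = true;
	return 0;
}
```

In this model a pending reconnect timer no longer causes -EBUSY; only a reset that is genuinely in progress does.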
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ied8dd0ad822d2fd6829d88cd56cb36bd4fad13f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16823
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The llvm_precompile function checks for the CLANG version available on the machine
using bash regex and searches for fuzzer libraries in a path based on the full CLANG
version number. (e.g. /usr/lib64/clang/15.0.3/...)
However, on the newest Fedora distribution, the path has changed and the fuzzer
libraries couldn't be found. The CLANG libraries path now contains only the major
version number (/usr/lib64/clang/16)
To address this issue, the function has been updated to search only for the major
CLANG version number instead of the full version number. Instead of using clang_version,
the function now uses clang_num because in every Fedora distribution there is a directory
or symlink that points to the right CLANG version.
e.g. symlinks
/usr/lib64/clang/13 -> /usr/lib64/clang/13.0.1
/usr/lib64/clang/15 -> /usr/lib64/clang/15.0.3
or directory:
/usr/lib64/clang/16
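
The major-version lookup can be illustrated as follows. The actual change is in a shell function; this C sketch only mirrors the idea, and clang_lib_dir() is a hypothetical name. It builds just the /usr/lib64/clang/<major> prefix shown above and nothing below it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: derive the library directory from a version string
 * such as "15.0.3" or "16" using only the major number.
 * Returns 0 on success, -1 on a malformed version or a too-small buffer.
 */
static int
clang_lib_dir(const char *version, char *out, size_t out_len)
{
	char *end;
	long major;
	int n;

	assert(version != NULL && out != NULL);
	major = strtol(version, &end, 10);
	if (end == version || major <= 0 || (*end != '.' && *end != '\0')) {
		return -1;
	}
	n = snprintf(out, out_len, "/usr/lib64/clang/%ld", major);
	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Both "15.0.3" and "15" map to the same directory, which is what makes the symlink or directory layout listed above sufficient.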
Fixes #3000
Signed-off-by: Kamil Godzwon <kamilx.godzwon@intel.com>
Change-Id: Iaf0dedc2bb3956cf06796e2eb60a5fa6f492b780
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17907
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
On some hosts, it might take 1 or 2 seconds for the
mapper device to appear in /dev.
In this case, the test will fail
because we check if the device exists immediately.
By giving the device a chance to come up, the test will pass.
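
The wait can be pictured as a bounded polling loop. The test itself is not shown here, so this is only an illustration in C: wait_for_path(), the one-second interval, and the attempt count are assumptions.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll until `path` exists, checking up to `attempts`
 * times with a one-second pause between checks.
 * Returns 0 if the path appeared, -1 otherwise.
 */
static int
wait_for_path(const char *path, unsigned int attempts)
{
	unsigned int i;

	assert(path != NULL);
	for (i = 0; i < attempts; i++) {
		if (access(path, F_OK) == 0) {
			return 0;
		}
		if (i + 1 < attempts) {
			sleep(1); /* give the device time to show up */
		}
	}
	return -1;
}
```

A bounded loop keeps the fast path fast (the first check succeeds on most hosts) while still failing within a known time when the device never appears.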
Signed-off-by: Amir Haroush <amir.haroush@huawei.com>
Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com>
Change-Id: I996d84861333d29d5c9370a2c5a471e7962d91b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17912
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is mostly a wrapper around spdk_blob_is_degraded(), but it also
performs a NULL check on lvol->blob. Since an lvol without a blob cannot
perform IO, this condition returns true.
The two callers of spdk_blob_is_degraded() in vbdev_lvol.c have been
updated to use spdk_lvol_is_degraded().
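
The wrapper's logic can be sketched with stand-in types. The sketch_ names below are hypothetical and deliberately differ from the real SPDK structures and functions; only the NULL handling is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the real blob and lvol structures. */
struct sketch_blob {
	bool degraded;
};

struct sketch_lvol {
	struct sketch_blob *blob;
};

/* Like the wrapped function, this assumes a valid (non-NULL) blob. */
static bool
sketch_blob_is_degraded(const struct sketch_blob *blob)
{
	return blob->degraded;
}

static bool
sketch_lvol_is_degraded(const struct sketch_lvol *lvol)
{
	assert(lvol != NULL);
	if (lvol->blob == NULL) {
		return true; /* an lvol without a blob cannot perform IO */
	}
	return sketch_blob_is_degraded(lvol->blob);
}
```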
Change-Id: I11dc682a26d971c8854aeab280c8199fced358c3
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17896
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
As an lvolstore is being destroyed, _vbdev_lvs_remove() starts an
iteration through the lvols to delete each one, ultimately leading to
the destruction of the lvolstore with a call to lvs_free(). The callback
passed to vbdev_lvs_destruct() is always called asynchronously via
spdk_io_device_unregister() in bs_free().
When the lvolstore resides on bdevs that perform async IO (i.e. most
bdevs other than malloc), this gives a small window when the lvol bdev
is not registered but a lookup with spdk_lvol_get_by_uuid() or
spdk_lvol_get_by_names() will succeed. If rpc_bdev_lvol_delete() runs
during this window, it can get a reference to an lvol that has just been
unregistered and lvol->blob may be NULL. This lvol is then passed to
vbdev_lvol_destroy().
Before this fix, vbdev_lvol_destroy() would call:
spdk_blob_is_degraded(lvol->blob);
Which would then lead to a NULL pointer dereference, as
spdk_blob_is_degraded() assumes a valid blob is passed. While a NULL
check would avoid this particular problem, a NULL blob is not
necessarily caused by the condition described above. It would be better to
flag the lvstore's destruction before returning from
vbdev_lvs_destruct() and use that flag to prevent operations on the
lvolstore that is being deleted. Such a flag already exists in the form
of 'lvs_bdev->req != NULL', but that is set too late to close this race.
This fix introduces lvs_bdev->removal_in_progress which is set prior to
returning from vbdev_lvs_unload() and vbdev_lvs_destruct(). It is
checked by vbdev_lvol_destroy() before trying to destroy the lvol. Now,
any lvol destruction initiated by something other than
vbdev_lvs_destruct() while an lvolstore unload or destroy is in progress
will fail with -ENODEV.
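
The ordering that closes the race can be sketched as follows. The types and function names are stand-ins, not the vbdev_lvol code; only the flag name and the -ENODEV result come from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the lvolstore bdev; only the new flag is modeled. */
struct sketch_lvs_bdev {
	bool removal_in_progress;
};

/* Models the start of an unload or destruct: flag first, async work later. */
static void
sketch_lvs_begin_removal(struct sketch_lvs_bdev *lvs_bdev)
{
	assert(lvs_bdev != NULL);
	lvs_bdev->removal_in_progress = true;
	/* ... asynchronous teardown would be started here ... */
}

/* Models an lvol destroy requested from outside the teardown path. */
static int
sketch_lvol_destroy(const struct sketch_lvs_bdev *lvs_bdev)
{
	assert(lvs_bdev != NULL);
	if (lvs_bdev->removal_in_progress) {
		return -ENODEV; /* the lvolstore is going away */
	}
	/* ... normal destroy path ... */
	return 0;
}
```

Because the flag is set before the removal function returns, a destroy request that arrives during the asynchronous window is rejected instead of touching a half-destroyed lvol.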
Fixes issue: #2998
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I4d861879097703b0d8e3180e6de7ad6898f340fd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17891
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This automatically cleans up aio files left over from earlier aborted
runs. This helps streamline development of new tests and should have no
impact on CI.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Id65f60cdfc9969fda1dcdd17e60643ad87f45de7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17898
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
LLVMFuzzerRunDriver does not allow specifying a minimum input length,
so return immediately when the input data is insufficient.
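
The early return can be sketched in a callback of the usual libFuzzer shape (data pointer plus size). fuzz_one_input(), MIN_INPUT_LEN, and the counter are hypothetical stand-ins for the real test code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_INPUT_LEN 8u /* hypothetical minimum for this sketch */

static unsigned int g_inputs_processed;

/* Callback of the kind handed to the fuzzer driver. */
static int
fuzz_one_input(const uint8_t *data, size_t size)
{
	if (size < MIN_INPUT_LEN) {
		return 0; /* too short to be meaningful: skip it */
	}
	assert(data != NULL);
	g_inputs_processed++; /* stand-in for the real work on `data` */
	return 0;
}
```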
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I306e1774b17b04108f2454b2fdaadb4d912bd274
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17884
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
As with the copy command, calculating the max write_zeroes size for the
fallback case involves a division and is costly. The result is constant
for each bdev. Hence, we can calculate it only once and store it into
bdev->max_write_zeroes at bdev registration. However, in unit tests,
bdev->blocklen and bdev->md_len can be changed dynamically. Hence,
adjust bdev->max_write_zeroes for such changes.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I16e4980e7a283caa6c995a7dc61f7e77585d464e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17911
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The generic bdev layer has a fallback mechanism for the copy command
used when the backend bdev module does not support it. However, its max
size is limited. To remove the limitation, the fallback now supports
splitting by using the unified split logic rather than following the
write zeroes command.
bdev_copy_should_split() and bdev_copy_split() use spdk_bdev_get_max_copy()
rather than referring to bdev->max_copy, in order to include the fallback case.
Then, spdk_bdev_copy_blocks() does the following:
If the copy size is large and should be split, use the generic split
logic regardless of whether copy is supported or not.
Otherwise, if copy is supported, send the copy request; if copy is not
supported, emulate it using regular read and write requests.
Add a unit test case to verify this addition.
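
The decision flow above can be sketched on an in-memory "disk". This is a toy model, not the bdev layer: copy_blocks(), struct copy_stats, and the block and chunk sizes are assumptions, and the sketch assumes the source and destination ranges do not overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4u       /* unrealistically small, for illustration */
#define MAX_CHUNK_BLOCKS 8u /* upper bound on max_copy in this sketch */

struct copy_stats {
	unsigned int native;   /* chunks sent as copy requests */
	unsigned int emulated; /* chunks done as read + write */
};

static void
copy_blocks(uint8_t *disk, uint64_t dst, uint64_t src, uint64_t num_blocks,
	    uint64_t max_copy, bool copy_supported, struct copy_stats *st)
{
	uint8_t bounce[MAX_CHUNK_BLOCKS * BLOCK_SIZE];

	assert(max_copy > 0 && max_copy <= MAX_CHUNK_BLOCKS);
	/* Split first, whether or not the backend supports copy. */
	while (num_blocks > 0) {
		uint64_t n = num_blocks < max_copy ? num_blocks : max_copy;

		if (copy_supported) {
			memmove(disk + dst * BLOCK_SIZE, disk + src * BLOCK_SIZE, n * BLOCK_SIZE);
			st->native++;
		} else {
			/* Emulate: read into a bounce buffer, then write it out. */
			memcpy(bounce, disk + src * BLOCK_SIZE, n * BLOCK_SIZE);
			memcpy(disk + dst * BLOCK_SIZE, bounce, n * BLOCK_SIZE);
			st->emulated++;
		}
		src += n;
		dst += n;
		num_blocks -= n;
	}
}
```

Copying four blocks with a limit of three produces two chunks, which is the case the fallback could not handle before splitting was supported.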
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iaf51db56bb4b95f99a0ea7a0237d8fa8ae039a54
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17073
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Calculation of max copy size for fallback case includes division and is
costly. The result is constant for each bdev. Hence we can calculate it
only once and store it into bdev->max_copy at bdev registration.
Calculation of max copy size for the fallback case is almost the same as
calculation of max write zeroes size for the fallback case. To reuse the
calculation, the helper function is named bdev_get_max_write() and
has a num_bytes parameter.
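
The compute-once idea can be sketched as follows. The formula here is a simplification chosen for illustration (buffer bytes divided by block size plus metadata size) and is not claimed to match the real helper; the sketch_ names and the buffer-size parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the bdev; only the fields needed here are modeled. */
struct sketch_bdev {
	uint32_t blocklen;
	uint32_t md_len;
	uint64_t max_copy;         /* cached at registration */
	uint64_t max_write_zeroes; /* cached at registration */
};

/* Simplified: how many whole blocks (with metadata) fit in num_bytes. */
static uint64_t
sketch_get_max_write(const struct sketch_bdev *bdev, uint64_t num_bytes)
{
	assert(bdev != NULL && bdev->blocklen > 0);
	return num_bytes / ((uint64_t)bdev->blocklen + bdev->md_len);
}

/* Do the divisions once, at registration, instead of on every I/O. */
static void
sketch_bdev_register(struct sketch_bdev *bdev, uint64_t copy_buf_bytes,
		     uint64_t zero_buf_bytes)
{
	bdev->max_copy = sketch_get_max_write(bdev, copy_buf_bytes);
	bdev->max_write_zeroes = sketch_get_max_write(bdev, zero_buf_bytes);
}
```

Sharing one helper with a num_bytes parameter lets both limits be derived the same way from differently sized fallback buffers.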
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iac83a1f16b908d8b36b51d9c51782de40313b6c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17909
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The following patches will change spdk_bdev_register() to access iobuf
and bdev's blocklen and blockcnt.
Hence, we have to configure these correctly for all test cases.
Move ut_init/fini_bdev() up in the file. Add missing ut_init/fini_bdev()
and allocate/free_bdev() calls for some test cases. Add blockcnt and
blocklen to allocate_vbdev().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iccbb1cfe4dcdc4496f15304b5362d76d5296607f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17908
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In releases of SPDK prior to v23.01, GPT bdevs had a random UUID. This
ended with commit a1c7ae2d3f, which is OK
because a non-persistent UUID is not all that useful.
Per Table 5.6 in Section 5.3.3 of UEFI Spec 2.3, each partition has a
16-byte UniquePartitionGUID:
GUID that is unique for every partition entry. Every partition ever
created will have a unique GUID. This GUID must be assigned when the
GPT Partition Entry is created. The GPT Partition Entry is created
whenever the NumberOfPartitionEntries in the GPT Header is increased
to include a larger range of addresses.
With this change, GPT bdevs use this unique partition GUID as the bdev's
UUID.
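
For orientation, the UniquePartitionGUID occupies bytes 16 through 31 of a GPT partition entry, right after the 16-byte partition type GUID. The sketch below only extracts those raw bytes; gpt_entry_unique_guid() is a hypothetical helper, and any byte-order handling needed to present the value as a UUID is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GPT_GUID_LEN 16u
#define GPT_UNIQUE_GUID_OFFSET 16u /* follows the partition type GUID */

/* Hypothetical helper: copy the raw UniquePartitionGUID out of an entry. */
static void
gpt_entry_unique_guid(const uint8_t *entry, uint8_t out[GPT_GUID_LEN])
{
	assert(entry != NULL && out != NULL);
	memcpy(out, entry + GPT_UNIQUE_GUID_OFFSET, GPT_GUID_LEN);
}
```

Because this GUID is stored on disk, a UUID derived from it stays the same across restarts, unlike the random UUIDs used before.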
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Id8e8aa9e7903d31f199e8cfdb487e45ce1524d7b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17351
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
This introduces spdk_bdev_part_construct_ext(), which takes an options
structure as an optional parameter. The options structure has one
option: uuid.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I5e9fdc8e88b78b303e60a0e721d7a74854ac37a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17835
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The IOs will now be retried after ENOMEM is received when doing memory
domain pull or appending an accel copy. The retries are performed using
the mechanism that's already in place for IOs completed with
SPDK_BDEV_IO_STATUS_NOMEM.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I284643bf9971338094e14617974f7511f745f24e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17761
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
After we create the GPT, we change the partition type
GUID to the associated SPDK value. The current
comment just says "change the GUID" which is
ambiguous because there are multiple GUIDs associated
with each partition.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id821c5c5bbd7a72d84d5ddf4d91d633307f2235b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17855
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>