Spdk/module
Mike Gerdts b240c2b103 lvol: lvol destruction race leads to null deref
As an lvolstore is being destroyed, _vbdev_lvs_remove() starts an
iteration through the lvols to delete each one, ultimately leading to
the destruction of the lvolstore with a call to lvs_free(). The callback
passed to vbdev_lvs_destruct() is always called asynchronously via
spdk_io_device_unregister() in bs_free().

When the lvolstore resides on bdevs that perform async IO (i.e. most
bdevs other than malloc), this gives a small window when the lvol bdev
is not registered but a lookup with spdk_lvol_get_by_uuid() or
spdk_lvol_get_by_names() will succeed. If rpc_bdev_lvol_delete() runs
during this window, it can get a reference to an lvol that has just been
unregistered and lvol->blob may be NULL. This lvol is then passed to
vbdev_lvol_destroy().

Before this fix, vbdev_lvol_destroy() would call:

   spdk_blob_is_degraded(lvol->blob);

This would then lead to a NULL pointer dereference, as
spdk_blob_is_degraded() assumes a valid blob is passed. While a NULL
check would avoid this particular problem, a NULL blob is not
necessarily caused by the condition described above. It would be better to
flag the lvstore's destruction before returning from
vbdev_lvs_destruct() and use that flag to prevent operations on the
lvolstore that is being deleted. Such a flag already exists in the form
of 'lvs_bdev->req != NULL', but that is set too late to close this race.

This fix introduces lvs_bdev->removal_in_progress which is set prior to
returning from vbdev_lvs_unload() and vbdev_lvs_destruct(). It is
checked by vbdev_lvol_destroy() before trying to destroy the lvol.  Now,
any lvol destruction initiated by something other than
vbdev_lvs_destruct() while an lvolstore unload or destroy is in progress
will fail with -ENODEV.
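
The guard described above can be sketched as follows. This is a minimal,
simplified model and not the actual SPDK code: the sketch_* structures and
functions are hypothetical stand-ins for lvs_bdev, the lvol, and the
vbdev_lvs_unload()/vbdev_lvs_destruct()/vbdev_lvol_destroy() paths. Only
the removal_in_progress flag and the -ENODEV result come from the
description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the lvolstore bdev wrapper. */
struct sketch_lvs_bdev {
	bool removal_in_progress;
};

/* Hypothetical, simplified stand-in for an lvol. */
struct sketch_lvol {
	struct sketch_lvs_bdev *lvs_bdev;
	void *blob; /* may already be NULL while the lvolstore is torn down */
};

/*
 * Models the start of an lvolstore unload or destroy: the flag is set
 * synchronously, before the function returns and before any
 * asynchronous teardown work completes.
 */
static void
sketch_lvs_begin_removal(struct sketch_lvs_bdev *lvs_bdev)
{
	assert(lvs_bdev != NULL);
	lvs_bdev->removal_in_progress = true;
	/* The asynchronous teardown would be kicked off here. */
}

/*
 * Models an externally initiated lvol destroy (for example from an RPC):
 * refuse to proceed once removal of the owning lvolstore has begun, so
 * lvol->blob is never touched during the race window.
 */
static int
sketch_lvol_destroy(struct sketch_lvol *lvol)
{
	assert(lvol != NULL && lvol->lvs_bdev != NULL);

	if (lvol->lvs_bdev->removal_in_progress) {
		return -ENODEV;
	}

	/* Only past this point would it be safe to inspect lvol->blob. */
	return 0;
}
```

In this model, calling sketch_lvol_destroy() after
sketch_lvs_begin_removal() returns -ENODEV regardless of whether the blob
pointer is still set, which is why a flag closes the window more reliably
than a NULL check on the blob alone.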

Fixes issue: #2998

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I4d861879097703b0d8e3180e6de7ad6898f340fd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17891
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-05-09 17:58:11 +08:00
..
accel       accel/dpdk_cryptodev: Fix use of uninitialized variable  2023-05-09 17:58:11 +08:00
bdev        lvol: lvol destruction race leads to null deref  2023-05-09 17:58:11 +08:00
blob        blob_bdev: defer free until all channels destroyed  2023-03-28 03:57:35 +00:00
blobfs      module/blobfs: Use error_response() rather than bool_response(false) for JSON RPCs  2023-01-31 21:40:09 +00:00
env_dpdk    so_ver: increase all major versions  2023-01-24 08:37:21 +00:00
event       nvmf: add spdk_nvmf_request_copy_*_buf()  2023-02-13 13:50:51 +00:00
scheduler   module/scheduler: Silence warning about rte_power under clean target  2023-05-09 17:58:11 +08:00
sock        sock/posix: Fix sendmsg_idx rollover for zcopy  2023-05-09 17:58:11 +08:00
vfu_device  so_ver: increase all major versions  2023-01-24 08:37:21 +00:00
Makefile    update Intel copyright notices  2022-11-10 08:28:53 +00:00