ivampiresp/Spdk - Spdk - Leaflow Developers

Author	SHA1	Message	Date
Ben Walker	fe2d64cecf	event: Nest the thread's spdk_fd_groups in interrupt mode Instead of adding the spdk_fd_group's fd to the reactor fd group, use the new nesting functionality. Change-Id: I00727e836da6ba191d5bf778613f31956c9baacf Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15477 Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Community-CI: Mellanox Build Bot	2023-04-04 17:38:22 +00:00
Ben Walker	d83e476240	util: Add spdk_fd_group_nest() and spdk_fd_group_unnest These provide a way to nest one fd_group into another in a more efficient manner than just adding the fd_group's fd to the parent. It also keeps track of which events belong to which group, so the unnest operation can be implemented. Change-Id: I63d63365f1160cce8b4b6388a0ea2003ef424b9e Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15473 Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2023-04-04 17:38:22 +00:00
Jim Harris	7d44b36e0d	nvme: only prefetch req's stailq when req != NULL It is fine to prefetch an invalid address, but ASAN doesn't like it. So move the prefetch slightly to make ASAN happy. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ib51ab8890e5fe91d30057f65e1399cfc9dd1dd49 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17432 Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-31 17:41:35 +00:00
Szulik, Maciej	7858e18b05	lib/nvme: restore spdk_nvme_ctrlr_get_registers This function was intended to be deleted as unused, however it can be useful for debug and test capabilities. Its declaration was left in header file, so just adding implementation for PCIE and VFIO USER transports. Signed-off-by: Szulik, Maciej <maciej.szulik@intel.com> Change-Id: I670acb53c2f88a844525a0ecea27143b055f117b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17400 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2023-03-31 17:41:35 +00:00
Szulik, Maciej	414ff9bc23	nvmf: make async event and error related functions public This patch makes functions related to Asynchronous Event and error handling public, so that they can be used in custom nvmf transport compiled out of SPDK tree. Signed-off-by: Szulik, Maciej <maciej.szulik@intel.com> Change-Id: I253bb7cfc98ea3012c179a709a3337c36b36cb0e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17237 Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-31 17:41:35 +00:00
Marcin Spiewak	c6591af05b	lib/thread: fixed potential dereferencing of NULL pointer Fixed issue indicated by Klocwork scan. 'name', which potentially might be NULL, is passed as function parameter. Now the function name will not point to NULL, and will be the same in interrupt structure and in event handler. Change-Id: I5588821139d11288a96f5041703d5b7b71890ad6 Signed-off-by: Marcin Spiewak <marcin.spiewak@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17356 Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-30 07:01:26 +00:00
Jim Harris	3b138377e2	nvmf/tcp, nvmf/rdma: default to dynamic buf_cache_size The nvmf generic transport code creates a mempool of I/O buffers, as well as its own per-thread cache of those buffers. The mempool was being created with a non-zero mempool cache, effectively duplicating work - we had a cache in the mempool and then another in the transport layer. So patch `019cbb9` removed the mempool cache, but the tcp transport was significantly affected by it. It uses a default 32 buffers per thread cache which is very small, it was actually mostly relying on the mempool cache (which was 512). Performance regression tests caught this problem, and Karol verified that specifying a higher buf_cache_size fixed the problem. So change both the tcp and rdma transports to specify UINT32_MAX as the default buf_cache_size. If the user does not override this when creating the transport, it will be dynamically sized based on the size of the buffer pool and the number of poll groups. Fixes: `019cbb9` ("nvmf: disable data buf mempool cache") Fixes issue #2934. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Idd43e99312d59940ca68402299e264cc187bfccd Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17203 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot	2023-03-28 20:17:21 +00:00
Jim Harris	3092c61d26	nvmf: enable dynamic buf_cache_size calculation Allow transports to specify a default UINT32_MAX as the buf_cache_size. If user does not override this when creating the transport, calculate the buf_cache_size dynamically using the number of poll groups and the size of the buffer pool (num_shared_buffers). We will allocate 75% of the buffers for the caches, meaning the buf_cache_size will be calculated as: (num_shared_buffers * 3 / 4) / num_poll_groups Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I97768aea701060bbe0ff1925e5322229fa8d051c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17334 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2023-03-28 20:17:21 +00:00
Jim Harris	280a3abc9c	nvmf: return early in nvmf_transport_poll_group_create When buf_cache_size is 0, just return early. This allows us to un-indent a large section of code. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I167da677fdcd0504c6f2bfdb8b1a818155642f66 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17333 Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-28 20:17:21 +00:00
Jim Harris	2597ebbede	nvmf: point poll_groups back to their spdk_nvmf_tgt Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ie7eaeb3aa65f0a8f8f9e811d025045fff7f77724 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17332 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-28 20:17:21 +00:00
Jim Harris	f9424ae73d	nvmf: track num_poll_groups in spdk_nvmf_tgt This will be useful in upcoming patch, where we use the number of poll groups to dynamically pick the buf_cache_size for each transport poll group. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Id166098244287c56f12cdd88ba27a17fa34a4348 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17331 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-28 20:17:21 +00:00
Changpeng Liu	e641e8f3f6	lib/ublk: use page aligned data buffer The kernel `ublk_drv` driver does memcpy to this data buffer in units of page size; for a simple 4KiB I/O, it may call memcpy twice if the data buffer isn't page aligned. Moreover, SPDK may also have double buffers in this case, so here we use a page aligned data buffer at initialization. Change-Id: Ica86a9702283327a2bb491e38990b8c00bc77f57 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17283 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-28 10:20:50 +00:00
Mike Gerdts	26037aa3b0	accel: allow libisal and libisal_crypto If CONFIG_ISAL and CONFIG_ISAL_CRYPTO are both defined, the build was only including the LOCAL_SYS_LIBS for libisal_crypto. This fixes that bug using the same technique used in other Makefiles. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I4c0869d60742cd6bdb0812d67db3abbfa7e69122 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17345 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-28 07:07:06 +00:00
Marcin Spiewak	a7cd3a2d57	lib/bdev: fixed potential dereferencing of NULL pointer Fixed issue indicated by Klocwork scan. 'desc->bdev' is assigned to 'bdev' ptr, before verification that 'desc' is not NULL Change-Id: I36e63c27b4d3220e85524133a0ec0e3521770875 Signed-off-by: Marcin Spiewak <marcin.spiewak@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17350 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-28 07:03:56 +00:00
Mike Gerdts	8e612918c2	lvol: allocate lvs before loading it This refactors the code paths that call lvs_load() to allocate the spdk_lvol_store structure before calling lvs_load(). Previously this allocation was done in lvs_load_cb(). This is being done because a later patch requires a pointer to the structure to be passed to lvs_load via the spdk_bs_opts structure. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I2e942d1f7525fa5a16cd34b1b4b3a0a821e13006 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17220 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-28 03:57:35 +00:00
Mike Gerdts	9ea88fcb72	blob: refactor parent_id and allocate_all checks The blob's parent_id and allocate_all are examined and/or modified in two places in bs_inflate_blob_open_cpl(). This transforms the two if statements scattered around the function into a switch statement to make it easier to understand how these two values are related. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I2cff2d07a0089b52678035b2ece60db6a5f67a8e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17178 Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-28 03:57:35 +00:00
Mike Gerdts	ecee22738b	lvol: introduce lvs_alloc() There are multiple locations where a struct lvol_store is allocated. This invites inconsistency in initialization, which will become more of a problem as esnap clones have additional initialization. Now all struct lvol_store allocations should be done with lvs_alloc(). Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I07a2f274475375072f80c25ed67cb1fb802cc4e1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16231 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-28 03:57:35 +00:00
Mike Gerdts	a30ec964f9	lvol: introduce lvol_alloc() There are several places where new lvols are created and each reproduces much of the same code. Esnap clones will add yet another in lvol.c and more in unit tests. This introduces lvol_alloc() to minimize the chance of unintended skew over time. A side effect of this is that snapshots and clones now inherit clear method from their parent. Previously they would fall back to the default. The old behavior seems to be accidental, hence the change. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: Ibf6f79c567e92354ea73e6589c736b1b946731a0 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14976 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot	2023-03-28 03:57:35 +00:00
Mike Gerdts	0e41ee0b83	lvol: remove unused lvol->thin_provision The thin_provision member of struct spdk_lvol is set but never used. When needed, an lvol's thin provision state is obtained by looking at the lvol's blob. This removes the unused thin_provision member. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I5a2048b5334a26772a25a0bd238e42d3aeb63b49 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17173 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot	2023-03-28 03:57:35 +00:00
Mike Gerdts	aaebaece6d	blob: hotplug new back_bs_dev When an esnap clone blob's external snapshot arrives after the blob is opened, it can now be hot-added to the blob. Presumably the new device replaces a place-holder device that did not really attempt IO. Change-Id: I622feb84efa66628debf44f7e7cb88b6a012db6d Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16232 Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-28 03:57:35 +00:00
Mike Gerdts	55199ed166	blob: abort IO when replacing esnap channel This adds the ability to abort IOs as esnap bs_dev channels are being destroyed. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: Ia63d4cbef5cd4c84dc8d5e2e9e407bacd961385f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16423 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2023-03-28 03:57:35 +00:00
Mike Gerdts	ba91ffbae1	blob: defer unload until channel destroy done As the blobstore is being unloaded, async esnap channel destructions may be in flight. In such a case, spdk_bs_unload() needs to defer the unload of the blobstore until channel destructions are complete. The following commands lead to the illustrated states. bdev_malloc_create -b malloc0 bdev_lvol_clone_bdev lvs1 malloc0 eclone [malloc0 <-- eclone] bdev_lvol_snapshot lvs1/eclone snap [malloc0 <-- snap <-- eclone] bdev_lvol_clone lvs1/snap eclone [malloc0 <-- snap <-- eclone; snap <-- clone] As the blobstore is preparing to be unloaded spdk_blob_unload(snap) is called once for eclone, once for clone, and once for snap. The last of these calls happens just before spdk_bs_unload() is called. spdk_blob_unload() needs to destroy channels on each thread. During this thread iteration, spdk_bs_unload() starts. The work performed in the iteration maintains a reference to the blob, and as such spdk_bs_unload() cannot do its work until the iteration is complete. Change-Id: Id9b92ad73341fb3437441146110055c84ee6dc52 Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14975 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-28 03:57:35 +00:00
Mike Gerdts	652232ae16	blob: esnap clone inflate and decouple This adds support for inflate and decouple for esnap clones. Since there are no immediate consumers that will provide back_bs_dev->is_zeroes() that can return true, a shortcut is taken in that inflate and decouple of esnap clones are the same. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I4d2e6565126991acd650f073ce876466334e986d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11574 Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-28 03:57:35 +00:00
Mike Gerdts	94c43313ab	blob: snapshots of esnap clones An esnap clone needs special handling as snapshots are created and removed. In particular: the following must exist on the blob that directly references the external snapshot and must be removed from others: - Ensure SPDK_BLOB_EXTERNAL_SNAPSHOT invalid flag exists only on the esnap clone. - Ensure BLOB_EXTERNAL_SNAPSHOT_ID internal xattr exists only on the esnap clone. - Clean up any esnap IO channels on a blob that is no longer an esnap clone due to snapshot creation or removal. See the diagrams and description in blob_esnap_clone_snapshot() in blob_ut.c for details. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: Ie4125d64d5bac9cfa7d6c7cc9a543d72a169f6ee Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11573 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2023-03-28 03:57:35 +00:00
Mike Gerdts	b47cee6c96	blob: add IO channels for esnap clones The channel passed to blob IO operations is useful for tracking operations within the blobstore and the bs_dev that the blobstore resides on. Esnap clone blobs perform reads from other bs_devs and require per-thread, per-bs_dev channels. This commit augments struct spdk_bs_channel with a tree containing channels for the external snapshot bs_devs. The tree is indexed by blob ID. These "esnap channels" are lazily created on the first read from an external snapshot via each bs_channel. They are removed as bs_channels are destroyed and blobs are closed. Change-Id: I97aebe5a2f3584bfbf3a10ede8f3128448d30d6e Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14974 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-28 03:57:35 +00:00
Rui Chang	d0516312ff	nvmf: add copy command support in get log page Add copy command support in get log page and identify tool. Change-Id: I8771ffb193fc80ffc12f068993005e5702f41a0d Signed-off-by: Rui Chang <rui.chang@arm.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17162 Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-27 11:25:35 +00:00
Rui Chang	4274fe55c9	nvmf/vfio-user: add copy support in vfio-user Fix req length issue in supporting copy command in vfio-user. Signed-off-by: Rui Chang <rui.chang@arm.com> Change-Id: If4ec325777e1a1f00d15edb2fea4dc85016b3b95 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17279 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: John Levon <levon@movementarian.org> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-24 07:26:14 +00:00
Michal Berger	ede1caf025	lib/vhost: Rename rte_vhost_slave_config_change() As per https://github.com/DPDK/dpdk/commit/71998eb61ff Change-Id: Ie4e5a38976145e1037ef45593b4dc4265091482d Signed-off-by: Michal Berger <michal.berger@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17322 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Pawel Piatek <pawelx.piatek@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-24 07:23:19 +00:00
Peng Lian	b13ee3005d	nvmf: clean sgroup->queued in _nvmf_qpair_destroy when ctrlr is NULL Consider the following sequence: 1. a fabric connect request A arrives while the subsystem is paused due to adding/removing a namespace or other operations, so request A is put into sgroup->queued until the subsystem becomes active; 2. the subsystem stays paused until the connect times out and the related qpair is destroyed, but sgroup->queued is not cleaned because the qpair's ctrlr is NULL; 3. if a new request B arrives, it is likely to be allocated at the same memory address as the previous fabric command request. It is then put into sgroup->queued again, which already contains exactly the same pointer as request B. This leads to a dangling pointer problem and causes an infinite loop when traversing sgroup->queued! So this patch avoids the dangling pointer problem by checking for and removing all sgroups' queued requests whose qpair is the one being destroyed in _nvmf_qpair_destroy when ctrlr is NULL. This problem is already described in issue #2133. Signed-off-by: Peng Lian <peng.lian@smartx.com> Change-Id: I909d673b5050f21fa193914cc4ffe6634232fa7d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17147 Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2023-03-22 10:11:30 +00:00
Mike Gerdts	c64ce716e4	blob: add spdk_blob_is_esnap_clone Add an API to easily determine if a blob is an esnap clone, similar to what already exists for snapshot, clone, and thin_provisioned. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: Ie07cd09b30513893e82f1c85e94a24a93c79d71e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16862 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot	2023-03-22 09:39:29 +00:00
Mike Gerdts	2948183f2b	blob: prepare sequences for esnap channels When a sequence is used to perform IO on an esnap clone, different channels will be needed for the blobstore device and the esnap device. No special esnap handling is required when a sequence is used to perform IO directly on the blobstore device. This commit splits bs_sequence_start() into bs_sequence_start_bs() and bs_sequence_start_blob() to handle these two scenarios. A later commit introduces special handling of esnap clone blobs. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I3a6f46640cdb7fdc380bf557736638f1b39f05e3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17172 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot	2023-03-22 09:39:29 +00:00
Mike Gerdts	31c2852bb8	blob: prepare sets for esnap channels For the various forms of read_bs_dev() and readv_bs_dev() to perform reads from esnap devices, the spdk_bs_request_set used for the IO needs to keep track of the back_bs_dev IO channel as well as the blobstore's IO channel. This commit has no change in functionality: it is preparation for a change in a later commit. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I8edd9c4bf29bc074194331b42c5ef9d27590ce88 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14973 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-22 09:39:29 +00:00
Mike Gerdts	34d31cdc20	blob: refactor destruction of back_bs_dev External snapshots have a slightly more complicated cleanup of back_bs_dev. This moves all calls to back_bs_dev->destroy() into a function so that this more complicated cleanup can have a single implementation. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I78460aa3877481788118e2b0b76931dcf5c56338 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14972 Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-22 09:39:29 +00:00
Mike Gerdts	4d5ee263b1	blob: pass blob context to esnap_bs_dev_create When consumers open a blob with spdk_bs_open_blob_ext(), they can set esnap_ctx in struct spdk_blob_open_opts to have that context passed to bs->external_bs_dev_create(). Change-Id: I0c1a9cec0e5aed5ef2a7143103e822cbe400aabb Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14971 Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-22 09:39:29 +00:00
Amir Haroush	6984eff3c5	ocf: fix env_ticks_to_{msec,usec,nsec} precision & accuracy - Fix precision: when one converts to seconds and then multiplies, precision errors can occur. For example, 77ms goes to 0 when converted to seconds, and multiplying that 0 by 1000 returns 0 instead of 77ms. - Fix nsec/usec mismatch: nsec was multiplied by 1000*1000 while usec was multiplied by 1000*1000*1000; it should be the opposite. Anyway, the implementation has changed. - Implementation description: * env_ticks_to_msec: j / (tick_hz / 1000). This is exactly the same as (j * 1000) / tick_hz (eq #2), but that implementation (eq #2) can only handle 54b in j (before overflowing) because of the multiplication by 1000 (10b). With the correct implementation we use all 64b in j. We assume that tick_hz is perfectly divisible by 1000, so we are ok. * env_ticks_to_usec: j / (tick_hz / (1000 * 1000)). Same as in the msec case, we use all 64b in j. Here we assume that tick_hz is perfectly divisible by (1000 * 1000), i.e. we assume that the CPU frequency is some multiple of 1MHz. * env_ticks_to_nsec: (j * 1000) / (tick_hz / (1000 * 1000)). In this case we can't assume that tick_hz is divisible by 10^9, because there are many CPUs with 2.8GHz or 3.3GHz, for example. So we multiply j by 1000, which means that we can only correctly handle j up to 54b (64b - 10b, 10b for the *1000 operation). Signed-off-by: Amir Haroush <amir.haroush@huawei.com> Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com> Change-Id: Ia8ea7f88b718df206fa0731e3f39f419ee922aa7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17078 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-21 19:08:59 +00:00
Amir Haroush	7c7267e931	ocf: fix env atomic64 functions arguments and return types atomic64 functions should operate with atomic64 and long types. Signed-off-by: Amir Haroush <amir.haroush@huawei.com> Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com> Change-Id: I2ea8f1cc06d6df0f7dd5b9d628839138b78bc412 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17077 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-21 19:08:59 +00:00
Amir Haroush	a0d24145bf	ocf: fix ENV_WARN to use SPDK_WARNLOG instead of SPDK_NOTICELOG Signed-off-by: Amir Haroush <amir.haroush@huawei.com> Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com> Change-Id: Ie5bbdb003573fdca6d56439f6a006749a29e9d6b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17076 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-21 19:08:59 +00:00
Jim Harris	7c3c0b6630	blob: track last md_page index correctly during resize During resize, we correctly determine if we have enough md_pages for new extent pages, before proceeding with actually allocating clusters and associated extent pages. But during actual allocation, we were incrementing the lfmd output parameter, which was incorrect. Technically we should increment it any time bs_allocate_cluster() allocated an md_page. But it's also fine to just not increment it at the call site at all - worst case, we just check that bit index again which isn't going to cause a performance problem. Also add a unit test that demonstrated the original problem, and works fine with this patch. Fixes issue #2932. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iba177a66e880fb99363944ee44d3d060a44a03a4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17150 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: 阿克曼 <lilei.777@bytedance.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Community-CI: Mellanox Build Bot	2023-03-15 09:27:17 +00:00
Jim Harris	037c8b01a1	blob: remove short-circuiting path for blob_freeze If blob_freeze_io() is called twice in a row, and the second time occurs before the for_each_channel for the first completes, the second caller will receive its callback too soon. Instead just simplify the whole process, always do the for_each_channel and don't try to optimize it at all. These are infrequent operations - correctness and simplicity are in order. A few additional changes: 1) Make same changes for unfreeze path. 2) Add blob_verify_md_op() calls, just to be sure these are only called from md_thread. This was already checked in calling functions, but as these functions get called from new code paths (i.e. esnap clones) it can't hurt to add additional checks. 3) Add unit test that failed with original code, but passes with this patch. Fixes issue #2935. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ibefba554547ddf3e26aaabfa4288c8073d3c04ff Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17148 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Mike Gerdts <mgerdts@nvidia.com> Community-CI: Mellanox Build Bot	2023-03-15 09:27:17 +00:00
Konrad Sztyber	3fbe74fd82	accel: don't modify user iovs when allocating buffers It is quite common for a user to use the exact same iovec (in memory) to describe buffers for two different operations. If that iovec was describing accel buffer, accel would modify it replacing it with an actual buffer. This is broken if that iovec was used by some other task in a sequence, as accel wouldn't be aware that it has been changed too. To address this, accel will use a new iovec from the aux_iovs array. It means that accel buffers always must be passed using a single iovec. Theoretically, users could chunk that buffer into several iovecs, but spdk_accel_get_buf() always returns a single buffer, so, in practice, this should never happen, and therefore is unsupported. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I25271bc032987dd6028fb7b3adde061657759b4b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17039 Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	d69e6f64b3	bdev: prevent aborting reqs doing push/pull or accel seq exec Requests that have their data pushed/pulled from a memory domain or have an accel sequence executed aren't handled by a bdev module, so we shouldn't submit an abort request. Those operations cannot be aborted either, so the abort request is failed in this case. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Icd185c4a2951a555d321cd037de0af1ab157f37a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17020 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	250566568a	bdev: delay reset until accel/memory domain ops completion These operations are handled internally by the bdev layer, so it should first wait until they're completed before issuing reset to a bdev module. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I74f0d42dcb9a289aa7c3115ca309cb92870548e2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17019 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	000b9697e7	bdev: track IOs doing memory domain pull/push Similarly to requests executed by accel, we need to track bdev_ios that have their data pushed/pulled. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ie6b0d2c058e9f13916a065acf8e05d1484eae535 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16978 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	2326924683	bdev: track IOs executing accel sequence It will make it possible to check if a request is being processed by accel when doing resets/aborts. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ice07211df316e1eee9640e750ff8e176c8a3ca6f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16977 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2023-03-13 21:02:27 +00:00
Konrad Sztyber	04c222f2db	bdev: accel sequence support for read requests This patch enables passing accel sequence for read requests. The handling is pretty similar to writes, but the sequence is executed after a request is completed by a bdev module. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I79fd7d4873265c81a9f4a66362634a1c4901d0c9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16975 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	22c0e97884	bdev: accel sequence support for write requests It is now possible to submit a write request with a sequence of accel operations that need to be executed before actually writing the data. Such requests will be directly passed to a bdev module (so that it can append subsequent operations to an accel sequence) if that bdev supports accel sequences and the request doesn't need to be split. If either of these conditions are not met, bdev layer will execute all the accumulated accel operations before passing the request to a bdev module. The reason for not submitting split IOs with an accel sequence is that we would need to split that accel sequence too. Currently, there's no such functionality in accel, so we treat this case in the same way as if the underlying bdev module didn't support accel sequences (it's executed before bdev_io is split). Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I66c53b3a1a87a35ea2687292206c899f80aaed4a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16974 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	54a935a669	bdev: cache whether IO needs to be split bdev_io_should_split() adds some non-zero overhead, so checking it multiple times in an IO path is inefficient. So, to avoid that, call bdev_io_should_split() once during IO initialization and cache the result in bdev_io. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I1da6514d409f8a4e4bbb14722dd53b2c88988cac Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17058 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	f555961ff1	bdev: move bdev.submit_request() to a function Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I64556e1ae3241fc69fa68fec7568c50db9152d7f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16973 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	80b22cf314	bdev: allocate accel_channel for each bdev_channel This channel will be used to execute accel operation sequences. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ied4bb57d14a50a923908ffb13ef4ba34ca65175c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16972 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2023-03-13 21:02:27 +00:00
Konrad Sztyber	1be4e82d15	bdev: allow bdevs to report accel_sequnce support Modules can now report that they support accel chaining for specific operations through the accel_sequnce_supported() callback. The support is reported per IO type. This allows modules to support accel sequences for some operations, while relying on the bdev layer to handle them for other IO types. Only bdevs without separate metadata buffers are allowed to support this new mode. That's because metadata in separate buffer is expected to use the same memory domain as data buffers. With an accel sequence, those data memory domains can change, while metadata's memory domain always stays the same. To support bdevs with separate metadata buffers, we'd need to add separate pointers for metadata's memory domain. For now, simply disallow registering bdevs with separate metadata supporting accel sequences. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I0c49cc00096837d70681a69b2633c2cb3dfd4e39 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16971 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2023-03-13 21:02:27 +00:00

10193 Commits