ivampiresp/Spdk - Spdk - Leaflow Developers

Author	SHA1	Message	Date
Mike Gerdts	ba91ffbae1	blob: defer unload until channel destroy done As the blobstore is being unloaded, async esnap channel destructions may be in flight. In such a case, spdk_bs_unload() needs to defer the unload of the blobstore until channel destructions are complete. The following commands lead to the illustrated states. bdev_malloc_create -b malloc0 bdev_lvol_clone_bdev lvs1 malloc0 eclone .---------. .--------. \| malloc0 \|<--\| eclone \| `---------' `--------' bdev_lvol_snapshot lvs1/eclone snap .---------. .------. .--------. \| malloc0 \|<--\| snap \|<--\| eclone \| `---------' `------' `--------' bdev_lvol_clone lvs1/snap eclone .--------. ,-\| eclone \| .---------. .------.<-' `--------' \| malloc0 \|<--\| snap \| `---------' `------'<-. .-------. `-\| clone \| `-------' As the blobstore is preparing to be unloaded spdk_blob_unload(snap) is called once for eclone, once for clone, and once for snap. The last of these calls happens just before spdk_bs_unload() is called. spdk_blob_unload() needs to destroy channels on each thread. During this thread iteration, spdk_bs_unload() starts. The work performed in the iteration maintains a reference to the blob, and as such spdk_bs_unload() cannot do its work until the iteration is complete. Change-Id: Id9b92ad73341fb3437441146110055c84ee6dc52 Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14975 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-28 03:57:35 +00:00
Mike Gerdts	b47cee6c96	blob: add IO channels for esnap clones The channel passed to blob IO operations is useful for tracking operations within the blobstore and the bs_dev that the blobstore resides on. Esnap clone blobs perform reads from other bs_devs and require per-thread, per-bs_dev channels. This commit augments struct spdk_bs_channel with a tree containing channels for the external snapshot bs_devs. The tree is indexed by blob ID. These "esnap channels" are lazily created on the first read from an external snapshot via each bs_channel. They are removed as bs_channels are destroyed and blobs are closed. Change-Id: I97aebe5a2f3584bfbf3a10ede8f3128448d30d6e Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14974 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-28 03:57:35 +00:00
Mike Gerdts	a4a73fec9c	blob: pass bs context with esnap_bs_dev_create When a blobstore consumer creates or loads a blobstore, it should be able to set a per-blobstore context pointer that will be passed back to the consumer via bs->esnap_bs_dev_create(). Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I59c0ebe21eaf65c3d79a4ac3469715283f56313a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14970 Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2023-03-13 07:57:24 +00:00
Mike Gerdts	ce67e0c787	blob: clones of external snapshots This is the beginning of support for external snapshots. An external snapshot is a read-only blobstore device (struct spdk_bs_dev) that can be used as a blob's back device. Normally a blob will have no back device (a normal blob), a zeroes back device (a thin provisioned blob), or a blob back device (a clone blob). When a blob has an external snapshot ("esnap") as its back device, it is called an esnap clone. With this patch, esnap clones can be created but they are not yet useful. Subsequent patches in the series will plumb the IO path, enable various features, and allow lvol bdevs to be esnap clones. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I29206b628a2b03b6386a88532565e228df988e0e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14969 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2023-03-03 11:25:35 +00:00
Mike Gerdts	316cf9ef99	blobstore: convert used_lock to spinlock Convert bs->used_lock to a spinlock. This is being done to help with the debugging and fixing of a race that has led to a failed assertion in bs_claim_md_page. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I11b80096de022f79a217c65d787ee57ca54240f9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15952 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Mellanox Build Bot	2022-12-20 09:19:09 +00:00
Mike Gerdts	2a608d0241	blobstore: rename used_clusters_mutex to used_lock The bs->used_clusters_mutex protects used_md_pages, used_clusters, and num_free_clusters. A more generic name is appropriate. The next patch in this series will convert it from a mutex to a spinlock and having "mutex" or "spin" in the name is of little help to maintainers, so a more generic name is used. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I5ce7b85b84fdec2a0c5d2ac959e0109e1d80c7f5 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15981 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Mellanox Build Bot	2022-12-20 09:19:09 +00:00
paul luse	a6dbe3721e	update Intel copyright notices per Intel policy to include file commit date using git cmd below. The policy does not apply to non-Intel (C) notices. git log --follow -C90% --format=%ad --date default <file> \| tail -1 and then pull just the 4 digit year from the result. Intel copyrights were not added to files where Intel either had no contribution or the contribution lacked substance (i.e. license header updates, formatting changes, etc). Contribution date used "--follow -C95%" to get the most accurate date. Note that several files in this patch didn't end the license/(c) block with a blank comment line so these were added as the vast majority of files do have this last blank line. Simply there for consistency. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: Id5b7ce4f658fe87132f14139ead58d6e285c04d4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15192 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot	2022-11-10 08:28:53 +00:00
Damiano Cipriani	ddf5a8da90	blobstore: Add function to get io_unit per cluster This function returns the number of io_units per cluster Signed-off-by: Damiano Cipriani <damiano.cipriani@suse.com> Change-Id: I8f33d24a63876a0a918830b9eeaa69a91ff21193 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14431 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Community-CI: Mellanox Build Bot	2022-09-15 08:23:56 +00:00
Jim Harris	ffa823557a	blob: add assert that cluster_sz > 0 Avoids divide-by-zero scanbuild warning on Fedora36. Fixes issue #2667. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ib2793c793725e8bb8ba25fb779ffc14334929da0 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14238 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-08-29 11:41:50 +00:00
Jim Harris	488570ebd4	Replace most BSD 3-clause license text with SPDX identifier. Many open source projects have moved to using SPDX identifiers to specify license information, reducing the amount of boilerplate code in every source file. This patch replaces the bulk of SPDK .c, .cpp and Makefiles with the BSD-3-Clause identifier. Almost all of these files share the exact same license text, and this patch only modifies the files that contain the most common license text. There can be slight variations because the third clause contains company names - most say "Intel Corporation", but there are instances for Nvidia, Samsung, Eideticom and even "the copyright holder". Used a bash script to automate replacement of the license text with SPDX identifier which is checked into scripts/spdx.sh. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iaa88ab5e92ea471691dc298cfe41ebfb5d169780 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12904 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: <qun.wan@intel.com>	2022-06-09 07:35:12 +00:00
Alexey Marchuk	1eca87c39c	blobstore: Preallocate md_page for new cluster When a new cluster is added to a thin provisioned blob, md_page is allocated to update extents in base dev. This memory allocation reduces performance; it can take 250usec - 1 msec on ARM platform. Since we may have only 1 outstanding cluster allocation per io_channel, we can preallocate md_page on each channel and remove dynamic memory allocation. With this change blob_write_extent_page() expects that md_page is given by the caller. Since this function is also used during snapshot deletion, this patch also updates this process. Now we allocate a single page and reuse it for each extent in the snapshot. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I815a4c8c69bd38d8eff4f45c088e5d05215b9e57 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12129 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-05-18 09:02:02 +00:00
Mike Gerdts	d0149da224	blob: remove unused inline functions bs_back_dev_lba_to_io_unit() and bs_num_pages_to_cluster_boundary() are unused inline functions. The last consumer (by the earlier _spdk_* name) was removed in commit `6609b776`. Change-Id: Ib1babfed8002fb44451b337aa0db66c15a6805d2 Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11561 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-17 17:08:37 +00:00
Mike Gerdts	9b72cda8b2	blob: fix spelling, white space, grammar Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I236c8a1c7f1ae4b0afd0d20175a1a2a647dba758 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11265 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-02-02 08:25:02 +00:00
Liu Xiaodong	7de351f1d7	blobstore: Use RB_TREE to do blob lookup If blobs held in a blobstore are opened a lot, lookup by RB_TREE will be much more efficient. Change-Id: I7075b95c597a958e7bb10890f803191309532021 Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10917 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2021-12-31 09:21:35 +00:00
Josh Soref	cc6920a476	spelling: lib Part of #2256 * accessible * activation * additional * allocate * association * attempt * barrier * broadcast * buffer * calculate * cases * channel * children * command * completion * connect * copied * currently * descriptor * destroy * detachment * doesn't * enqueueing * exceeds * execution * extended * fallback * finalize * first * handling * hugepages * ignored * implementation * in_capsule * initialization * initialized * initializing * initiator * negotiated * notification * occurred * original * outstanding * partially * partition * processing * receive * received * receiving * redirected * regions * request * requested * response * retrieved * running * satisfied * should * snapshot * status * succeeds * successfully * supplied * those * transferred * translate * triggering * unregister * unsupported * urlsafe * virtqueue * volumes * workaround * zeroed Change-Id: I569218754bd9d332ba517d4a61ad23d29eedfd0c Signed-off-by: Josh Soref <jsoref@gmail.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10405 Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-12-03 08:12:55 +00:00
Shuhei Matsumoto	320ab72fb5	util: Add macro SPDK_SIZEOF_MEMBER to get size of a member of a struct We find a few files to get the size of a member of a struct. How to do it is a little complex. So add a macro to do it will be helpful to read the current code and develop new features. lib/dif had used member_size() internally but Linux use sizeof_member() as the macro. Besides, SPDK have used upper case letters for similar macros, SPDK_CONTAINEROF() and SPDK_COUNTOF(). Hence spdk_member_size() may be good but propose SPDK_SIZEOF_MEMBER() as the macro. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I2179c845a3b75fb71aa039075cc4dfd30617b898 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8738 Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2021-07-15 07:16:22 +00:00
Tomasz Zawadzki	ceaa0c7fa9	lib/blob: complete multiple persists When blob persist starts, there can already be multiple of such requests pending. It is possible to complete a set of persists at once, if blob state after their execution would be the same. This is the case when persists are already pending when a particular persist request is started. This patch implements such mechanism by introducing persists_to_complete queue, containing entries that were previously queued up before starting the current persist request. If there are any entries in this queue, further requests are put into pending_persists. When first request from persists_to_complete is persisted, completions are issued for all requests on that queue at once. If at that point there are any new entries on pending_persists, all of them are put into persists_to_complete. Persist process is started again with the first request from that queue. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I10063e55d6f821b1863de016d3148da6a719a422 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7643 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-05-24 10:08:00 +00:00
Jim Harris	bd16f57472	blob: switch to bit_pool for tracking used_clusters We still need to be able to explicitly set specific bits in the cluster array during initialization and loading (especially recovery), so we use a bit_array during load, and then convert it to a bit_pool just before calling the user's completion callback. This gives a roughly 300% improvement over baseline on a benchmark which does continuous resize operations. The benefit is primarily from saving the lowest free bit rather than having to always start at bit 0. We may be able to further improve this by saving extents in the bit pool as well, although after this patch, the benchmark shows other hot spots different from the bit search. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Idb1d75d8348bc50560b1f42d49dbe4d79d024619 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3975 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-09-15 07:12:44 +00:00
Ben Walker	30ee8137cf	blob: Add a bitmask for quickly checking which blobs are open This can speed up the check for whether a blob is already open significantly. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: If32b0b1f168fcdb58e61df6281d7b7520725a195 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2781 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-07-07 07:30:58 +00:00
Seth Howell	b5d68d5934	lib/blob: remove _spdk prefix from all functions. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Idb33816e5b66266987845172c27c87667ac0a596 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2437 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-05-27 07:35:02 +00:00
Tomasz Zawadzki	b3348624e7	blob: add pages_per_cluster_shift Operation of locating right lba from cluster map is done on I/O path. Instead of division and multiplication, perform bit shift operation. Bit shift is only used when pages per cluster is power of 2. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ic3ed7ec0a82867a8a4bc6391785b9d40c800aacb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1724 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-04-24 15:45:21 +00:00
Seth Howell	ad7fdd12b1	lib/blob: remove spdk_ from non-public APIs We have an unofficial naming convention that the spdk_ namespace is reserved for public API functions only. This patch is attempting to bring the blob library into compliance with that naming convention. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Ie298e41d1b741dae01744826c208378ee60f9d0a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1700 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI	2020-04-15 22:10:08 +00:00
Tomasz Zawadzki	030be573f3	lib/blob: queue up blob persists when one already is ongoing It is possible for multiple blob persists to affect one another. Either by blob->state changes or blob mutable data. Safe way to prevent that is to queue up the persists. Next persist will be executed only after previous one completes. Fixes #1170 Fixes #960 Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Iaf95d9238510100b629050bc0d5c2c96c982a60c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/776 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-02-21 09:35:27 +00:00
Tomasz Zawadzki	29bd502046	lib/blob: add invalid flag for extent table With recent changes to extent on-disk metadata format, new format (Extent Pages) is not backwards compatible. Meanwhile old format (Extent RLE) is backwards compatible with older SPDK applications. Summing up: Blobstore created pre SPDK 20.01 can only use Extent RLE. Blobstore created starting with SPDK 20.01 can use both, Extent Pages and Extent RLE specified by use_extent_table opts. When use_extent_table is set to true, invalid flag for it is set. SPDK application pre 20.01, will not load such blob. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: If14ebd03f19eb581d71dcb46191e099336655189 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483220 Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-31 09:28:56 +00:00
Tomasz Zawadzki	42109157f4	lib/blob: add starting cluster index to extent page Size of a blob (thus size of clusters array in mutable data) is known from extent table descriptor. Extent pages were read sequentially in order they were placed in extent table. This meant that cluster array could have been filled up from beginning to end. Yet reading extent pages in any other order, would result in incorrect placement of clusters. This patch adds first cluster index that is contained within each extent page. This will allow to read/write multiple extent pages in parallel, since we will know where in clusters array to put the cluster idxs. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ib6b9332111cd93f990d057dc60624152907dd87f Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482701 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-28 09:15:23 +00:00
Tomasz Zawadzki	78257ab613	lib/blob: rename num_clusters_in_et to remaining_clusters_in_et This is a more adequate name, since this value is first read from Extent Table descriptor. Then decreased when iterating over entries in extent table and extent pages are read. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ib188c524b8488b38d4de063a9970dcfdf49c9acd Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482600 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	2bccb7c9b4	lib/blob: use use_extent_table instead of NULL from extent_page Right now output from _spdk_bs_cluster_to_extent_page() is used to determine whether the extent_table is used at all. If NULL pointer was returned this meant that extent table was not allocated, even if the code might suggest just checking if we overran the array. To make it more obvious, the _spdk_bs_cluster_to_extent_page() now only asserts the extent_table_id. blob->use_extent_table is now always used to determine the serialization path. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I9d2630645213539bae5cd1d72e5f9b878f53c2bc Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482599 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	e1ce55158a	lib/blob: require SPDK_EXTENTS_PER_EP to be power of 2 Force number of Extents to fit into Extent Page to be power of 2, in order to simplify calculations on cluster allocations. At this time SPDK_BS_PAGE_SIZE is 4k, which would result in SPDK_EXTENTS_PER_EP to be 512. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I7e09d92b00dfe5c12d7dd10ac0fc5a9a10d526ac Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472041 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	f4e58993f7	lib/blob: add EXTENT descriptor to blobs Similar to EXTENT_RLE, this descriptor holds LBA of clusters. Difference is that EXTENT is kept in separate md pages, and only single EXTENT will be updated on cluster allocation. This patch adds the EXTENT processing, which is not used until following patch. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ifbac23db7ca3e7c8c91cee01018f20071f0d5160 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470014 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	1b23560fcd	lib/blob: add _spdk_bs_cluster_to_extent_page() for easy conversion Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I3e49c398d9bdf9f4eacba65061cc7fe4b300fb56 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479963 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	59f7f3f736	lib/blob: change extent pages array size on blob resize With this patch extent pages array will change its size according to the size of the blob. Similar to clusters, only resizing up is done on blob resize. Shrinking is done on persisting the blob. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Id7f7c81efbd96af414fce9fc4045cbb476cc93a6 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479962 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	f60b4a7e28	lib/blob: add EXTENT_TABLE descriptor to blobs Added new descriptor SPDK_MD_DESCRIPTOR_TYPE_EXTENT_TABLE. Extent Table will hold md page offsets for new Extent Page descriptor. Entries in Extent Table are run-length encoded 0's as unallocated Extent Page descriptors. Additionally total number of clusters is persisted in each Extent Table descriptor. This is because there is no guarantee that last Extent Page of a blob will be allocated. Even if number of Extents per Extent Page is always the same, Extent Page can hold less Extents than that. This patch does not add more metadata on disk right now. Only added descriptor parsing/serialization and applicable fields to store it in run time. Following patches are going to implement TODO's added in this patch. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Iac5d8f00ddfc655c507bc26d69d7adf8495074e9 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466920 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	3dadb79e37	lib/blob: add EXTENT_RLE descriptor description Since further patches will be adding new descriptors that are related to cluster layout throughout the blobstore, add description for existing descriptor too. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I722eb633445685789d5185ed59dfc910f76b109f Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481724 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	c33840b7e6	lib/blob: add option to enable extent pages This is an additional option that can be passed when creating a blob. When opts->enable_extent_pages is set to false (current default), only EXTENT_RLE should be persisted on sync. During blob load, when EXTENT_RLE is present in md, blob->extent_rle_found is set to true. When opts->enable_extent_pages is set to true, only EXTENT_TABLE and EXTENT_PAGES should be persisted on sync. During blob load, when EXTENT_TABLE is present in md, blob->extent_table_found is set to true. It is possible to find neither EXTENT_* descriptor when loading a blob. This means that blob length is 0 and EXTENT_RLE was supposed to be used. Yet none were persisted due to lack of clusters. In such case blob->use_extent_table is set to true after finishing blob load. When parsing metadata ends, if extent_table_found is set - then support for extent_table is enabled. All other cases disable it. At this time path for Extent Pages is not implemented, so it should not be used. Later in the series, it will become the default path for serialization. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I2146da6130a0645e686ab02a3b5d2d86a7d35a1f Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479853 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
paul luse	ea69d6d6cc	lib/blob: store clear_method in per blob metadata Accept a clear method option on blob create by adding clear_method to the opts structure passed in to _spdk_bs_create_blob(). Store these 2 bits in md_ro_flags so that earlier versions without an understanding of these bits can not alter metadata. The new metadata values will be used later in the series. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: I5440645ca20b426778d13b2e544b65dc2b3b83c7 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472204 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-20 09:57:16 +00:00
Tomasz Zawadzki	4b8db27b2a	lib/blob: add _spdk_bs_md_page_to_lba() function internal to blobstore The _spdk_bs_page_to_lba() [without 'md'] is only for translating the pages on the blobstore to lba they are at. Those pages start at the beginning of the device and cover all of it. Thus simple math is enough to translate those. It is used to calculate lba_count for set of pages as well. Meanwhile there are 'md_pages' which are the same pages as for the above, but their count start at bs->md_start. Which is right after super_block and couple pages for bit masks. This patch creates new _spdk_bs_md_page_to_lba() that is more explicit in what page number is passed. Hopefully avoiding confusion when reading which page number refers to which 'type' of page. Exception to that is _spdk_bs_dump_read_md_page(), where blobstore is not actually loaded (md_start from super block is not copied to bs structure). Additionally providing assert to catch errors on debug builds. Making the check in _spdk_blob_load_cpl() for max_md_lba obsolete. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I66bbca55b5ca3d6794c462d50177e6037ddbefa6 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479017 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-14 17:13:15 +00:00
Tomasz Zawadzki	3e372f35c3	lib/blob: rename extents to extents_rle In future patches new type of extents will be added, for compatibility the current extent type will be still handled in the code. To signify the difference between those two types, current type is renamed to SPDK_MD_DESCRIPTOR_TYPE_EXTENT_RLE. Along with any variables throughout the code, to make it clear which ones are used. There are no functional changes in this patch. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I7186ccc452d200036188abf1dcea9660dcedee72 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468230 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-10-07 15:07:12 +00:00
Tomasz Zawadzki	69a8877e82	lib/blob: do not allow xattr to exceed maximum descriptor length Length of xattr descriptor is equal to length of xattr struct, xattr name and the len of stored value. There is no limit to how much can be stored in memory for xattr. On disk xattr size is limited to single page and within that to max descriptors that can fit in it. This size is known at compile time. Before this patch it was possible to add xattr exceeding what was possible to be written to disk. This caused issues when serializing the metadata during spdk_blob_sync_md() or spdk_blob_close(). Making those fail without specific info to the user and not actually writing such descriptor. Since maximum length of xattr descriptor is known at compile time, this patch compares against this value when setting the xattr. It will immediately report back to user with error, and will not store xattr in memory (thus not serialize it). This patch should not affect any backward compatibility for blobs. Too large xattrs weren't written to disk before, API for blobstore stays the same - only reporting ENOMEM when it should. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I6f4af4d079e47f084e20d7a4969d9a78ec1f8610 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460450 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2019-07-11 10:05:41 +00:00
Maciej Szwed	92cafd1586	blobstore: Remove blob on blobstore load when required In some cases user may want to flag blob for removal then do some operations (before removing it) and while it happens there might be power failure. In such cases we should remove this blob on next blobstore load. Example of such usage is delete snapshot functionality that will be introduced in upcoming patch. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I85f396b73762d2665ba8aec62528bb224acace74 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453835 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-05-24 23:09:56 +00:00
Maciej Szwed	8256cecf39	blobstore: rename resize_in_progress to locked_operation_in_progress This is a part of future changes to block blob operations that may cause race conditions between each other. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: Ia728d1fc207375ddcb3b70b5081ddcffa9f99027 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449789 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-04-08 21:39:08 +00:00
Maciej Szwed	adb39585ef	lvol: add option to change default data erase method Some users require to do write zeroes operation when erasing data on lvol. Currently the default method is unmap. This patch adds flag to spdk_rpc_construct_lvol_bdev call that changes default erase method. This is also a base implementation for possible future function for erasing data on lvol bdev. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I8964f170b13c2268fe3c18104f7956c32be96040 Reviewed-on: https://review.gerrithub.io/c/441527 Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2019-01-23 22:25:37 +00:00
Piotr Pelplinski	6609b776e4	blobstore: allow I/O operations to use io unit size smaller than page size. Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com> Change-Id: I994b5d46faffd34430cb39e66225929c4cba90ba Reviewed-on: https://review.gerrithub.io/414935 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-10-04 21:35:24 +00:00
Chen Wang	6fa48bbf62	lib: fix typos in the lib directory Change-Id: Idcb60b79d2902bb316facc6f60e0a81e5cf847ed Signed-off-by: Chen Wang <chenx.wang@intel.com> Reviewed-on: https://review.gerrithub.io/423372 Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2018-08-24 17:15:12 +00:00
Ziye Yang	ee9db7dac0	blobstore: adjust order in spdk_xattr It will save the space of spdk_xattr when put uint16_t after uint32_t Change-Id: Ie0712d8c3b16d90fc354847509fd87e1ffd93916 Signed-off-by: Ziye Yang <optimistyzy@gmail.com> Reviewed-on: https://review.gerrithub.io/419453 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2018-07-19 01:45:19 +00:00
Piotr Pelplinski	2c91e91907	blobstore: Save the original size of the disk. Save the original size of the disk to metadata when it is first created. On load verify that the disk did not change size. Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com> Change-Id: I535940ee188425ee3b394effd99653cc073d541e Reviewed-on: https://review.gerrithub.io/410896 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-06-28 17:58:31 +00:00
Jim Harris	f300130872	blob: always use uint64_t to represent page_idx 4KiB page size * UINT32_MAX = 16TiB - so we must use a uint64_t for any blobstores on backing devices of 16TiB or greater. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ief13cf06d413477dc8ab4f9fe0ff4c0631566c00 Reviewed-on: https://review.gerrithub.io/416448 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>	2018-06-21 22:46:30 +00:00
Daniel Verkamp	89426e9bb5	blob: change lba to uint64_t in serialize_extent Make sure we don't truncate the LBA when using it to serialize the cluster array into an extent list. We also need to add an explicit cast in _spdk_bs_cluster_to_lba to ensure the conversion doesn't get truncated. While here, do the same cast for _spdk_bs_cluster_to_page. Change-Id: If4e65ed86550e39dfa39826930dfafac158d519c Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com> Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-on: https://review.gerrithub.io/416231 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-06-21 22:46:30 +00:00
Piotr Pelplinski	69fa57cdf0	blobstore: freeze I/O during resize Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com> Change-Id: I23c34d4dcb542aa9ab3fa8cb734cf9cc0e0fc5da Reviewed-on: https://review.gerrithub.io/409144 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-06-08 19:32:25 +00:00
Piotr Pelplinski	8c45ed3822	blobstore: freeze I/O during snapshotting. Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com> Change-Id: I6182eb3a77d23db7088703492d71349e3a4b6460 Reviewed-on: https://review.gerrithub.io/399366 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-06-21 22:26:04 +00:00
Piotr Pelplinski	bc8f2cd90f	blobstore: Change behaviour of dirty bit The patch disables writing dirty bit during blobstore loading. Instead, dirty bit is written prior to the first metadata update. Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com> Change-Id: I7be81009a99f09048bf23749c8f6ef5e9f7b3751 Reviewed-on: https://review.gerrithub.io/410884 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>	2018-05-30 00:37:54 +00:00

90 Commits