Commit Graph

2592 Commits

Tomasz Zawadzki
07e31b028a ut/vhost: select vhost_backend for UT
As of right now the UT always used an empty structure of
struct spdk_vhost_dev_backend during the test, which meant
VHOST_BACKEND_BLK.
alloc_vdev() will require further changes to test both types
of backends. So for now change it to VHOST_BACKEND_SCSI,
since that backend currently does not touch any fields outside
of struct spdk_vhost_dev.
The next patch will do the same for the blk backend.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib5af7520bc8a21a7af03b810d4cc42726797a331
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12749
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-05-20 19:40:56 +00:00
Tomasz Zawadzki
91426dc600 ut/vhost: add vhost_blk.c and stubs
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5218d6ea95f6edb6f664bad75b17c68c0760d637
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10977
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2022-05-20 19:40:56 +00:00
Tomasz Zawadzki
69820927da ut/vhost: initialize vhost libraries
The vhost library was not initialized as part of the test;
it will become necessary later in the series.

Suite startup/cleanup have no matching CUnit test case,
so only assert() can be used rather than CU_ASSERT().

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ieaa3d2f6b6f1899105362181f285f585ff9724d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10945
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-05-20 19:40:56 +00:00
Alexey Marchuk
619b4dba8a lib/reduce: Check if user's buffer crosses huge page boundary
If the compress driver doesn't support SGL input or output,
then we need to copy the user's buffers into reduce's internal
buffers.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0c07243a5b668d0e0adcc153e5b573f59c26ab64
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12281
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-05-20 17:39:57 +00:00
Alexey Marchuk
b86e85f56f lib/reduce: Properly allocate comp/decomp buffers
The reduce library allocates one big chunk of memory and
then splits it between requests. The problem is that the
chunk of memory assigned to a request may cross a huge
page boundary, and if the compress driver doesn't support
SGL input or output, the operation fails.
To avoid this problem, align the buffer start on 2 MiB
and check whether each chunk of memory crosses a huge page
boundary. A sketch of that check follows this entry.

Fixes issue #2454

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie730b8ba928f27a43bde1222b6c18d29b797575a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12249
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-05-20 17:39:57 +00:00
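
The check described in the commit above amounts to verifying that a buffer of a
given length stays inside one 2 MiB huge page. Below is a minimal, self-contained
sketch of such a check; the constant and helper name are illustrative, not the
reduce library's internals.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define HUGE_PAGE_SIZE (2 * 1024 * 1024) /* 2 MiB */

    /* Return true if [buf, buf + len) stays within a single 2 MiB huge page. */
    static bool
    buf_fits_in_huge_page(const void *buf, size_t len)
    {
        uintptr_t start = (uintptr_t)buf;
        uintptr_t end;

        if (len == 0) {
            return true;
        }
        end = start + len - 1;
        return (start / HUGE_PAGE_SIZE) == (end / HUGE_PAGE_SIZE);
    }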
Jonas Pfefferle
192e64bcc5 bdev: spdk_bdev_ext_io_opts missing size check
ext_io_opts uses the size member to allow backwards
compatibility; however, we currently only check that it is
less than or equal to the current size of the opts struct and
that it is not 0. The size is only used when we copy opts
because of a split or push/pull.
This patch introduces size checks to allow safe access
to e.g. the metadata and memory domain pointers of the
user-provided opts pointer. The minimum size of the struct
passed is now the size of the initial version of
spdk_bdev_ext_io_opts. To avoid introducing additional
checks when opts are consumed by a bdev module, we
now always copy if the size is smaller than the
current opts struct size.
When introducing new members to opts, additional
checks might be needed if those are directly accessed
through the passed pointer or bdev_io->internal.ext_opts.
A simplified illustration of this size-based pattern follows this entry.

Change-Id: Ibd181a5840a3d5022018a9f61403df961ffd6e1d
Signed-off-by: Jonas Pfefferle <pepperjo@japf.ch>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12550
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-05-20 15:55:50 +00:00
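
The size-based compatibility idiom the commit above relies on can be shown with a
simplified, stand-alone example; the struct layout and the field-checking macro
below are stand-ins, not the exact SPDK definitions.

    #include <stddef.h>
    #include <string.h>

    /* Simplified stand-in for spdk_bdev_ext_io_opts. */
    struct ext_io_opts {
        size_t size;             /* sizeof() as seen by the caller's SPDK version */
        void *memory_domain;
        void *memory_domain_ctx;
        void *metadata;
    };

    /* True if the caller's struct is large enough to contain 'field'. */
    #define EXT_IO_OPTS_FIELD_OK(opts, field) \
        (offsetof(struct ext_io_opts, field) + sizeof((opts)->field) <= (opts)->size)

    static void
    copy_opts(struct ext_io_opts *dst, const struct ext_io_opts *src)
    {
        /* Assumes the caller already validated that src->size is at least the
         * size of the first struct version and at most sizeof(*dst). Copy only
         * what the caller provided, zero the rest, then fix up dst->size so
         * later accesses can use the full current layout. */
        memset(dst, 0, sizeof(*dst));
        memcpy(dst, src, src->size);
        dst->size = sizeof(*dst);
    }

Copying and then fixing up size is what lets a bdev module read any field without
re-checking the caller's struct version on every access.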
GangCao
7cfb12f437 Bdev/Lvol: check base bdev's md before examining
To fix issue #2514

Change-Id: If507382202e729f5934a354e2515a035ad5aeb0c
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12750
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-05-20 09:18:18 +00:00
Shuhei Matsumoto
e4584d937e bdev/nvme: Poll adminq more often during ctrlr disconnection
During ctrlr reconnection, spdk_nvme_ctrlr_reconnect_poll_async()
is executed by a non-timed poller.

We should poll the adminq more often during ctrlr disconnection too.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ib1f5b41015aed20deda8df6f2c837981ac233c04
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12615
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-05-20 09:17:28 +00:00
Shuhei Matsumoto
fcf52fbff5 bdev/nvme: Reversed orderings for reset between PCIe and NVMe-oF
As described in the NVMe specification, a controller level reset
includes the following actions:
- the controller stops processing any outstanding admin or I/O
  commands;
- all I/O SQs and CQs are deleted.

In a full controller reset sequence for a PCIe controller, if we do
a controller level reset first, we can abort outstanding commands
after the hardware has actually been stopped.

For an NVMe-oF controller, each I/O qpair is an independent network
connection and is disconnected safely. We do not want to change
the NVMe-oF controller behavior.

Fixes issue #2360

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: If05febac74705bfd3df5abd15064c1203126e027
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12447
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-05-20 09:17:28 +00:00
Shuhei Matsumoto
736b9da034 nvme: Do Controller Level Reset when disconnecting adminq for PCIe
As described in the previous patches, we need to delete all I/O
SQs/CQs before aborting trackers when disconnecting a controller.

The following patches reorder the operations. This patch changes
adminq disconnection to initiate a Controller Level Reset, and the
adminq completion path processes it if ctrlr->is_disconnecting is true.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I64f06bae2ce8a9127124029fd042db0028198e3c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12560
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-05-19 08:23:57 +00:00
Alexey Marchuk
1eca87c39c blobstore: Preallocate md_page for new cluster
When a new cluster is added to a thin provisioned blob,
an md_page is allocated to update extents in the base dev.
This memory allocation reduces performance; it can
take 250 usec to 1 msec on an ARM platform.

Since we may have only 1 outstanding cluster
allocation per io_channel, we can preallocate the md_page
on each channel and remove the dynamic memory allocation.

With this change blob_write_extent_page() expects
that the md_page is given by the caller. Since this function
is also used during snapshot deletion, this patch also
updates that process. Now we allocate a single page
and reuse it for each extent in the snapshot.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I815a4c8c69bd38d8eff4f45c088e5d05215b9e57
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12129
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-05-18 09:02:02 +00:00
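
The optimization above follows a common pattern: allocate a scratch page once per
io_channel rather than per request. Here is a generic sketch under that assumption;
the types and the 4 KiB page size are illustrative, and the blobstore itself uses
its own DMA-capable allocators rather than plain aligned_alloc().

    #include <stdlib.h>

    #define MD_PAGE_SIZE 4096

    /* Illustrative per-channel context: the scratch page is allocated once at
     * channel creation and reused by the single outstanding cluster allocation,
     * removing a malloc from the I/O path. */
    struct channel_ctx {
        void *cluster_md_page;
    };

    static int
    channel_create(struct channel_ctx *ch)
    {
        ch->cluster_md_page = aligned_alloc(MD_PAGE_SIZE, MD_PAGE_SIZE);
        return ch->cluster_md_page != NULL ? 0 : -1;
    }

    static void
    channel_destroy(struct channel_ctx *ch)
    {
        free(ch->cluster_md_page);
        ch->cluster_md_page = NULL;
    }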
GangCao
7bcd316de1 bdev: abort all IOs when unregistering the bdev
To fix issue #2484.

When unregistering the bdev, a message is sent to each thread
to abort all the IOs, including IOs from the nomem_io,
need_buf_small and need_buf_large queues.

A new SPDK_BDEV_STATUS_UNREGISTERING state is
added to indicate this unregister operation.

In this case, the bdev unregister operation becomes an
async operation, as each thread is sent a message
to abort the IOs and, as the last step, it unregisters
the required bdev and the associated io device.

On the other hand, the queued_resets are handled
separately and not aborted in the bdev unregister.

New unit test cases are also added:
  enomem_multi_bdev_unregister: aborts IOs from the
nomem_io queue during the unregister operation
  bdev_open_ext_unregister: handles the events and
async operations from the unregister operation

Change-Id: Ib1663c0f71ffe87144869cb3a684e18eb956046b
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12573
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
2022-05-18 07:30:00 +00:00
Alexey Marchuk
007fb1d3cb nvme: Fix keyed/unkeyed SGL nvme cmd dump
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0a08518b5c30455a17158aa440715515d0c066fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12133
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-05-17 20:11:43 +00:00
Shuhei Matsumoto
00d46b80b2 bdev/nvme: Disable automatic failback in multipath mode
By default, failback to the preferred I/O path is done automatically
if it is restored. Some users may want to keep using the backup I/O
path even if the preferred I/O path is restored. In this case,
bdev_nvme_set_preferred_path can be used to do manual failback.

We may be able to clear/fill the I/O path cache more strictly, but that
would be complicated and bug-prone. This patch makes the minimal change
and just skips the obvious case.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I78fe5faee6ff04e88ae3d7c6be6da1c20637c912
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12431
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-05-17 12:54:45 +00:00
Alexey Marchuk
b0262063d3 vbdev_lvol: Report memory domains
Update functional test to verify that lvol supports
memory domains

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I5e91eedc8879359c3add45d417b6f3eaad4d75b9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11375
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-05-16 10:14:26 +00:00
Alexey Marchuk
248ccd8607 lvol: Use blobstore ext API in data path
The new blobstore ext API is used when the user
provides ext_io_opts in the bdev layer.
To store blobstore ext_io_opts, vbdev_lvol reports
a non-zero get_ctx_size in the bdev module interface.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I64076b5369533be0c1d69ca48aef9d70a9abe488
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11373
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-05-16 10:14:26 +00:00
Alexey Marchuk
a236084542 blob: Add readv/writev_ext functions
These functions accept an optional spdk_blob_ext_io_opts
structure. If this structure is provided by the user,
then the readv/writev_ext ops of the base dev will be used
in the data path.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I370dd43f8c56f5752f7a52d0780bcfe3e3ae2d9e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11371
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-05-16 10:14:26 +00:00
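
A hedged usage sketch of the new blob ext read path follows; the exact prototype
and the layout of spdk_blob_ext_io_opts should be checked against
include/spdk/blob.h, and the callback and wrapper here are placeholders.

    #include <sys/uio.h>
    #include "spdk/blob.h"

    static void
    read_done(void *cb_arg, int bserrno)
    {
        /* bserrno == 0 on success */
    }

    /* Assumes blob, channel, iov/iovcnt and io_unit offset/length are already
     * set up by the caller. */
    static void
    read_with_ext_opts(struct spdk_blob *blob, struct spdk_io_channel *channel,
                       struct iovec *iov, int iovcnt, uint64_t offset, uint64_t length,
                       struct spdk_blob_ext_io_opts *opts)
    {
        /* When opts is non-NULL, the blobstore uses the base dev's readv_ext op. */
        spdk_blob_io_readv_ext(blob, channel, iov, iovcnt, offset, length,
                               read_done, NULL, opts);
    }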
Alexey Marchuk
ba8f1a9e5d blob: Add readv/writev ext ops to spdk_bs_dev
Introduce the spdk_blob_ext_io_opts structure, which
is used in the new *_ext functions.
The zeroes dev is updated with an implementation of
readv_ext which uses the memory domain's memzero
or a regular memset().

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Id94542196eff999827bf00591fd43804256fccb4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11369
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-05-16 10:14:26 +00:00
Alexey Marchuk
5fd9561f54 dma: Add memzero function
Add functions to the memory domains library to set and
call a memzero callback.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ia6ddc3c9e0ca6e9172189964d180444e5da71d30
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12343
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-05-16 10:14:26 +00:00
Shuhei Matsumoto
5e5423de93 nvme: Add DISABLED to ctrlr's state to show completion of Controller Level Reset
In the following patches, nvme_ctrlr_process_init() will be used to
disable the controller when disconnecting the admin qpair for PCIe
transport. In this case, we will have to exit nvme_ctrlr_process_init()
after CSTS.RDY is 0. However, spdk_nvme_ctrlr_reset() and
spdk_nvme_ctrlr_reconnect_poll_async() have to continue
nvme_ctrlr_process_init() until the controller becomes ready.

To differentiate stop and continue clearly, add a new state
NVME_CTRLR_STATE_DISABLED to enum nvme_ctrlr_state.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic0a5fb7114d4eeb1cefec28bc404184768fb0a96
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12613
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-05-12 07:28:02 +00:00
paul luse
d58a2f6cc5 lib/accel: support multiple accel modules (aka engines) at once
We enable multiple engines by:

* getting rid of the globals that point to the one available HW
and one available SW engine

* adding a submit_tasks() entry point for the SW engine so that
it is treated like any other engine allowing us to just call
submit_tasks() to the assigned engine for the opcode instead of
checking what is supported

* changing the definition of engine capabilities from
"HW accelerated" to simply "supported"

* during init, use a global (g_engines_opc) that contains engines
and is indexed by opcode so we know what the best engine is for each
opcode

* future patches will add RPCs to override engine priorities or
specifically assign opcodes to an engine.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I9b9f3d5a2e499124aa7ccf71f0da83c8ee3dd9f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11870
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-05-05 07:11:32 +00:00
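
The opcode-indexed lookup described above can be sketched generically; the enum
values and table below are illustrative, not the accel framework's definitions.

    #include <stddef.h>

    enum opcode {
        OPC_COPY,
        OPC_FILL,
        OPC_CRC32C,
        OPC_LAST
    };

    struct engine {
        const char *name;
        int (*submit_tasks)(void *task);
    };

    /* One "best" engine per opcode, filled in at init time by priority. */
    static struct engine *g_engines_opc[OPC_LAST];

    static int
    submit(enum opcode opc, void *task)
    {
        /* No capability check in the I/O path: the table is assumed to already
         * point at an engine that supports this opcode. */
        return g_engines_opc[opc]->submit_tasks(task);
    }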
Shuhei Matsumoto
8f9b977504 bdev/nvme: Add active/active policy for multipath mode
The NVMe bdev module supported the active-passive policy for multipath mode
first. With this patch, the NVMe bdev module also supports the active-active
policy for multipath mode. Following the Linux kernel native NVMe multipath,
the NVMe bdev module uses a round-robin algorithm for the active-active
policy.

The multipath policy, active-passive or active-active, is managed per
nvme_bdev. The multipath policy is copied to all corresponding
nvme_bdev_channels.

Different from active-passive, active-active caches even non-optimized
paths to provide load balancing across multiple paths. A generic sketch of
the round-robin selection follows this entry.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ie18b24db60d3da1ce2f83725b6cd3079f628f95b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12001
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-05-05 07:11:24 +00:00
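
The round-robin selection used by the active-active policy can be sketched
generically; the list handling below is illustrative and not the
nvme_bdev_channel code.

    #include <stddef.h>
    #include <sys/queue.h>

    struct io_path {
        STAILQ_ENTRY(io_path) link;
        /* ... path state ... */
    };

    STAILQ_HEAD(io_path_list, io_path);

    struct bdev_channel {
        struct io_path_list io_paths;
        struct io_path *current; /* last path an I/O was sent to */
    };

    /* Pick the next path after 'current', wrapping to the list head. */
    static struct io_path *
    round_robin_next_path(struct bdev_channel *ch)
    {
        struct io_path *next;

        next = ch->current != NULL ? STAILQ_NEXT(ch->current, link) : NULL;
        if (next == NULL) {
            next = STAILQ_FIRST(&ch->io_paths);
        }
        ch->current = next;
        return next; /* NULL only if the list is empty */
    }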
Shuhei Matsumoto
22b77a3c80 bdev/nvme: Set preferred I/O path in multipath mode
If we specify a preferred path manually for each NVMe bdev, we will
be able to realize simple static load balancing and make failover
more controllable in multipath mode.

The idea is to move the I/O path to the specified NVMe-oF controller to
the head of the list and then clear the I/O path cache for each NVMe bdev
channel. We could set the I/O path in the I/O path cache directly, but it
would have to be conditional and would make the code very complex. Hence,
let find_io_path() do that.

However, an NVMe bdev channel may be acquired after setting the preferred
path. To cover such a case, sort the nvme_ns list of the NVMe bdev too.

This feature supports only multipath mode. The NVMe bdev module supports
failover mode too. However, to support the latter, the new RPC would need
to take a trid as a parameter, and the code and the usage would become very
complex. Add a note for this limitation.

Add unit tests to verify each case exactly.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ia51c74f530d6d7dc1f73d5b65f854967363e76b0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12262
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: <tanl12@chinatelecom.cn>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-05-05 07:11:24 +00:00
Jim Harris
81a3b8a596 nvmf: make nacwu 0-based
spdk_bdev_get_acwu() returns a 1-based number, so we need
to subtract 1 from it before assigning the value to
nsdata->nacwu.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I32708b28a35670cba6013a48b79389fa48226285
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12399
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-04-29 07:29:06 +00:00
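
The conversion is a single subtraction from a 1-based value to a 0-based field; a
tiny sketch of that rule (helper name is illustrative):

    #include <stdint.h>

    /* Convert the 1-based atomic compare & write unit (ACWU) reported by the bdev
     * layer into the 0-based NACWU field of the NVMe Identify Namespace data.
     * Assumes acwu_one_based >= 1. */
    static inline uint16_t
    acwu_to_nacwu(uint16_t acwu_one_based)
    {
        return acwu_one_based - 1;
    }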
Richael Zhuang
9bff828f99 sock: introduce dynamic zerocopy according to data size
MSG_ZEROCOPY is not always effective as mentioned in
https://www.kernel.org/doc/html/v4.15/networking/msg_zerocopy.html.

Currently in SPDK, once we enable sendmsg zerocopy, all data
transferred through _sock_flush is sent with zerocopy, and vice
versa. Here dynamic zerocopy is introduced to allow data to be sent
with or without MSG_ZEROCOPY according to its size; it can be enabled
by setting "enable_dynamic_zerocopy" to true.

Tested with 16 P4610 NVMe SSDs and 2 initiators; the target's and
initiators' configurations are the same as in the SPDK report:
https://ci.spdk.io/download/performance-reports/SPDK_tcp_perf_report_2104.pdf

For the posix socket with rw_percent=0 (randwrite), it shows a 1.9%~8.3%
performance boost, tested with 1~40 target CPU cores and qdepth=128,256,512,
and no obvious influence when the read percentage is greater than 50%.

For the uring socket with rw_percent=0 (randwrite), it shows a 1.8%~7.9%
performance boost, tested with 1~40 target CPU cores and qdepth=128,256,512,
and still a 1%~7% improvement when the read percentage is greater than 50%.

The following is part of the detailed data.

posix:
qdepth=128
rw_percent      0             |           30
cpu  origin  thisPatch  opt   | origin  thisPatch opt
1	286.5	298.5	4.19%		 307	304.15	-0.93%
4	1042.5	1107	6.19%		1135.5	1136	0.04%
8	1952.5	2058	5.40%		2170.5	2170.5	0.00%
12	2658.5	2879	8.29%		3042	3046	0.13%
16	3247.5	3460.5	6.56%		3793.5	3775	-0.49%
24	4232.5	4459.5	5.36%		4614.5	4756.5	3.08%
32	4810	5095	5.93%		4488	4845	7.95%
40	5306.5	5435	2.42%		4427.5	4902	10.72%

qdepth=512
rw_percent      0             |           30
cpu  origin  thisPatch  opt   | origin  thisPatch opt
1    275	 287	4.36%		294.4	295.45	0.36%
4	 979	1041	6.33%		1073	1083.5	0.98%
8	1822.5	1914.5	5.05%		2030.5	2018.5	-0.59%
12	2441	2598.5	6.45%		2808.5	2779.5	-1.03%
16	2920.5	3109.5	6.47%		3455	3411.5	-1.26%
24	3709	3972.5	7.10%		4483.5	4502.5	0.42%
32	4225.5	4532.5	7.27%		4463.5	4733	6.04%
40	4790.5	4884.5	1.96%		4427	4904.5	10.79%

uring:
qdepth=128
rw_percent      0             |           30
cpu  origin  thisPatch  opt   | origin  thisPatch opt
1	270.5	287.5	6.28%		295.75	304.75	3.04%
4	1018.5	1089.5	6.97%		1119.5	1156.5	3.31%
8	1907	2055	7.76%		2127	2211.5	3.97%
12	2614	2801	7.15%		2982.5	3061.5	2.65%
16	3169.5	3420	7.90%		3654.5	3781.5	3.48%
24	4109.5	4414	7.41%		4691.5	4750.5	1.26%
32	4752.5	4908	3.27%		4494	4825.5	7.38%
40	5233.5	5327	1.79%		4374.5	4891	11.81%

qdepth=512
rw_percent      0             |           30
cpu  origin  thisPatch  opt   | origin  thisPatch opt
1	259.95	 276	6.17%		286.65	294.8	2.84%
4	955 	1021	6.91%		1070.5	1100	2.76%
8	1772	1903.5	7.42%		1992.5	2077.5	4.27%
12	2380.5	2543.5	6.85%		2752.5	2860	3.91%
16	2920.5	3099	6.11%		3391.5	3540	4.38%
24	3697	3912	5.82%		4401	4637	5.36%
32	4256.5	4454.5	4.65%		4516	4777	5.78%
40	4707	4968.5	5.56%		4400.5	4933	12.10%

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Change-Id: I730dcf89ed2bf3efe91586421a89045fc11c81f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12210
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-28 07:29:28 +00:00
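
The decision the commit above adds, using MSG_ZEROCOPY only when the payload is
large enough to amortize its overhead, can be sketched as follows; the threshold
value and function shape are assumptions, not the sock module's actual code.

    #include <stdbool.h>
    #include <stddef.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    #ifndef MSG_ZEROCOPY
    #define MSG_ZEROCOPY 0x4000000
    #endif

    /* Illustrative threshold; the real cut-off used by the sock module may differ. */
    #define DYNAMIC_ZCOPY_THRESHOLD (8 * 1024)

    static ssize_t
    flush_iovs(int fd, struct msghdr *msg, size_t total_len,
               bool zcopy_enabled, bool dynamic_zcopy)
    {
        int flags = MSG_DONTWAIT;

        /* With dynamic zerocopy, small writes fall back to ordinary copied sends. */
        if (zcopy_enabled && (!dynamic_zcopy || total_len >= DYNAMIC_ZCOPY_THRESHOLD)) {
            flags |= MSG_ZEROCOPY;
        }

        return sendmsg(fd, msg, flags);
    }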
Alex Michon
2bc134eb4b bdev/nvme: Fix aborting fused commands
When sending a fused compare and write command, we pass a callback,
bdev_nvme_comparev_and_writev_done, that we expect to be called twice
before marking the I/O as completed. In order to detect whether a call to
bdev_nvme_comparev_and_writev_done is the first or the second one, we
currently rely on the opcode in cdw0. However, cdw0 may be set to 0,
especially when aborting the command. This may cause use-after-free
issues and may invoke the user callbacks twice instead of once.
Use a bit in the nvme_bdev_io instead to keep track of the number of
calls to bdev_nvme_comparev_and_writev_done.

Signed-off-by: Alex Michon <amichon@kalrayinc.com>
Change-Id: I0474329e87648e44b08998d0552b2a9dd5d34ac2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12180
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-04-26 07:47:09 +00:00
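
The fix above replaces "infer first/second completion from cdw0" with an explicit
flag in the per-I/O context; here is a generic sketch of that pattern (the context
type is illustrative, not nvme_bdev_io).

    #include <stdbool.h>

    /* Illustrative per-I/O context for a fused compare-and-write pair. */
    struct fused_io_ctx {
        bool first_phase_done; /* set when the first of the two completions arrives */
    };

    /* Returns true only when both completions of the fused pair have arrived,
     * so the user callback can be invoked exactly once. */
    static bool
    fused_completion(struct fused_io_ctx *io)
    {
        if (!io->first_phase_done) {
            io->first_phase_done = true;
            return false;
        }
        return true;
    }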
Konrad Sztyber
3056c8ac02 nvmf/tcp: delay qpair destruction
This patch adds an extra spdk_thread_send_msg() call to destroy a qpair
to make sure that it isn't freed from the context of a socket write
callback.  Otherwise, spdk_sock_close() won't abort pending requests,
causing their completions to be exected after the qpair is freed.

Fixes #2471

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia510d5d754baccca1e444afdb10696ab9b58e28b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-04-25 07:36:05 +00:00
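
The extra hop added by the patch above looks roughly like the following; the qpair
type and teardown body are placeholders, while spdk_thread_send_msg() and
spdk_get_thread() are the real thread-library calls.

    #include <stdlib.h>
    #include "spdk/thread.h"

    /* Placeholder qpair type; the real code uses the transport's qpair structure. */
    struct my_qpair {
        int sock_fd;
    };

    static void
    _deferred_qpair_destroy(void *ctx)
    {
        struct my_qpair *qpair = ctx;

        /* Actual teardown runs here, outside the socket write callback. */
        free(qpair);
    }

    /* Called from the socket write callback: bounce through the thread's message
     * queue so spdk_sock_close() can still abort requests pending on this qpair.
     * The return value of spdk_thread_send_msg() is ignored here for brevity. */
    static void
    schedule_qpair_destroy(struct my_qpair *qpair)
    {
        spdk_thread_send_msg(spdk_get_thread(), _deferred_qpair_destroy, qpair);
    }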
Shuhei Matsumoto
494eb6e58b bdev: Fix race among bdev_reset(), bdev_close(), and bdev_unregister()
There is a race condition when a bdev is unregistered while reset is
submitted from the upper layer very frequently.

spdk_io_device_unregister() may fail because it is called while
spdk_for_each_channel() is processed.

    spdk_io_device_unregister io_device bdev_Nvme0n1 (0x7f4be8053aa1)
    has 1 for_each calls outstanding

To avoid this failure, defer calling spdk_io_device_unregister() until
the reset completes if a reset is in progress when unregistration is
ready to proceed; the reset completion then calls
spdk_io_device_unregister().

A bdev cannot be opened if it is already being deleted, so we do not
need to hold the mutex.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ida1681ba9f3096670ff62274b35bb3e4fd69398a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12222
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2022-04-22 09:45:14 +00:00
Shuhei Matsumoto
50b6329ca0 bdev/nvme: Factor out ctrlr info json dump into a helper function
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I7f1e08ff13d890cb780e7b66c18a77ab85c82029
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12311
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-04-22 09:44:57 +00:00
Shuhei Matsumoto
13ca6e52d3 bdev/nvme: Handle ANA transition (change or inaccessible state) correctly
Previously, if a namespace was in the ANA inaccessible state, I/O was
queued indefinitely. Fix this issue according to the NVMe spec.

Add a temporary poller anatt_timer and a flag ana_transition_timedout for
each nvme_ns.

Start anatt_timer when the nvme_ns enters ANA transition. If anatt_timer
expires, set ana_transition_timedout to true. Cancel anatt_timer or
clear ana_transition_timedout when the nvme_ns exits ANA transition.

nvme_io_path_become_available() returns false if ana_transition_timedout
is true.

Add unit test cases to verify these additions.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic76933242046b3e8e553de88221b943ad097c91c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12194
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
2022-04-22 09:44:57 +00:00
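
The ANATT handling above can be sketched with SPDK's poller API; the namespace
context below is illustrative, and the real logic lives in bdev_nvme.

    #include "spdk/thread.h"

    struct ns_ctx {
        struct spdk_poller *anatt_timer;
        bool ana_transition_timedout;
    };

    static int
    anatt_expired(void *arg)
    {
        struct ns_ctx *ns = arg;

        /* One-shot timeout: mark the namespace so queued I/O is failed instead
         * of waiting forever, then stop the timer. */
        ns->ana_transition_timedout = true;
        spdk_poller_unregister(&ns->anatt_timer);
        return SPDK_POLLER_BUSY;
    }

    /* anatt_sec would come from the controller data (Identify Controller ANATT). */
    static void
    ana_transition_start(struct ns_ctx *ns, uint32_t anatt_sec)
    {
        if (ns->anatt_timer == NULL) {
            ns->anatt_timer = SPDK_POLLER_REGISTER(anatt_expired, ns,
                                                   anatt_sec * 1000ULL * 1000ULL);
        }
    }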
Ben Walker
e22c933edb idxd: Make many internal idxd_user functions take an idxd_user object
This reduces a lot of casting.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ibc04f422858642d0e20c9b020bb6c5d1b70256fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11534
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2022-04-20 08:20:45 +00:00
Shuhei Matsumoto
4b73223542 nvme_rdma: Wait until lingering qpair becomes quiet before completing disconnection
The code to handle the lingering qpair when deleting it was really
complicated.

The RDMA transport can now connect or disconnect a qpair asynchronously.

So we can fold the code that handles the lingering qpair into the
qpair disconnect code.

If the disconnected qpair is still busy, defer completion of the
disconnection until the qpair becomes idle.

If a poll group is not used, we can complete the disconnection
immediately because the cq is already destroyed.

The related data and unit test cases are not necessary anymore,
so delete them in this patch.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic8f81143fcad0714ac9b7db862313aa8094eeefb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-18 18:35:29 +00:00
Shuhei Matsumoto
9717b0c3df nvme_rdma: Connect and disconnect qpair asynchronously
Add three states, INITIALIZING, EXITING, and EXITED to the rqpair
state.

Add async parameter to nvme_rdma_ctrlr_create_qpair() and set it
to opts->async_mode for I/O qpair and true for admin qpair.

Replace all nvme_rdma_process_event() calls by
nvme_rdma_process_event_start() calls.

nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to INITIALIZING
when starting to process CM events.

nvme_rdma_ctrlr_connect_qpair_poll() calls
nvme_rdma_process_event_poll() with ctrlr->ctrlr_lock if qpair is
not admin qpair.

nvme_rdma_ctrlr_disconnect_qpair() returns before polling CM events
if qpair->async is true or qpair->poll_group is not NULL; otherwise
it polls CM events until completion. Comments are added to clarify
why we do it like this.

nvme_rdma_poll_group_process_completions() does not process submission
for any qpair which is still connecting.

Change-Id: Ie04c3408785124f2919eaaba7b2bd68f8da452c9
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11442
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-04-18 18:35:29 +00:00
Tomasz Zawadzki
6f89388ed3 lib/vhost: move vhost_user related fields from spdk_vhost_dev
The spdk_vhost_dev structure should only contain generic fields
that are to be used by the vhost, vhost_blk, or vhost_scsi
layer.

The vhost_user backend can hold its properties in
spdk_vhost_user_dev, which is maintained within rte_vhost.

Both structures contain references back to each other.
The reference in spdk_vhost_dev is a void pointer to
allow future transports to keep the reference
to their own structures.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I68640c524426d885c20242146365ba242fa9df8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11813
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2022-04-15 07:49:32 +00:00
paul luse
37b68d7287 accel: cleanup by getting rid of capabilties enum
In support of upcoming patches and to greatly simplify things,
the capabilities enum which held bit positions for each opcode
has been removed.  Only the opcodes enum remains and thus only
opcodes are used throughout.  For the capabilities bitmap a helper
function is added to convert from opcode to bit position.  Right
now it is used in the IO path, but in upcoming patches that goes away
and the conversion is only done at init time.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic4ad15b9f24ad3675a7bba4831f4e81de9b7bc70
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11949
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-04-14 08:32:50 +00:00
Ziv Hirsch
e749fa9c27 nvmf: fix buffer overflow on admin commands
When req->iovcnt is bigger than 1, `memset(req->data, 0, req->length)` is wrong.

Signed-off-by: Ziv Hirsch <zivhirsch13@gmail.com>
Change-Id: Ie53eba686b4c5889bbde3b3644d51acbef303b42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12216
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-04-14 08:31:35 +00:00
Kamil Godzwon
492cd95440 valgrind: fixed ASAN/Valgrind options
Patch for not running tests if ASAN and
Valgrind options are both enabled.

Fixes #2422

Signed-off-by: Kamil Godzwon <kamilx.godzwon@intel.com>
Change-Id: I50c91bede687f0aee571c1f2540530a7fafcb49c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11998
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-04-11 13:05:16 +00:00
Tomasz Zawadzki
f9fccbae63 lib/vhost: separate out vhost_user backend callbacks
Previously spdk_vhost_dev_backend held callbacks
for vhost_blk and vhost_scsi functionality, along
with ones that are called by the vhost_user backend.

This patch separates out those callbacks into two
structures:
- spdk_vhost_dev_backend - to be implemented by vhost_blk
and vhost_scsi
- spdk_vhost_user_dev_backend - implemented only by the
vhost_user backend; callbacks for session management
specific to that transport

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I348090df5dddeb2b1945b082b85aec53d03c781b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2022-04-11 07:44:09 +00:00
Ben Walker
3edf1e200e test/bdev: In bdev_nvme_ut, handle spdk_nvme_poll_group_remove when there is no group

The real implementation handles this by returning -ENOENT, so do the
same in the test.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I405b6f60bf4dcdb22c57e48bbaf66d57522a49c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11508
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
2022-04-07 07:23:56 +00:00
Ben Walker
2250a441c4 test/bdev: In bdev_nvme_ut, count a disconnect as "activity"
Count disconnecting a queue pair as activity so that the unit test
poll_threads() calls don't bail out until the disconnected_qpair_cb is
called at least once.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Idc437d6c589dbf133bfcbb5edba1087f928a718c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11507
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
2022-04-07 07:23:56 +00:00
Ben Walker
c86778398b bdev/nvme: Remove ctrlr from nvme_ctrlr_channel
This was neither set nor used.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3119135843c5fc0b8724e593db40df46e6b5bdb0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12097
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-04-07 07:23:56 +00:00
yupeng
64eebbd132 bdev/raid: Add concat module
The concat module can combine multiple underlying bdevs into a single
bdev. It is a special raid level. You can add a new bdev to the end of
the concat bdev; the concat bdev's size is then increased, and it won't
change the layout of the existing data. This is the major difference
between concat and raid0. If you add a new underlying device to raid0,
the whole data layout will be changed. So the concat bdev is extendable.

Change-Id: Ibbeeaf0606ff79b595320c597a5605ab9e4e13c4
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11070
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-04-05 07:39:00 +00:00
Shuhei Matsumoto
428b17a0a8 bdev: Add spdk_for_each_bdev/bdev_leaf for clean up and further improvements
To execute a callback function for each registered bdev or each unclaimed
bdev, add new public APIs, spdk_for_each_bdev() and
spdk_for_each_bdev_leaf().

These functions are safe against race conditions because each bdev is
opened before and closed after the provided callback function is executed.
A usage sketch follows this entry.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I59b702ffec7b4fc5e9779de5a3a75d44922b829b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12088
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-04-05 07:30:47 +00:00
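
A hedged usage sketch of the new iterator; check include/spdk/bdev.h for the exact
prototype. The callback here simply counts registered bdevs.

    #include "spdk/bdev.h"

    static int
    count_bdev(void *ctx, struct spdk_bdev *bdev)
    {
        int *count = ctx;

        (*count)++;
        return 0; /* returning non-zero stops the iteration */
    }

    static int
    count_registered_bdevs(void)
    {
        int count = 0;

        /* Each bdev is opened before and closed after the callback, so the
         * iteration is safe against concurrent unregistration. */
        spdk_for_each_bdev(&count, count_bdev);
        return count;
    }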
Alexey Marchuk
be440c01c9 raid: Report memory domains
Use spdk_bdev_readv/writev_block_ext even when
there is no ext opts passed by bdev layer

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0b9f17150cdba1a1023478bae745ab4438ea99bb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10070
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-04-04 09:57:56 +00:00
Alexey Marchuk
99719ef049 raid0: Use extended bdev rw API
This is a preparation for supporting memory domains
in bdev_raid.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I3a6e01eccd4d7e4bc197dc5ffe268d42081d41de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11429
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-04-04 09:57:56 +00:00
Alexey Marchuk
1299439f3d bdev: pull/push data if bdev doesn't support memory domains

If the bdev doesn't support any memory domain, then allocate an
internal bounce buffer, pull data for a write operation before
IO submission, and push data to the memory domain once IO completes
for a read operation.

Update the test tool and add a simple implementation of the
pull/push functions.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie9b94463e6a818bcd606fbb898fb0d6e0b5d5027
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10069
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
2022-04-04 09:57:56 +00:00
Shuhei Matsumoto
4573e4cc23 module/bdev: Use spdk_bdev_unregister_by_name() if possible
Replace spdk_bdev_get_by_name() + spdk_bdev_unregister() with
spdk_bdev_unregister_by_name() wherever possible.

This simplifies the code and makes it more reliable.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I91388c9d0b2e244cb745720a480803b03c42a226
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12066
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-04-04 09:57:43 +00:00
Shuhei Matsumoto
96c007d301 bdev: Add spdk_bdev_unregister_by_name() to handle race condtions
To unregister a bdev correctly, we had to call
spdk_bdev_open_ext(), spdk_bdev_desc_get_bdev(), spdk_bdev_unregister(),
and then spdk_bdev_close(). This was correct but complicated.

Hence add a new public API, spdk_bdev_unregister_by_name(), which performs
the whole correct sequence of bdev unregistration. A usage sketch follows
this entry.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I9068d4ac49dca944436e0ba587308fd356dfef75
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12065
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-04-04 09:57:43 +00:00
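
The single-call replacement described above, sketched under the assumption of the
public bdev API (see include/spdk/bdev.h for the exact prototypes):

    #include "spdk/bdev.h"

    static void
    unregister_done(void *cb_arg, int rc)
    {
        /* rc == 0 on success */
    }

    /* New, single-call sequence: the open/lookup/unregister/close dance is
     * handled inside the library. */
    static int
    delete_my_bdev(const char *name, struct spdk_bdev_module *module)
    {
        return spdk_bdev_unregister_by_name(name, module, unregister_done, NULL);
    }

The old four-call sequence remains valid; the new API just wraps it so callers
cannot get the ordering wrong.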
Tomasz Zawadzki
6301f8915d lib/sock: provide a hint to picking optimal poll group
The process of matching qpair to poll group is split into
two distinct parts that occur on different threads.
See spdk_nvmf_tgt_new_qpair().

This results in a race condition for TCP between spdk_sock_map_lookup()
and spdk_sock_map_insert(), which are called in spdk_nvmf_get_optimal_poll_group()
and spdk_nvmf_poll_group_add() respectively.

Fixes #2113

This patch picks a hint from nvmf_tcp for the next poll group,
which is then passed down to spdk_sock_map_lookup().

When a matching placement_id exists but does not have
a poll group assigned, the hint will be used.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I4abde2bc9c39225c9f5dd7c3654fa2639bb0a27f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10271
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2022-04-01 12:41:26 +00:00
Chunsong Feng
0db0c443df nvmf/rdma: Improve read performance in DIF strip mode
An rdma buffer for stripping DIF metadata is added. The CPU strips the DIF
metadata and copies it to the rdma buffer, improving the rdma write
bandwidth. The network bandwidth during a 4KB random read test increased
from 79 Gbps to 99 Gbps, and IOPS increased from 2075K to 2637K.

Fixes issue #2418

Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Change-Id: If1c31256f0390f31d396812fa33cd650bf52b336
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11861
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-04-01 11:19:18 +00:00