ivampiresp/Spdk - Spdk - Leaflow Developers

Author	SHA1	Message	Date
Artur Paszkiewicz	293cdc484b	ftl: management framework Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com> Change-Id: I8261863e80a53a37183b0148d4a08fa97e208dda Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13289 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot	2022-07-11 07:23:58 +00:00
Wojciech Malikowski	81dca28884	ftl: remove deprecated ftl library Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com> Change-Id: I3ebb05be3f1b9864b238cb74f469b4fdf573cd0d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11120 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-11 07:23:58 +00:00
Jim Harris	e415bf0033	nvme: add cmd/cpl printing for rdma errors This follows similar logic in the pcie and tcp completion paths, including omitting error messages when aborting aers by adding a print_on_error parameter to the completion function. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Id558d0af2cdd705dfb60abb842bd567a0949ccce Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13525 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	05dce1ee78	nvme: don't try to enable intel log pages on fabrics ctrlrs By default, the SPDK nvmf target reports vid==INTEL, which results in the SPDK nvme driver trying to enable Intel vendor-specific log page. Fix this by trying to enable those log pages only for PCIE transport controllers. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I78ebf365d4fa6295d1f610697266c3ead765988d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13524 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	988ce2ecaa	nvme: use assert for INTEL_VID check on log pages We can only get to this code path if the controller has vid==INTEL, so make that more clear by changing the check to an assert. Remove unit test that calls nvme_ctrlr_construct_intel_support_log_page_list() for a controller that is not VID==INTEL - this is no longer valid. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I3b58451bc95992bf641e7452f0ac4c2bac9fe31c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13523 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	4a24f581d6	nvme: add cmd/cpl printing for tcp errors This follows similar logic in the pcie completion path, including omitting error messages when aborting aers by adding a print_on_error parameter to the completion function. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I96df72280bb8fcbee3847fdc27f38e14a1bf3251 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13522 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Shuhei Matsumoto	4be6d30438	nvme: Add ctrlr_abort_queued_aborts() into qpair_abort_all_queued_reqs() nvme_qpair_abort_all_queued_reqs() aborts error injections, queued requests, aborting queued requests, and outstanding requests. (Aborting outstanding requests depends on transports.) However, it did not abort queued aborts. Include nvme_ctrlr_abort_queued_aborts() in nvme_qpair_abort_all_queued_reqs() so that it does what the name of the function indicates. nvme_ctrlr_abort_queued_aborts() is already called in a few cases, but we do not mind the duplication. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I19102cc6603a72ce5c398a7947cb4d606b692991 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12849 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2022-06-30 07:51:23 +00:00
GangCao	48ce2c978e	Bdev: remove the QD poller at the time of Bdev unregister Fix issue: #2561 The issue here is that in the bdev_set_qd_sampling_period RPC command, the QD sampling period has been set. Then later the related Desc is closed and in the bdev_close() function the QD sampling period is reset to 0. A new QD desc is added so that the QD sampling period update can be handled properly. Meanwhile, a new QD Poll In Progress flag is also added to indicate that there are ongoing QD sampling events, so that the Bdev unregister is handled in the proper way. Related test case and unit test also updated for this change. Change-Id: Iac86c2c6447fe338c7480cf468897fc8f41f8741 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13016 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2022-06-28 18:13:02 +00:00
yupeng	1f0b8df7b0	blobstore: implement spdk_bs_grow and bdev_lvol_grow_lvstore RPC The bdev_lvol_grow_lvstore will grow the lvstore size if the underlying bdev size is increased. It invokes spdk_bs_grow internally. The spdk_bs_grow will extend the used_clusters bitmap. If there is not enough space reserved for the used_clusters bitmap, the API will fail. The reserved space was calculated according to the num_md_pages at blobstore creation time. Signed-off-by: Peng Yu <yupeng0921@gmail.com> Change-Id: If6e8c0794dbe4eaa7042acf5031de58138ce7bca Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9730 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-06-28 17:55:43 +00:00
yupeng	88833020eb	blobstore: reserve space for growing blobstore Reserve space for used_cluster bitmap. The reserved space is calculated according to the num_md_pages. The reserved space would be used when the blobstore is extended in the future. Add the num_md_pages_per_cluster_ratio parameter to the bdev_lvol_create_lvstore API. Then calculate the num_md_pages according to the num_md_pages_per_cluster_ratio and bdev total size, then pass the num_md_pages to the blobstore. Signed-off-by: Peng Yu <yupeng0921@gmail.com> Change-Id: I61a28a3c931227e0fd3e1ef6b145fc18a3657751 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9517 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-06-28 17:55:43 +00:00
Ben Walker	8dd1cd2104	check_format: For C files only, fix return type breaks In SPDK, declarations have the return type on the same line. Definitions have the return type on a separate line. Astyle has an option for enforcing this. Unfortunately, it seems to have two bugs: 1) It doesn't work correctly at all on C++ files. 2) It often fails on functions that return enums, or long type names Deal with 1) by adjusting the check_format.sh script to only tell astyle to fix return type line breaks for C files and not C++. Deal with 2) by adding a few typedefs to work around the problem. Change-Id: Idf28281466cab8411ce252d5f02ab384166790c6 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13437 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-06-27 09:33:48 +00:00
Tomasz Zawadzki	862bdb53b9	ut/lvol: silence gcc 12 strnlen errors gcc 12 reports that strnlen() exceeds the bound set by maxlen argument. This patch changes to strlen() to silence the following error: lvol_ut.c: In function ‘lvs_load’: lvol_ut.c:1086:56: error: ‘strnlen’ specified bound 64 exceeds source size 4 [-Werror=stringop-overread] 1086 \| spdk_blob_set_xattr(super_blob, "name", "lvs", strnlen("lvs", SPDK_LVS_NAME_MAX) + 1); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I56caf5bbb06fa0ea2cc61a9eef145fc275a416b2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13413 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-06-23 08:08:30 +00:00
xiaoxiangxzhang	7386b6ed09	UT/nvme_rdma: test_nvme_rdma_poll_group_set_cq Signed-off-by: xiaoxiangxzhang <xiaoxiangx.zhang@intel.com> Change-Id: Ia366420a8cea9169fbfa0dbf3a5747f7bc39f071 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12425 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-06-23 07:45:13 +00:00
xiaoxiangxzhang	21b8978760	UT/nvme_rdma:test_nvme_rdma_poll_group_get_stats Signed-off-by: xiaoxiangxzhang <xiaoxiangx.zhang@intel.com> Change-Id: I53851927fe5b870c773974f88fc762aa8eb22dab Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12419 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-06-23 07:45:13 +00:00
xiaoxiangxzhang	237ae71034	UT/nvmf_transport:test_spdk_nvmf_transport_opts_init Signed-off-by: xiaoxiangxzhang <xiaoxiangx.zhang@intel.com> Change-Id: I6d254331dcb362dfe1b6a3738ced123bf71e15e5 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12350 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-06-21 23:47:21 +00:00
yidong0635	dabca25646	util: Extract a common lib between iovs and buf. It's useful to add these APIs. spdk_copy_iovs_to_buf and spdk_copy_buf_to_iovs. It prepares that other ones can call these. We don't need to define them in static state repeatedly. And add corresponding unit tests. Change-Id: Ife40fec8d047a48af67b04e6c055e4932282abfb Signed-off-by: yidong0635 <dongx.yi@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12075 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-06-20 10:01:15 +00:00
Anton Eidelman	0b9100e8a5	bdev/nvme: replace nn with mnan in ana_log size calculation Calculation of the ANA log page size should use the identify ctrl MNAN field (maximum number of allowed namespaces) not the NN (maximum valid nsid value). An ANA-enabled controller must have a non-zero MNAN value, see NVMe Base Specification, Figure 251, therefore nvme_ctrlr_init_ana_log_page() may safely use MNAN. Since NN might be much higher than MNAN, an ANA log size based on NN may result in a very large log page and cause a failure to get the ANA log, e.g. if it is larger than the controller's MDTS. Fix: replace cdata->nn with cdata->mnan in nvme_ctrlr_init_ana_log_page() Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Change-Id: I2a522dca833a27dddad25848d7688efa23d23091 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13039 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>	2022-06-15 08:10:48 +00:00
Jim Harris	6e9cf0c7b4	test: add ENV_CFLAGS after includes DPDK CFLAGS get put into CFLAGS in mk/cc.flags.mk, which for system package installed DPDK will include extra paths like /usr/include/<arch-3-tuple>/dpdk. If a Makefile adds its own CFLAGS before including the .mk fragment that pulls in these CFLAGS, we won't actually get those cc.flags.mk applied since they are defined with ?=. This may need to be revisited - using ?= for these has evolved through several iterations of our SPDK configured flag files - starting with commit `08ec96eb`. But for now, let's just fix these few Makefiles. Fixes issue #2548. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I9863db1b37b31907b4088f58cc13b81ed1bb8632 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12982 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-06-10 07:56:02 +00:00
Jim Harris	488570ebd4	Replace most BSD 3-clause license text with SPDX identifier. Many open source projects have moved to using SPDX identifiers to specify license information, reducing the amount of boilerplate code in every source file. This patch replaces the bulk of SPDK .c, .cpp and Makefiles with the BSD-3-Clause identifier. Almost all of these files share the exact same license text, and this patch only modifies the files that contain the most common license text. There can be slight variations because the third clause contains company names - most say "Intel Corporation", but there are instances for Nvidia, Samsung, Eideticom and even "the copyright holder". Used a bash script to automate replacement of the license text with SPDX identifier which is checked into scripts/spdx.sh. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iaa88ab5e92ea471691dc298cfe41ebfb5d169780 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12904 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: <qun.wan@intel.com>	2022-06-09 07:35:12 +00:00
Jim Harris	89c939e0a6	Eliminate license header differences. There are a few places where a typo, extra character or newline was added to the BSD 3-clause license text which made it differ very slightly from the rest of the license headers in the source tree. Remove those differences in this patch, to help with automation of SPDX identifier replacement in the next patch. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I542dc53cd252b1699253fd6dcc3ccac9643d7878 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12905 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-06-06 07:34:55 +00:00
xiaoxiangxzhang	abdefd22e3	UT/nvmf_rdma:test_nvmf_rdma_resize_cq Signed-off-by: xiaoxiangxzhang <xiaoxiangx.zhang@intel.com> Change-Id: I71171cf37b2bce09c4e61dc342024dc6de3ad60d Signed-off-by: xiaoxiangxzhang <xiaoxiangx.zhang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12323 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: GangCao <gang.cao@intel.com>	2022-06-01 13:20:53 +00:00
xiaoxiangxzhang	acfd87ca96	UT/nvmf_ctrlr:test for get/set features_host_behavior_support Signed-off-by: xiaoxiangxzhang <xiaoxiangx.zhang@intel.com> Change-Id: I3b23d11d8dd2be6f260b5be6d92afe8116cbc0c8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12287 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: GangCao <gang.cao@intel.com>	2022-06-01 08:59:05 +00:00
Jim Harris	64df311eba	nvme: add KEYED_DATA_BLOCK to sgl_types This SGL type was missed in the original commit that added the pretty printing. Fixes: `4d9ab1e9a1` ("nvme: pretty print dptr") Reported-by: Ramanjaneya Burugula <burugula@gmail.com> Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ibc655db4e65009071f39f55f691c94a094cea0bc Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12705 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-05-25 07:43:03 +00:00
Tomasz Zawadzki	b727e804d6	vhost: add virtio_blk abstraction This patch adds a virtio_blk abstraction for custom transports, with 'vhost_user_blk' being the first one used. Added spdk_virtio_blk_transport_ops describing the necessary callbacks to be implemented by each transport. Please use SPDK_VIRTIO_BLK_TRANSPORT_REGISTER to register the transport. Transports can use virtio_blk_process_request() to process the incoming I/O from their queues. virtio_blk_create_transport RPC was added to create one of the registered transports, possibly with custom JSON arguments. Added 'transport' argument to vhost_create_blk_controller RPC, to specify which transport should create the controller. By default the vhost_user_blk transport is used. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ic9d93a6e0f483796eb56b7174a678e41a6ea4808 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9540 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-05-23 17:31:16 +00:00
Tomasz Zawadzki	07e31b028a	ut/vhost: select vhost_backend for UT As of right now the UT always used the empty structure of struct spdk_vhost_dev_backend during the test. This meant VHOST_BACKEND_BLK. alloc_vdev() will require further changes to test both types of backends. So for now change it to VHOST_BACKEND_SCSI, since it currently does not touch any fields outside of the struct spdk_vhost_dev. Meanwhile next patch will do so for blk backend. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ib5af7520bc8a21a7af03b810d4cc42726797a331 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12749 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-05-20 19:40:56 +00:00
Tomasz Zawadzki	91426dc600	ut/vhost: add vhost_blk.c and stubs Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I5218d6ea95f6edb6f664bad75b17c68c0760d637 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10977 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-05-20 19:40:56 +00:00
Tomasz Zawadzki	69820927da	ut/vhost: initialize vhost libraries Vhost library was not initialized as part of the test, it will become necessary later in the series. Suite startup/cleanup have no matching CUnit test case, so only assert() can be used. Rather than CU_ASSERT(). Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ieaa3d2f6b6f1899105362181f285f585ff9724d7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10945 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-05-20 19:40:56 +00:00
Alexey Marchuk	619b4dba8a	lib/reduce: Check if user's buffer crosses huge page boundary If the compress driver doesn't support SGL input or output, then we need to copy the user's buffers into reduce internal buffers. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I0c07243a5b668d0e0adcc153e5b573f59c26ab64 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12281 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-05-20 17:39:57 +00:00
Alexey Marchuk	b86e85f56f	lib/reduce: Properly allocate comp/decomp buffers Reduce library allocates one big chunk of memory and then splits it between requests. The problem is that a chunk of memory assigned to a request may cross a huge page boundary, and if the compress driver doesn't support SGL input or output, the operation will fail. To avoid this problem, align buffer start on 2MiB and check each chunk of memory if it crosses huge page boundary. Fixes issue #2454 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ie730b8ba928f27a43bde1222b6c18d29b797575a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12249 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-05-20 17:39:57 +00:00
Jonas Pfefferle	192e64bcc5	bdev: spdk_bdev_ext_io_opts missing size check ext_io_opts uses the size member to allow backwards compatibility however currently we only check if it is below or equal the current size of the opts struct and that it is not 0. size is only used when we copy opts because of split or push/pull. This patch introduces size checks to allow safe access to e.g. metadata and memory domain pointers of the user provided opts pointer. The minimum size of the struct passed is now the size of the initial version of spdk_bdev_ext_io_opts. To not introduce additional checks when opts are consumed by a bdev module we now always copy if the size is smaller than the current opts struct size. When introducing new members to opts additional checks might be needed if those are directly accessed through the passed pointer or bdev_io->internal.ext_opts. Change-Id: Ibd181a5840a3d5022018a9f61403df961ffd6e1d Signed-off-by: Jonas Pfefferle <pepperjo@japf.ch> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12550 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-05-20 15:55:50 +00:00
GangCao	7cfb12f437	Bdev/Lvol: check base bdev's md before examining To fix issue #2514 Change-Id: If507382202e729f5934a354e2515a035ad5aeb0c Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12750 Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-05-20 09:18:18 +00:00
Shuhei Matsumoto	e4584d937e	bdev/nvme: Poll adminq more often during ctrlr disconnection During ctrlr reconnection, spdk_nvme_ctrlr_reconnect_poll_async() is executed by a non-timed poller. We should poll adminq more often during ctrlr disconnection too. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ib1f5b41015aed20deda8df6f2c837981ac233c04 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12615 Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-05-20 09:17:28 +00:00
Shuhei Matsumoto	fcf52fbff5	bdev/nvme: Reversed orderings for reset between PCIe and NVMe-oF As described in the NVMe specification, a controller level reset includes the following actions: - the controller stops processing any outstanding admin or I/O commands; - all I/O SQs and CQs are deleted. In a full controller reset sequence for a PCIe controller, if we do a controller level reset first, we can abort outstanding commands after the hardware has actually been stopped. For NVMe-oF controller, each I/O qpair is an independent network connection and is disconnected safely. We do not want to change NVMe-oF controller. Fixes the issue #2360 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: If05febac74705bfd3df5abd15064c1203126e027 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12447 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-05-20 09:17:28 +00:00
Shuhei Matsumoto	736b9da034	nvme: Do Controller Level Reset when disconnecting adminq for PCIe As described in the previous patches, we need to delete all I/O SQ/CQs before aborting trackers when disconnecting a controller. The following patches reorder the operations. This patch changes adminq disconnection to initiate a Controller Level Reset and adminq completion processes it if ctrlr->is_disconnecting is true. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I64f06bae2ce8a9127124029fd042db0028198e3c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12560 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-05-19 08:23:57 +00:00
Alexey Marchuk	1eca87c39c	blobstore: Preallocate md_page for new cluster When a new cluster is added to a thin provisioned blob, md_page is allocated to update extents in base dev. This memory allocation reduces performance, it can take 250usec - 1 msec on ARM platform. Since we may have only 1 outstanding cluster allocation per io_channel, we can preallocate md_page on each channel and remove dynamic memory allocation. With this change blob_write_extent_page() expects that md_page is given by the caller. Since this function is also used during snapshot deletion, this patch also updates this process. Now we allocate a single page and reuse it for each extent in the snapshot. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I815a4c8c69bd38d8eff4f45c088e5d05215b9e57 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12129 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-05-18 09:02:02 +00:00
GangCao	7bcd316de1	bdev: abort all IOs when unregistering the bdev To fix issue: #2484 When unregistering the bdev, a message is sent to each thread to abort all the IOs, including IOs from the nomem_io queue, need_buf_small queue and need_buf_large queue. The SPDK_BDEV_STATUS_UNREGISTERING state is newly added to indicate this unregister operation. In this case, the bdev unregister operation becomes an async operation, as each thread will be sent the message to abort the IOs and, as the last step, it will unregister the required bdev and associated io device. On the other hand, the queued_resets will be handled separately and not aborted in the bdev unregister. New unit test cases are also added: enomem_multi_bdev_unregister: to abort the IO from nomem_io queue during the unregister operation bdev_open_ext_unregister: to handle the events and async operations from the unregister operation Change-Id: Ib1663c0f71ffe87144869cb3a684e18eb956046b Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12573 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Dong Yi <dongx.yi@intel.com>	2022-05-18 07:30:00 +00:00
Alexey Marchuk	007fb1d3cb	nvme: Fix keyed/unkeyed SGL nvme cmd dump Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I0a08518b5c30455a17158aa440715515d0c066fc Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12133 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-05-17 20:11:43 +00:00
Shuhei Matsumoto	00d46b80b2	bdev/nvme: Disable automatic failback in multipath mode By default, failback to the preferred I/O path is done automatically if it is restored. Some users may want to keep using the backup I/O path even if the preferred I/O path is restored. In this case, bdev_nvme_set_preferred_path can be used to do manual failback. We may be able to clear/fill I/O path cache more strictly but it will be complicated and have bugs. This patch does the minimal change, just skips an apparent case. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I78fe5faee6ff04e88ae3d7c6be6da1c20637c912 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12431 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-05-17 12:54:45 +00:00
Alexey Marchuk	b0262063d3	vbdev_lvol: Report memory domains Update functional test to verify that lvol supports memory domains Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I5e91eedc8879359c3add45d417b6f3eaad4d75b9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11375 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-05-16 10:14:26 +00:00
Alexey Marchuk	248ccd8607	lvol: Use blobstore ext API in data path The new blobstore ext API is used when the user provides ext_io_opts in bdev layer. To store blobstore ext_io_opts, vbdev_lvol reports non-zero get_ctx_size in bdev module interface. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I64076b5369533be0c1d69ca48aef9d70a9abe488 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11373 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-05-16 10:14:26 +00:00
Alexey Marchuk	a236084542	blob: Add readv/writev_ext functions These function accept optional spdk_blob_ext_io_opts structure. If this structure is provided by the user then readv/writev_ext ops of base dev will be used in data path Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I370dd43f8c56f5752f7a52d0780bcfe3e3ae2d9e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11371 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-05-16 10:14:26 +00:00
Alexey Marchuk	ba8f1a9e5d	blob: Add readv/writev ext ops to spdk_bs_dev Introduce spdk_blob_ext_io_opts structure which is used in the new *_ext functions. Zeroes dev is updated with implementation of readv_ext which uses memory domains memzero or regular memset(). Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Id94542196eff999827bf00591fd43804256fccb4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11369 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-05-16 10:14:26 +00:00
Alexey Marchuk	5fd9561f54	dma: Add memzero function Add functions to set and call memzero callback to memory domains library. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ia6ddc3c9e0ca6e9172189964d180444e5da71d30 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12343 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-05-16 10:14:26 +00:00
Shuhei Matsumoto	5e5423de93	nvme: Add DISABLED to ctrlr's state to show completion of Controller Level Reset In the following patches, nvme_ctrlr_process_init() will be used to disable the controller when disconnecting the admin qpair for PCIe transport. In this case, we will have to exit nvme_ctrlr_process_init() after CSTS.RDY is 0. However, spdk_nvme_ctrlr_reset() and spdk_nvme_ctrlr_reconnect_poll_async() have to continue nvme_ctrlr_process_init() until the controller becomes ready. To differentiate stop and continue clearly, add a new state NVME_CTRLR_STATE_DISABLED to enum nvme_ctrlr_state. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ic0a5fb7114d4eeb1cefec28bc404184768fb0a96 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12613 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-05-12 07:28:02 +00:00
paul luse	d58a2f6cc5	lib/accel: support multiple accel modules (aka engines) at once We enable multiple engines by: * getting rid of the globals that point to the one available HW and one available SW engine * adding a submit_tasks() entry point for the SW engine so that it is treated like any other engine allowing us to just call submit_tasks() to the assigned engine for the opcode instead of checking what is supported * changing the definition of engine capabilities from "HW accelerated" to simply "supported" * during init, use a global (g_engines_opc) that contains engines and is indexed by opcode so we know what the best engine is for each op code * future patches will add RPC's to override engine priorities or specifically assign an opcode(s) to an engine. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: I9b9f3d5a2e499124aa7ccf71f0da83c8ee3dd9f9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11870 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-05-05 07:11:32 +00:00
Shuhei Matsumoto	8f9b977504	bdev/nvme: Add active/active policy for multipath mode The NVMe bdev module supported active-passive policy for multipath mode first. By this patch, the NVMe bdev module supports active-active policy for multipath mode next. Following the Linux kernel native NVMe multipath, the NVMe bdev module supports round robin algorithm for active-active policy. The multipath policy, active-passive or active-active, is managed per nvme_bdev. The multipath policy is copied to all corresponding nvme_bdev_channels. Different from active-passive, active-active caches even non_optimized path to provide load balance across multiple paths. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ie18b24db60d3da1ce2f83725b6cd3079f628f95b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12001 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-05-05 07:11:24 +00:00
Shuhei Matsumoto	22b77a3c80	bdev/nvme: Set preferred I/O path in multipath mode If we specify a preferred path manually for each NVMe bdev, we will be able to realize a simple static load balancing and make the failover more controllable in the multipath mode. The idea is to move I/O path to the NVMe-oF controller to the head of the list and then clear the I/O path cache for each NVMe bdev channel. We can set the I/O path to the I/O path cache directly but it must be conditional and make the code very complex. Hence, let find_io_path() do that. However, an NVMe bdev channel may be acquired after setting the preferred path. To cover such case, sort the nvme_ns list of the NVMe bdev too. This feature supports only multipath mode. The NVMe bdev module supports failover mode too. However, to support the latter, the new RPC needs to have trid as parameters and the code and the usage will become very complex. Add a note for such limitation. To verify one by one exactly, add unit test. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ia51c74f530d6d7dc1f73d5b65f854967363e76b0 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12262 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: <tanl12@chinatelecom.cn> Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-05-05 07:11:24 +00:00
Jim Harris	81a3b8a596	nvmf: make nacwu 0-based spdk_bdev_get_acwu() is a 1-based number, so we need to subtract 1 from it before assigning the value to nsdata->nacwu. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I32708b28a35670cba6013a48b79389fa48226285 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12399 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-29 07:29:06 +00:00
Richael Zhuang	9bff828f99	sock: introduce dynamic zerocopy according to data size MSG_ZEROCOPY is not always effective as mentioned in https://www.kernel.org/doc/html/v4.15/networking/msg_zerocopy.html. Currently in spdk, once we enable sendmsg zerocopy, then all data transferred through _sock_flush are sent with zerocopy, and vice versa. Here dynamic zerocopy is introduced to allow data sent with MSG_ZEROCOPY or not according to its size, which can be enabled by setting "enable_dynamic_zerocopy" as true. Test with 16 P4610 NVMe SSD, 2 initiators, target's and initiators' configurations are the same as spdk report: https://ci.spdk.io/download/performance-reports/SPDK_tcp_perf_report_2104.pdf For posix socket, rw_percent=0(randwrite), it has 1.9%~8.3% performance boost tested with target 1~40 cpu cores and qdepth=128,256,512. And it has no obvious influence when read percentage is greater than 50%. For uring socket, rw_percent=0(randwrite), it has 1.8%~7.9% performance boost tested with target 1~40 cpu cores and qdepth=128,256,512. And it still has 1%~7% improvement when read percentage is greater than 50%. The following is part of the detailed data. 
posix: qdepth=128
rw_percent          0            |          30
cpu   origin  thisPatch  opt     |  origin  thisPatch  opt
1     286.5   298.5      4.19%   |  307     304.15     -0.93%
4     1042.5  1107       6.19%   |  1135.5  1136       0.04%
8     1952.5  2058       5.40%   |  2170.5  2170.5     0.00%
12    2658.5  2879       8.29%   |  3042    3046       0.13%
16    3247.5  3460.5     6.56%   |  3793.5  3775       -0.49%
24    4232.5  4459.5     5.36%   |  4614.5  4756.5     3.08%
32    4810    5095       5.93%   |  4488    4845       7.95%
40    5306.5  5435       2.42%   |  4427.5  4902       10.72%

posix: qdepth=512
rw_percent          0            |          30
cpu   origin  thisPatch  opt     |  origin  thisPatch  opt
1     275     287        4.36%   |  294.4   295.45     0.36%
4     979     1041       6.33%   |  1073    1083.5     0.98%
8     1822.5  1914.5     5.05%   |  2030.5  2018.5     -0.59%
12    2441    2598.5     6.45%   |  2808.5  2779.5     -1.03%
16    2920.5  3109.5     6.47%   |  3455    3411.5     -1.26%
24    3709    3972.5     7.10%   |  4483.5  4502.5     0.42%
32    4225.5  4532.5     7.27%   |  4463.5  4733       6.04%
40    4790.5  4884.5     1.96%   |  4427    4904.5     10.79%

uring: qdepth=128
rw_percent          0            |          30
cpu   origin  thisPatch  opt     |  origin  thisPatch  opt
1     270.5   287.5      6.28%   |  295.75  304.75     3.04%
4     1018.5  1089.5     6.97%   |  1119.5  1156.5     3.31%
8     1907    2055       7.76%   |  2127    2211.5     3.97%
12    2614    2801       7.15%   |  2982.5  3061.5     2.65%
16    3169.5  3420       7.90%   |  3654.5  3781.5     3.48%
24    4109.5  4414       7.41%   |  4691.5  4750.5     1.26%
32    4752.5  4908       3.27%   |  4494    4825.5     7.38%
40    5233.5  5327       1.79%   |  4374.5  4891       11.81%

uring: qdepth=512
rw_percent          0            |          30
cpu   origin  thisPatch  opt     |  origin  thisPatch  opt
1     259.95  276        6.17%   |  286.65  294.8      2.84%
4     955     1021       6.91%   |  1070.5  1100       2.76%
8     1772    1903.5     7.42%   |  1992.5  2077.5     4.27%
12    2380.5  2543.5     6.85%   |  2752.5  2860       3.91%
16    2920.5  3099       6.11%   |  3391.5  3540       4.38%
24    3697    3912       5.82%   |  4401    4637       5.36%
32    4256.5  4454.5     4.65%   |  4516    4777       5.78%
40    4707    4968.5     5.56%   |  4400.5  4933       12.10%

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Change-Id: I730dcf89ed2bf3efe91586421a89045fc11c81f0 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12210 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-28 07:29:28 +00:00
Alex Michon	2bc134eb4b	bdev/nvme: Fix aborting fuse commands When sending a fused compare and write command, we pass a callback bdev_nvme_comparev_and_writev_done that we expect to be called twice before marking the io as completed. In order to detect if a call to bdev_nvme_comparev_and_writev_done is the first or the second one, we currently rely on the opcode in cdw0. However, cdw0 may be set to 0, especially when aborting the command. This may cause use-after-free issues and this may call the user callbacks twice instead of once. Use a bit in the nvme_bdev_io instead to keep track of the number of calls to bdev_nvme_comparev_and_writev_done. Signed-off-by: Alex Michon <amichon@kalrayinc.com> Change-Id: I0474329e87648e44b08998d0552b2a9dd5d34ac2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12180 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-26 07:47:09 +00:00
Konrad Sztyber	3056c8ac02	nvmf/tcp: delay qpair destruction This patch adds an extra spdk_thread_send_msg() call to destroy a qpair to make sure that it isn't freed from the context of a socket write callback. Otherwise, spdk_sock_close() won't abort pending requests, causing their completions to be executed after the qpair is freed. Fixes #2471 Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ia510d5d754baccca1e444afdb10696ab9b58e28b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12332 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-25 07:36:05 +00:00
Shuhei Matsumoto	494eb6e58b	bdev: Fix race among bdev_reset(), bdev_close(), and bdev_unregister() There is a race condition when a bdev is unregistered while reset is submitted from the upper layer very frequently. spdk_io_device_unregister() may fail because it is called while spdk_for_each_channel() is processed. spdk_io_device_unregister io_device bdev_Nvme0n1 (0x7f4be8053aa1) has 1 for_each calls outstanding To avoid this failure, defer calling spdk_io_device_unregister() until reset completes if reset is in progress when unregistration is ready to do, and then reset completion calls spdk_io_device_unregister() later. A bdev cannot be opened if it is already deleting. So we do not need to hold mutex. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ida1681ba9f3096670ff62274b35bb3e4fd69398a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12222 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-04-22 09:45:14 +00:00
Shuhei Matsumoto	50b6329ca0	bdev/nvme: Factor out ctrlr info json dump into a helper function Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I7f1e08ff13d890cb780e7b66c18a77ab85c82029 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12311 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-22 09:44:57 +00:00
Shuhei Matsumoto	13ca6e52d3	bdev/nvme: Handle ANA transition (change or inaccessible state) correctly Previously, if a namespace is in ANA inaccessible state, I/O had been queued infinitely. Fix this issue according to the NVMe spec. Add a temporary poller anatt_timer and a flag ana_transition_timedout for each nvme_ns. Start anatt_timer if the nvme_ns enters ANA transition. If anatt_timer is expired, set ana_transition_timedout to true. Cancel anatt_timer or clear ana_transition_timedout if the nvme_ns exits ANA transition. nvme_io_path_become_available() returns false if ana_transition_timedout is true. Add unit test case to verify these additions. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ic76933242046b3e8e553de88221b943ad097c91c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12194 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>	2022-04-22 09:44:57 +00:00
Ben Walker	e22c933edb	idxd: Make many internal idxd_user functions take an idxd_user object This reduces a lot of casting. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ibc04f422858642d0e20c9b020bb6c5d1b70256fe Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11534 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-04-20 08:20:45 +00:00
Shuhei Matsumoto	4b73223542	nvme_rdma: Wait until lingering qpair becomes quiet before completing disconnection The code to handle the lingering qpair when deleting it was really complicated. The RDMA transport can connect or disconnect qpair asynchronously. Then we can include the code to handle the lingering qpair into the code to disconnect qpair now. If the disconnected qpair is still busy, defer completion of the disconnection until qpair becomes idle. If poll group is not used, we can complete disconnection immediately because cq is already destroyed. The related data and unit test cases are not necessary anymore. So delete them in this patch. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ic8f81143fcad0714ac9b7db862313aa8094eeefb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11778 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	9717b0c3df	nvme_rdma: Connect and disconnect qpair asynchronously Add three states, INITIALIZING, EXITING, and EXITED to the rqpair state. Add async parameter to nvme_rdma_ctrlr_create_qpair() and set it to opts->async_mode for I/O qpair and true for admin qpair. Replace all nvme_rdma_process_event() calls by nvme_rdma_process_event_start() calls. nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to INITIALIZING when starting to process CM events. nvme_rdma_ctrlr_connect_qpair_poll() calls nvme_rdma_process_event_poll() with ctrlr->ctrlr_lock if qpair is not admin qpair. nvme_rdma_ctrlr_disconnect_qpair() returns if qpair->async is true or qpair->poll_group is not NULL before polling CM events, or polls CM events until completion otherwise. Add comments to clarify why we do like this. nvme_rdma_poll_group_process_completions() does not process submission for any qpair which is still connecting. Change-Id: Ie04c3408785124f2919eaaba7b2bd68f8da452c9 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11442 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Tomasz Zawadzki	6f89388ed3	lib/vhost: move vhost_user related fields from spdk_vhost_dev spdk_vhost_dev structure should only contain generic fields that are to be used by either vhost, vhost_blk or vhost_scsi layer. The vhost_user backend can hold its properties in spdk_vhost_user_dev, which is maintained within rte_vhost. Both structures contain references back to each other. The reference in spdk_vhost_dev is a void pointer to allow future transports to keep the reference to their own structures. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I68640c524426d885c20242146365ba242fa9df8e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11813 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-15 07:49:32 +00:00
paul luse	37b68d7287	accel: cleanup by getting rid of capabilities enum In support of upcoming patches and to greatly simplify things, the capabilities enum which held bit positions for each opcode has been removed. Only the opcodes enum remains and thus only opcodes are used throughout. For the capabilities bitmap a helper function is added to convert from opcode to bit position. Right now it is used in the IO path but in upcoming patches that goes away and the conversion is only done at init time. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: Ic4ad15b9f24ad3675a7bba4831f4e81de9b7bc70 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11949 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-14 08:32:50 +00:00
Ziv Hirsch	e749fa9c27	nvmf: fix buffer overflow on admin commands When req->iovcnt is bigger than 1, `memset(req->data, 0, req->length)` is wrong. Signed-off-by: Ziv Hirsch <zivhirsch13@gmail.com> Change-Id: Ie53eba686b4c5889bbde3b3644d51acbef303b42 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12216 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-14 08:31:35 +00:00
Tomasz Zawadzki	f9fccbae63	lib/vhost: separate out vhost_user backend callbacks Previously spdk_vhost_dev_backend held callbacks for vhost_blk and vhost_scsi functionality, along with ones that are called by the vhost_user backend. This patch separates out those callbacks into two structures: - spdk_vhost_dev_backend - to be implemented by vhost_blk and vhost_scsi - spdk_vhost_user_dev_backend - is only implemented by vhost_user backend, callbacks for session management specific to that transport Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I348090df5dddeb2b1945b082b85aec53d03c781b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11812 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-11 07:44:09 +00:00
Ben Walker	3edf1e200e	test/bdev: In bdev_nvme_ut, handle spdk_nvme_poll_group_remove when there is no group The real implementation handles this by returning -ENOENT, so do the same in the test. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I405b6f60bf4dcdb22c57e48bbaf66d57522a49c5 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11508 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>	2022-04-07 07:23:56 +00:00
Ben Walker	2250a441c4	test/bdev: In bdev_nvme_ut, count a disconnect as "activity" Count disconnecting a queue pair as activity so that the unit test poll_threads() calls don't bail out until the disconnectedd_qpair_cb is called at least once. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Idc437d6c589dbf133bfcbb5edba1087f928a718c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11507 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>	2022-04-07 07:23:56 +00:00
Ben Walker	c86778398b	bdev/nvme: Remove ctrlr from nvme_ctrlr_channel This was neither set nor used. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I3119135843c5fc0b8724e593db40df46e6b5bdb0 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12097 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-07 07:23:56 +00:00
yupeng	64eebbd132	bdev/raid: Add concat module The concat module can combine multiple underlying bdevs to a single bdev. It is a special raid level. You can add a new bdev to the end of the concat bdev, then the concat bdev size is increased, and it won't change the layout of the existing data. This is the major difference between concat and raid0. If you add a new underlying device to raid0, the whole data layout will be changed. So the concat bdev is extendable. Change-Id: Ibbeeaf0606ff79b595320c597a5605ab9e4e13c4 Signed-off-by: Peng Yu <yupeng0921@gmail.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11070 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-05 07:39:00 +00:00
Shuhei Matsumoto	428b17a0a8	bdev: Add spdk_for_each_bdev/bdev_leaf for clean up and further improvements To execute a callback function for each registered bdev or unclaimed bdev, add new public APIs, spdk_for_each_bdev() and spdk_for_each_bdev_leaf(). These functions are safe for race conditions by opening before and closing after executing the provided callback function. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I59b702ffec7b4fc5e9779de5a3a75d44922b829b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12088 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-05 07:30:47 +00:00
Alexey Marchuk	be440c01c9	raid: Report memory domains Use spdk_bdev_readv/writev_block_ext even when there is no ext opts passed by bdev layer Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I0b9f17150cdba1a1023478bae745ab4438ea99bb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10070 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-04 09:57:56 +00:00
Alexey Marchuk	99719ef049	raid0: Use extended bdev rw API That is a preparation for support of memory domains in bdev_raid Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I3a6e01eccd4d7e4bc197dc5ffe268d42081d41de Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11429 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-04 09:57:56 +00:00
Alexey Marchuk	1299439f3d	bdev: pull/push data if bdev doesn't support memory domains If bdev doesn't support any memory domain then allocate internal bounce buffer, pull data for write operation before IO submission, push data to memory domain once IO completes for read operation. Update test tool, add simple pull/push functions implementation. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ie9b94463e6a818bcd606fbb898fb0d6e0b5d5027 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10069 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-04-04 09:57:56 +00:00
Shuhei Matsumoto	4573e4cc23	module/bdev: Use spdk_bdev_unregister_by_name() if possible Replace spdk_bdev_get_by_name() + spdk_bdev_unregister() by spdk_bdev_unregister_by_name() wherever possible. This simplifies the code and makes the code more reliable. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I91388c9d0b2e244cb745720a480803b03c42a226 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12066 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-04 09:57:43 +00:00
Shuhei Matsumoto	96c007d301	bdev: Add spdk_bdev_unregister_by_name() to handle race conditions To unregister a bdev more correctly, we had to call spdk_bdev_open_ext(), spdk_bdev_desc_get_bdev(), spdk_bdev_unregister(), and then spdk_bdev_close(). This was correct but complicated. Hence add a new public API spdk_bdev_unregister_by_name() which does the whole correct sequence of bdev unregistration. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I9068d4ac49dca944436e0ba587308fd356dfef75 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12065 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-04 09:57:43 +00:00
Tomasz Zawadzki	6301f8915d	lib/sock: provide a hint to picking optimal poll group The process of matching qpair to poll group is split into two distinct parts that occur on different threads. See spdk_nvmf_tgt_new_qpair(). This results in a race condition for TCP between spdk_sock_map_lookup() and spdk_sock_map_insert(), which are called in spdk_nvmf_get_optimal_poll_group() and spdk_nvmf_poll_group_add() respectively. Fixes #2113 This patch picks a hint from nvmf_tcp for next poll group, which is then passed down to spdk_sock_map_lookup(). When matching placement_id exists, but does not have a poll group assigned - the hint will be used. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I4abde2bc9c39225c9f5dd7c3654fa2639bb0a27f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10271 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-01 12:41:26 +00:00
Chunsong Feng	0db0c443df	nvmf/rdma: Improve read performance in DIF strip mode The rdma buffer for stripping DIF metadata is added. CPU strips the DIF metadata and copies it to the rdma buffer, improving the rdma write bandwidth. The network bandwidth during 4KB random read test is increased from 79 Gbps to 99 Gbps, the IOPS is increased from 2075K to 2637K. Fixes issue #2418 Signed-off-by: Chunsong Feng <fengchunsong@huawei.com> Change-Id: If1c31256f0390f31d396812fa33cd650bf52b336 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11861 Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-01 11:19:18 +00:00
Chunsong Feng	05dd3c0bb2	dif: enhance copy API to support block-aligned bounce_iov When iovs are copied from bounce or to bounce, the bounce is usually allocated from data_buf_pool for better performance, and is multi iovs instead of a single buffer. Therefore, block-aligned bounce are supported. Signed-off-by: Chunsong Feng <fengchunsong@huawei.com> Change-Id: If56b21d9e46c73d4c956c227bec33ddd0ab9745b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11860 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:29:12 +00:00
Shuhei Matsumoto	0a61427ecc	nvme_rdma: Start qpair after resolving address and route when poll group is used Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I0b0f314c98368247582f2dfcaf69f78e24d715f9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11366 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Tomasz Zawadzki	91f2725291	lib/sock: fix lookup on placement_id with NULL sock_group spdk_sock_map_insert() allows for allocating a sock_map entry, without assigning any sock_group. This is useful for cases where placement_id determined by the component using spdk_sock_map_*. See PLACEMENT_MARK mode. Placement_id's are allocated first, then an empty one is found using spdk_sock_map_find_free(). Since the above is a valid use case, then entry in sock_map can exist without a group assigned. spdk_sock_map_lookup() has to handle such cases, rather than trigger an assert. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ia717c38fef5e71fe44471ea12f61a5548463f0cf Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10725 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: wanghailiang <hailiangx.e.wang@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:25 +00:00
Tomasz Zawadzki	dfeab17ef6	ut/sock: add unit tests for spdk_sock_map The usage of internal API for sock_map was not unit tested, so far. This patch adds first set of UT for the sock_map, expanding it and fixing some issues later in the series. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Idfce8e19668a87f1d45d73310edb17d71d9f8bd8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10724 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-01 08:28:25 +00:00
paul luse	b9d44da07d	lib/idxd: Further simplify WQ configuration code As we now only support a single WQ, there's no need for a table of them and no need to assert that the stride from WQ to WQ is the same as the WQ struct size. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: I205f36aae22070f532653726dd75249bbafbe3ef Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12081 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2022-03-31 17:59:21 +00:00
paul luse	e68aebd50b	lib/accel: remove public API for getting capabilities First in a series of patches that will enable multiple engines to exist at once and choose the best one based on their priorities and capabilities, the public API will no longer be needed. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: Ia87b83aa2263745a94a822a160b6e97bb2e0dc19 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11948 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-31 09:36:25 +00:00
Alexey Marchuk	f530abcab3	bdev/compress: Verify mbuf chain if the driver doesn't support SGL With recent changes libreduce should provide correct buffers if the driver doesn't support SGL in/out. This patch verifies that we don't use SGLs when they are not supported. Since even a single buffer can be split on 2MB page boundary, it is not enough just to check iovs count. Added asserts that the first elements of mbufs are not null to avoid scan build errors Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I620e43bf5b1abd25cab412fe08346a6d767c9be9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11973 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-03-31 09:34:52 +00:00
Alexey Marchuk	731ddc7107	bdev/compress: Correctly free mbufs in error case rte_pktmbuf_free frees the given mbuf and any chained mbufs. It can cause double free of some mbuf if we free every mbuf in a loop. Instead use rte_pktmbuf_free_bulk which correctly release chained mbufs. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I55fd7832ff656f519a4ed2f02de8ef1a0f637a02 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11972 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-03-31 09:34:52 +00:00
Alexey Marchuk	42f59f5006	lib/reduce: Copy user's buffers if SGL is not supported In the compression operation we may have SGL input if user's buffer is fragmented or less than chunk_size. If the backing device doesn't support SGL input then we should copy user's buffers into decomp_buffer (including paddings if any). In the decompression operation, if the backing device doesn't support SGL output, we use a single output buffer which is pointing to decomp_buffer. Once the operation completes, we should copy the result into user's buffers. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ic7fddd38374bb6898256633eacd192dbaf36541a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11970 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-31 09:34:52 +00:00
Ben Walker	7dfe90df60	idxd: Remove idxd_group altogether The driver always creates a single group containing all of the engines and a single work queue. Change-Id: I83f170f966abbd141304c49bd75ffe4608f5ad03 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11533 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-03-25 12:49:22 +00:00
Ben Walker	9de35e7fc8	idxd: Remove idxd_wq It is not used for anything. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I1d967b2d0e404756f7ceda98ddc4ee9017ec83f7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11489 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-03-25 12:49:22 +00:00
Ben Walker	225cf4b6ed	idxd: Remove idxd_wqcfg from idxd_wq It turns out that this can stay on the stack. Change-Id: I961366307dae5ec7413a86271cd1dfb370b8f9f3 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11488 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-03-25 12:49:22 +00:00
Ben Walker	7a9b023008	idxd: Don't cache any register values These aren't ever accessed in the main I/O path, so we can read them in whenever we need them and make the code a lot simpler. Change-Id: Icfdbfe9f2d9db13f4d0d28b2b4103cd0c443bcf4 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11485 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-03-25 12:49:22 +00:00
Ben Walker	b3d3f2028b	idxd: Eliminate config struct from idxd_user This is no longer needed. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I08c788ca0451e739804b568d613c1e52e071c61f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11794 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-03-25 08:20:08 +00:00
paul luse	dacb66d7f4	module/accel/ioat: fix bug with 'fill' handling Fill is sent in as a uint8, we need to populate the full uint64 input with the uint8 pattern or we'll get a miscompare. This is how idxd was doing it, instead of adding the same code to ioat just move it up a layer. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: Ia4aab1c6230f35ab88bb8a0e3b8e16dbd93007c7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11947 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-25 08:18:16 +00:00
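The fill fix above can be illustrated with a minimal C sketch. It assumes only what the commit message states: the caller passes an 8-bit fill value, and the 64-bit pattern handed to the engine must contain that byte in every position. The helper name expand_fill_pattern() is hypothetical and is not taken from the SPDK source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not an SPDK API): replicate an 8-bit fill value
 * into all eight bytes of a 64-bit pattern. memset() writes the same
 * byte to every position, so the result does not depend on endianness.
 */
uint64_t expand_fill_pattern(uint8_t fill)
{
	uint64_t pattern;

	memset(&pattern, fill, sizeof(pattern));
	return pattern;
}

/* Small self-check: every byte of the pattern must equal the input. */
void expand_fill_pattern_self_check(void)
{
	assert(expand_fill_pattern(0x5A) == 0x5A5A5A5A5A5A5A5AULL);
}
```

Passing only the low byte (for example 0x000000000000005A) would fill one byte in eight correctly, which matches the miscompare the commit describes.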
Yuriy Umanets	5ba9b78e17	bdev/crypto: Cleanup with crypto opts fields duplication - Fixed duplication of key, key2, drv_name, cipher, etc., fields in struct bdev_names and struct vbdev_crypto. Moved all of them into the new struct vbdev_crypto_opts, which is re-used by both structs. This also removes duplication in error handling and finalization logic that checks the keys are zeroed out and properly freed. - Moved unhexlify into vbdev rpc code. All keys passed to vbdev already in the binary form. - Provide meaningful error messages in the rpc response on keys validation issues during setup of crypto vbdev. - Updated unit tests. Signed-off-by: Yuriy Umanets <yumanets@nvidia.com> Change-Id: I1fab8771bbbc0cd2f359f0d105fec28fb86893b3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11631 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-24 09:21:35 +00:00
Yuriy Umanets	45f24aebe7	bdev/crypto: MLX5 AES_XTS general support - General MLX5 crypto support. - Unit-tests MLX5 crypto support. - Documentation update to list the MLX5 driver as supported, enumerate the cipher algorithms and provide some configuration hints. Signed-off-by: Yuriy Umanets <yumanets@nvidia.com> Change-Id: I0da1f49f4acd068d75a4d8633f84fe626d774431 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11630 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-03-24 09:21:35 +00:00
Alexis Lescouet	dc52f23536	test/nvme: Fix unittest_nvme failure Fix conditional jump depending on uninitialized qpair->reserved_req value. Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com> Change-Id: I8d3078aa66fa0e2a3bdb06f2c5cdaf3b8ad112bd Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11640 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: John Levon <levon@movementarian.org> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-23 09:02:21 +00:00
Alexey Marchuk	8e7688c5a7	bdev: Copy ext_opts when request is split That is done to correctly handle metadata pointer which is part of ext_opts structure. It will also be used by the next patch to remove memory_domain pointer if request which uses local buffers is split Force the user to set correct ext_opts size, update API functions description. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I77517d70df34a998d718cc6474fb4c538a42918f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11349 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-03-23 09:01:40 +00:00
Alexey Marchuk	c20dd8afee	bdev: Add ext_opts in public bdev_io section Bdev modules must not access internal bdev_io structure, so add a new pointer in a public section. Pointer in internal section will be used in next patch Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ib631563015b3e5fa9300d22b7ae59d8db43c8275 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10421 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-23 09:01:40 +00:00
Alexey Marchuk	c03985a068	bdev: Copy data asynchronously when bounce buffer is used This patch is a preparation for enabling of memory domains pull/push functionality. Since memory domains API is asynchronous, this patch makes asynchronous operations with bounce buffers. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ieb1f2a0c151149af58cfd7607dbde4c76c3c288d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10420 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-03-23 09:01:40 +00:00
Ben Walker	f0bf4e75f5	idxd: Eliminate configs SPDK has settled on what the optimal DSA configuration is, so let's always use it. Change-Id: I24b9b717709d553789285198b1aa391f4d7f0445 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11532 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-03-21 11:05:28 +00:00
Ben Walker	049bb9e41f	idxd: Update register names for idxd_group_flags The names on these were changed. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ib75a60342c08f72dad39635a9244421c1cca5485 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11793 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-21 11:05:28 +00:00
Shuhei Matsumoto	df7c2a2253	nvme: Call ctrlr_disconnect_done() after qpair_process_completions() returns -ENXIO Add a new flag is_disconnecting to struct spdk_nvme_ctrlr. Separate calling nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done() by using the flag is_disconnecting. Additionally, change nvme_ctrlr_fail() to skip setting ctrlr->is_failed to true if ctrlr->is_disconnecting is true. Change-Id: Ie2c74ba41f120662a30f6198751d07005d23abcf Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11000 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	cfe11bd1db	nvme: Factor out operations done after disconnect qpair completes This is a preparation to make nvme_transport_ctrlr_disconnect_qpair() asynchronous. For nvme_transport_ctrlr_disconnect_qpair(), factor out operations after returning from transport's specific ctrlr_disconnect_qpair() into a helper function nvme_transport_ctrlr_disconnect_qpair_done(). Then move nvme_transport_ctrlr_disconnect_qpair_done() into the end of the transport specific ctrlr_disconnect_qpair(). Additionally remove the operation to overwrite the qpair state to DISCONNECTED from nvme_transport_connect_qpair_fail() because this is duplicated and nvme_transport_ctrlr_disconnect_qpair() is responsible to make the qpair disconnected even after it completes asynchronously. Change-Id: I9c8faa7039d306d3e31a8f51826755ce8840a8aa Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10851 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	0b32309bf6	bdev/nvme: Check not only I/O qpair but also adminq when finding optimal I/O path For RDMA transport, adminq will find transport error first because usually only adminq polls CM events. Change-Id: I7b22cc8883bf02198f1a90d2654c1de6f2e736e6 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11331 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	3182be6d26	bdev/nvme: Fail fast I/O qpair if poll_group_process_completions() returns negated errno If qpair is disconnected asynchronously, it takes time from detecting transport error to actually disconnected. We should avoid using the path as soon as possible after detecting any transport error. Poll group clears I/O path cache if it finds transport error and avoid using the path which had transport error. These changes will reduce the failover time. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I00580159a84372a115ed5e62a6ce13eed4368999 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11329 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	aca0d56e3d	bdev/nvme: Reconnect ctrlr after it is disconnected at completion poller spdk_nvme_ctrlr_disconnect() will be made asynchronous in the following patches and so we will need to have some changes. spdk_nvme_ctrlr_disconnect() disconnects adminq and ctrlr synchronously now. If spdk_nvme_ctrlr_disconnect() is made asynchronous, spdk_nvme_ctrlr_process_admin_completions() will complete to disconnect adminq and ctrlr, and will return -ENXIO only if adminq is disconnected. However even now spdk_nvme_ctrlr_process_admin_completions() returns -ENXIO if adminq is disconnected. So as a preparation, set a callback before calling spdk_nvme_ctrlr_disconnect() and call the callback if it is set and spdk_nvme_ctrlr_process_admin_completions() returns -ENXIO. Besides, fix the return value of bdev_nvme_poll_adminq() in this patch. Change-Id: I2559f86bb8cf9a92b5b386ed816c00b08c9832df Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10950 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	a76bbe3553	bdev/nvme: Disconnect and then free I/O qpair in a ctrlr reset sequence As we do when deleting ctrlr_channel, disconnect and then free I/O qpair in a ctrlr reset sequence. Deleting ctrlr_channel and resetting ctrlr_channel may cause conflicts. This patch processes such conflicts correctly. If destroy_ctrlr_channel_cb() is executed between pending and executing reset_destroy_qpair(), reset_destroy_qpair() is not executed because ctrlr_channel is not found. In this case, destroy_qpair_channel() starts disconnecting qpair and deletes ctrlr_channel. Then disconnected_qpair_cb() releases a reference to poll group. If destroy_ctrlr_channel_cb() is executed between executing reset_destroy_qpair() and disconnected_qpair_cb(), destroy_ctrlr_channel_cb() skips ctrlr_channel for a reset sequence. Change-Id: I1f49f74b94aefbea178680aa53ded3a12876c676 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10766 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Yuriy Umanets	8ec34933e9	bdev/crypto: Add qp_desc_nr to struct vbdev_crypto At the moment MLX5 uses different number of qp descriptors than the other pmd crypto drivers. Adding it to vbdev_crypto on init and re-use everywhere we need it. Signed-off-by: Yuriy Umanets <yumanets@nvidia.com> Change-Id: Iea4d4787fc5fd91f27c4a70cf78c5660f09bc854 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11878 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-16 08:20:03 +00:00
Yuriy Umanets	15a5bd8264	bdev/crypto: Rename AES_CBC_IV_LENGTH to IV_LENGTH Since IV length is the same for all pmd crypto drivers, AES_CBC_IV_LENGTH is renamed to IV_LENGTH. Signed-off-by: Yuriy Umanets <yumanets@nvidia.com> Change-Id: If8769db119eb599a17c267e8950f18f5a0ea995b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11875 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-16 08:20:03 +00:00
Shuhei Matsumoto	1285481917	nvme: Free I/O qpair now even if it is in poll group completion spdk_nvme_poll_group has followed spdk_nvme_qpair about how to process I/O qpair deletion inside of a completion context. spdk_nvme_qpair_process_completions() accesses qpair after returning from nvme_transport_qpair_process_completions(). So this is reasonable. On the other hand, if spdk_nvme_poll_group_process_completions() can execute spdk_nvme_ctrlr_free_io_qpair() inside of a completion context, the target qpair is ensured to be deleted after returning from spdk_nvme_ctrlr_free_io_qpair(). Then the target qpair is not accessed anymore in spdk_nvme_poll_group_process_completions(). Remove two variables, in_completion_context and num_qpairs_to_delete, of spdk_nvme_transport_poll_group and the related code. This change is really necessary to support the following case. In the NVMe bdev module, a nvme_qpair has a qpair and a poll_group channel. disconnected_qpair_cb calls spdk_nvme_ctrlr_free_io_qpair() for the qpair and spdk_put_io_channel() to the poll_group_channel. spdk_nvme_ctrlr_free_io_qpair() is executed after unwinding stack but spdk_put_io_channel() is executed now. The callback to spdk_put_io_channel() calls spdk_nvme_poll_group_destroy(). However, spdk_nvme_ctrlr_free_io_qpair() is not executed. Hence spdk_nvme_poll_group_destroy() fails. Update the corresponding stub in unit test together. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Icd1f1daf049c6c7ffb28790fe87989a1060f8952 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11496 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-03-15 09:05:09 +00:00
Shuhei Matsumoto	c113e4cdca	bdev/nvme: Alloc qpair context dynamically on nvme_ctrlr_channel This is another preparation to disconnect qpair asynchronously. Add nvme_qpair object and move the qpair and poll_group pointers and the io_path_list list from nvme_ctrlr_channel to nvme_qpair. nvme_qpair is allocated dynamically when creating nvme_ctrlr_channel, and nvme_ctrlr_channel points to nvme_qpair. We want to keep the times of references at I/O path. Change nvme_io_path to point nvme_qpair instead of nvme_ctrlr_channel, and add nvme_ctrlr_channel pointer to nvme_qpair. nvme_ctrlr_channel may be freed earlier than nvme_qpair. nvme_poll_group lists nvme_qpair instead of nvme_ctrlr_channel and nvme_qpair has a pointer to nvme_ctrlr. By using the nvme_ctrlr pointer of the nvme_qpair, a helper function nvme_ctrlr_channel_get_ctrlr() is not necessary any more. Remove it. Change-Id: Ib3f579d3441f31b9db7d3844ec56c49e2bb53a5d Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11832 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-03-15 09:05:09 +00:00
Shuhei Matsumoto	d7f0a1820e	bdev/nvme: Inline bdev_nvme_destroy_qpair() In the following patches, spdk_nvme_ctrlr_disconnect_io_qpair() will be changed to be asynchronous, spdk_nvme_ctrlr_disconnect_io_qpair() will be called first and then spdk_nvme_ctrlr_free_io_qpair() after the qpair is actually disconnected. We will not be able to keep the current bdev_nvme_destroy_qpair() function. As a preparation, inline bdev_nvme_destroy_qpair() and remove it. Additionally, this patch has the following changes. Previously I/O qpair was freed and then I/O path caches were cleared. Both are SPDK thread local. So there is no dependency for the ordering of these two operations. However, it will reduce the size of the following patches if we clear I/O path caches before freeing I/O qpair when the qpair is disconnected. Hence we clear I/O path caches and then free I/O qpair. Remove DTRACE for bdev_nvme_destroy_qpair() for now. It will be restored in the following patches. Furthermore, fix potential NULL pointer access in bdev_nvme_create_qpair(). Change-Id: I0ab78ccb0d240e56b95b53179341afcd909a31f6 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10746 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-03-15 09:05:09 +00:00
Rui Chang	cd9fca0d20	test/unit: fix valgrind error for test_nvme_free_request In test_nvme_free_request, there is valgrind error: Conditional jump or move depends on uninitialised value(s) Signed-off-by: Rui Chang <rui.chang@arm.com> Change-Id: I80f741cac9316d86b060419e3b6fb651fa018aa4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11826 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-09 08:03:23 +00:00
Shuhei Matsumoto	d8a105742f	nvmf/rdma: Fix overflow of RB tree comparison when qp_num is very big If 0 - UINT32_MAX or UINT32_MAX - 0 is substituted into a int variable, we cannot get any expected result. Fix the bug and add unit test case to verify the fix. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Iad2ea681ad8ad234e70c7310b58785a999612156 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11837 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-03-09 08:00:58 +00:00
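The comparison overflow described above can be sketched in C as follows. This is an illustration under stated assumptions, not the SPDK patch itself: both function names are hypothetical, and the safe variant uses the common `(a > b) - (a < b)` idiom, which may differ from what the actual fix does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical buggy comparator: the unsigned difference wraps modulo
 * 2^32 before it is stored in an int, so the sign can be wrong.
 * Example: 0 - UINT32_MAX wraps to 1, which claims that 0 > UINT32_MAX.
 */
int cmp_by_subtraction(uint32_t a, uint32_t b)
{
	return (int)(a - b);
}

/*
 * Hypothetical safe comparator: each relational test yields 0 or 1,
 * so the result is always -1, 0 or 1 and cannot overflow.
 */
int cmp_safe(uint32_t a, uint32_t b)
{
	return (a > b) - (a < b);
}

/* Small self-check of the ordering for the extreme keys. */
void cmp_self_check(void)
{
	assert(cmp_safe(0, UINT32_MAX) < 0);
	assert(cmp_safe(UINT32_MAX, 0) > 0);
}
```

A tree ordered by the subtraction variant can misplace or fail to find entries whose keys differ by more than INT32_MAX, which is why the sign of the comparator has to be correct across the full uint32_t range.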
Shuhei Matsumoto	00a7998254	bdev/nvme: Move per controller settings into a option structure The following patches will enable us to specify I/O error resiliency options per nvme_ctrlr as global options. To do it easier, move per controller options about I/O error resiliency into struct nvme_ctrlr_opts. prchk_flags is not exactly for resiliency but move it into struct nvme_ctrlr_opts too. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I85fd1738bb6e293cd804b086ade82274485f213d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11829 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-09 08:00:45 +00:00
Shuhei Matsumoto	1a00f5c094	bdev/nvme: Fix overflow of RB tree comparison when the NSID is very big If 0 - UINT32_MAX or UINT32_MAX - 0 is substituted into a int variable, we cannot get any expected result. Fix the bug and add unit test case to verify the fix. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ib045273238753e16755328805b38569909c8b83a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11836 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2022-03-09 08:00:45 +00:00
Alexis Lescouet	a71cd5214b	event: Add a user option to change the size of spdk_msg_mempool The spdk_msg_mempool structure has a fixed size, which is not flexible enough. The size of the memory allocation can now be changed in the options given to the spdk_app_start function. Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com> Change-Id: I1d6524ab8cf23f69f553aedb0f5b0cdc9dde374b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11635 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-09 08:00:28 +00:00
Changpeng Liu	a576bccca9	nvmf/vfio-user: remove vfio-user CSTS.CFS We will use nvmf library CSTS.CFS instead so that the client can get this error status. Change-Id: I42c248a7333d1f9c940bb29135c887a61c906bd4 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11676 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-08 02:35:05 +00:00
paul luse	8951c15759	accel/idxd: add and respect flag to support writes to PMEM Plumbing for flags was added in prior patches. This patch introduces and respects the relevant flags for use with PMEM aka durable memory through the accel_fw, IDXD, IOAT and SW modules. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: I792f31459e061d220965feced60e0c236d819a68 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9455 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-04 21:56:54 +00:00
paul luse	12c40f05e2	accel: plumb accel flags through operations that need them This patch is just plumbing the flags param. Use of it for PMEM will come in upcoming patches. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: I620df072aaad3f8062a0312bbea3da1bc3f911b9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9281 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-04 21:56:54 +00:00
Yuriy Umanets	8ecf8dfcd7	bdev/crypto: Continue init after AESNI_MB failure - Continue init of the other crypto devices (mlx5) after failure of rte_vdev_init(AESNI_MB) in vbdev_crypto_init_crypto_drivers(). It simply may not be enabled in DPDK because it requires IPSec_MB>=1.0 installed in the system. Reproduces with --with-dpdk=dpdk/install option used, when the target DPDK is built without control of IPSec version from the SPDK side. - Updated crypto_ut to test the new behavior of error handling from rte_vdev_init(AESNI_MB) in vbdev_crypto_init_crypto_drivers(). Signed-off-by: Yuriy Umanets <yumanets@nvidia.com> Change-Id: Icd4db8877afe87db8166c40d6e7b414cd43c9c25 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11624 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-03-04 09:40:04 +00:00
Yuriy Umanets	a837ea37da	bdev/crypto: Switched to pkt_mbuf API - Switched to using rte_mempool for mbufs instead of spdk_mempool. This allows using rte pkt_mbuf API that properly handles mbuf fields we need for mlx5 and we don't have to do it manually when sending crypto ops. - Using rte_mempool *g_mbuf_mp in vbdev crypto ut and added the mocking API code. - crypto_ut update to follow pkt_mbuf API rules. Signed-off-by: Yuriy Umanets <yumanets@nvidia.com> Change-Id: Ia5576c672ac2eebb260bfdbb528ddb9edcd8f036 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11623 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-04 09:40:04 +00:00
paul luse	492d576795	Revert "idxd: No longer set token configuration" This reverts commit `3bacd6653d`. Change-Id: I8dbaffc9f50cf9627720667644496cdaf4e81c3f Signed-off-by: paul luse <paul.e.luse@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11723 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-03-02 17:22:08 +00:00
Evgeniy Kochetov	5c80b1e5ab	nvme/rdma: Limit max_sges by command capsule size According to NVMe over Fabrics spec number of SGLs supported by the controller is reported in MSDBD. But it is also implicitly limited by command capsule size (IOCCSZ) since SGL are passed in capsule. This patch adjusts max_sges to capsule size if required. Adjustment to MSDBD is also moved to transport layer because it is fabrics specific parameter and is not valid for PCIe transport. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I44918eb949345c61242ca50a524d21d04b6ac058 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11669 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-02-25 08:18:32 +00:00
John Levon	5e37316308	nvmf: pass poll group to transport during create For the benefit of forthcoming vfio-user changes, register the poll group poller prior to calling the transport create callback, and pass in a pointer to the poll group itself. Signed-off-by: John Levon <john.levon@nutanix.com> Change-Id: Idbc24126c9d46f8162e4ded07c5a0ecf074fc7dd Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10718 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-02-23 10:05:15 +00:00
Jim Harris	635d0cbe75	nvme: allocate extra request for fabrics connect With async connect, we need to avoid the case where the initiator is sending the icreq, and meanwhile the application submits enough I/O such that the request objects are exhausted, leaving none for the FABRICS/CONNECT command that we need to send after the icreq is done. So allocate an extra request, and then use it when sending the FABRICS/CONNECT command, rather than trying to pull one from the qpair's STAILQ. Fixes issue #2371. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: If42a3fbb3fd9d863ee48cf5cae75a9ba1754c349 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11515 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-14 15:29:39 +00:00
Ben Walker	3bacd6653d	idxd: No longer set token configuration This has changed to control the number of read buffers allocated to the group, but it is only valid to set this register if the device has indicated it supports it. Further, the default value is what we want anyway, so we can skip setting it altogether. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ic54672ea6cb16acc7613860e36d9f7033048bd98 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11484 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2022-02-10 22:41:12 +00:00
Ben Walker	dbdd27ff47	idxd: Rename idxd_cmdsts_reg to idxd_cmdsts_register All of the other structs and unions spell out register, so match the style. Change-Id: Ie502e80206305037d1518a1db590d89b7479abb4 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11433 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-10 22:41:12 +00:00
Evgeniy Kochetov	834e3c5a0e	nvme: Fix submission queue overflow SPDK can submit more commands to remote NVMf target than allowed by negotiated queue size. SPDK submits up to SQSIZE commands, but only SQSIZE-1 are allowed. Here is a relevant quote from NVMe over Fabrics rev.1.1a ch.2.4.1 “Submission Queue Flow Control Negotiation”: If SQ flow control is disabled, then the host should limit the number of outstanding commands for a queue pair to be less than the size of the Submission Queue. If the controller detects that the number of outstanding commands for a queue pair is greater than or equal to the size of the Submission Queue, then the controller shall: a) stop processing commands and set the Controller Fatal Status (CSTS.CFS) bit to ‘1’ (refer to section 10.5 in the NVMe Base specification); and b) terminate the NVMe Transport connection and end the association between the host and the controller. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: Ifbcf5d51911fc4ddcea1f7cde3135571648606f3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11413 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-10 15:22:08 +00:00
Evgeniy Kochetov	486426529d	nvme/rdma: Remove queue depth adjustment to crqsize According to NVMe over Fabrics specification (rev.1.1a) HSQSIZE sent in RDMA_CM_REQUEST private data (ch.7.3.6.4) shall be the same as SQSIZE later sent in Connect command (ch.3.3). SPDK NVMe RDMA initiator adjusts SQSIZE to CRQSIZE received from target in RDMA_CM_ACCEPT private data. Target is allowed to send CRQSIZE < HSQSIZE if RNR retries are used. So, it is possible that SQSIZE sent by SPDK will be lower than previously sent HSQSIZE. There are targets validating this match and they reject connection from SPDK. Linux kernel NVMe initiator doesn't perform such adjustments and connects well to such targets. This patch aligns SPDK behavior with specification and Linux kernel implementation. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I01968d1c07d284396fa5939932d85841351d7a45 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11350 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-02-10 15:22:08 +00:00
Jaylyn Ren	3e937f07eb	test/accel&rdma: Fix unittest_accel and unittest_nvme_rdma failure Errors about an uninitialised value created by a stack allocation occur when running unittest_accel and unittest_nvme_rdma with valgrind. Change-Id: I4b48b472cc7c189cbcaf8ca772830a23118e7e17 Signed-off-by: Jaylyn Ren <jaylyn.ren@arm.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10559 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-02-09 22:22:04 +00:00
xiaoxiangxzhang	fbed59665c	unittest/nvmf_tcp: test for nvmf_tcp_pdu_ch_handle Signed-off-by: xiaoxiangxzhang <xiaoxiangx.zhang@intel.com> Change-Id: I969d08e8fbd34a2132617fb9113f89282f076fc8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8964 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-02-09 20:44:37 +00:00
Jacek Kalwas	fcc426bda8	nvmf: add auxiliary asserts to confirm API usage is correct Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com> Change-Id: Id85420fe38bf804e66cc0da892dd9e7a266eeb00 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11092 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-02-09 18:05:51 +00:00
Jacek Kalwas	93364164e5	nvmf: fix discovery log change notice execution it shall be executed on ctrlr's thread not subsystem's Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com> Change-Id: I58c60525191085d3d6a583862ba5d71ea90940c7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11105 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-02-09 18:05:51 +00:00
Konrad Sztyber	79415753ea	bdev: register bdev's UUID as its alias In many cases, addressing bdevs by their UUIDs is easier than using their names, which can be somewhat arbitrary. For instance, the NVMe bdev builds a name by adding the n{NSID} suffix to the controller's name, while the UUID is filled with NGUID (if available). The UUID alias is stored in the form defined by RFC 4122, meaning five groups of lower-case hexadecimal characters. It's important to note that the bdev layer uses case-sensitive name comparison, so the user needs to use the same textual UUID representation. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I8b112fb81f29e952459d5f81d97fdc7a591730f8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11395 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-02-07 21:11:10 +00:00
Mao Jiang	aa221ca1f8	test/nvmf/rdma: cases for creating rdma resources Change-Id: I2e1d464c7bd76fdd49f673c0c5863ac17372c768 Signed-off-by: Mao Jiang <maox.jiang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8460 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-02-04 20:58:56 +00:00
Mike Gerdts	5eb363cf8c	bdev_ut: UB due to small buffer bdev_multi_allocation tries to write four characters, an integer between 0 and INT_MAX, and a nul byte into 10 characters. That requires at least 15 characters. This leads to build failures with "make CONFIG_DEBUG=n CONFIG_UBSAN=y". Change-Id: I8cb9fd4ede31ae24809e4a04fd60a67dae3a0ac4 Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11261 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-03 20:08:35 +00:00
Mike Gerdts	b66f8df748	blob_ut: bs_opts initialized with wrong size An spdk_bs_opts structure is sometimes partially initialized due to using sizeof(opts) (struct spdk_blob_opts, 64 bytes) rather than sizeof(bs_opts) (struct spdk_bs_opts, 72 bytes). Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: Iaaa89bb419f66969d0888f49f8991c35b3dc5ea4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11268 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-02 08:25:02 +00:00
Mike Gerdts	9f9c7161c9	bdev_ut: test read-only bdev claim While not documented as such, spdk_bdev_module_claim_bdev() has always allowed a bdev that is opened read-only to remain read-only when claimed. This occurs when NULL is passed in place of an spdk_bdev_desc. This change updates the function's documentation to match the implementation and adds a unit test to ensure the current behavior remains. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: Ief26de60e4408bfe1aa60b7a4e1d8adf273470b6 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11267 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-02-02 08:25:02 +00:00
Shuhei Matsumoto	cc797456f4	ut: Use unit/lib/json_mock.c for stubs Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I2cd488c17dbc92c381cd956ae0d6f5ca709a24dc Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11263 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-01-31 09:44:28 +00:00
Shuhei Matsumoto	def45b4c07	ut/json_mock: Add stubs for json_write_uint8 and _uint16 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I260b958e0640f737ab77654fedc8007f92eec325 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11262 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-01-31 09:44:28 +00:00
Mike Gerdts	96212d45cc	lvol: lvol_get_xattr_value failure undetectable When an unexpected xattr name is passed to lvol_get_xattr_value(), no error is returned to the caller. The one caller, blob_set_xattrs() via the xattrs->get_value callback, makes the reasonable assumption that a lookup that fails to find a value returns a NULL value. This updates lvol_get_xattr_value() to match that expectation. Signed-off-by: Mike Gerdts <mgerdts@nvidia.com> Change-Id: I5c7a740f2757e6d8265ba2637afecb729acfcdd4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11326 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-01-31 09:44:16 +00:00
Changpeng Liu	d1c2027d01	nvmf/vfio-user: add NVMe live migration support finally VFIO in QEMU uses region 9 as the PCI passthrough devices' migration channel. The format of the region 9 migration region is as follows: ------------------------------------------------------------------ \|vfio_device_migration_info\| data section \| ------------------------------------------------------------------ QEMU will access vfio_device_migration_info to control the migration process. For the SPDK vfio-user target, we also implement BAR9 via libvfio-user, and we also define the NVMe device specific migration data stored in the data section of BAR9. QEMU doesn't care about the format of the data section; it will help us to gather the NVMe specific migration data in the source VM and then restore the migration data to the data section of BAR9 in the destination VM. The core idea of the live migration implementation is to follow the device state changes, which are controlled by QEMU. First QEMU will try to STOP the device in the source VM and set the destination VM to the RESUME state. SPDK will save the NVMe device state data structure to BAR9 in the source VM once the subsystem is paused, then QEMU will read BAR9 in the source VM and restore the content of BAR9 in the destination VM. Finally, in the destination VM, we will restore the NVMe device state, including BARs/PCI CFG/queue pairs. Change-Id: I42e38f28c3ff59831be63290038b50d199d06658 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7617 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-01-27 20:55:16 +00:00
Evgeniy Kochetov	08f9b40113	bdev/nvme: Fix namespace comparison This patch aligns namespace comparison with Linux kernel implementation: - UUID is optional and may be NULL - command set (CSI) should be the same Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I8f889989f24cd51b104057217f87eb303b30fa68 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11312 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-01-27 18:53:41 +00:00
Nick Connolly	968371131e	ut/nvme_ctrlr: initialize mutex for portability For correct behaviour, pthread_mutex must be initialized before use and destroyed afterwards. An already initialized mutex should not be re-initialized. Add calls to nvme_ctrlr_construct where nvme_ctrlr_destruct is called without a matching construct. Add missing calls to mutex_init and mutex_destroy as required. Signed-off-by: Nick Connolly <nick.connolly@mayadata.io> Change-Id: I9753fa7fbd77402f23a08a66f4b489a5c229487a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11298 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Community-CI: Mellanox Build Bot	2022-01-27 08:15:26 +00:00
Shuhei Matsumoto	c8f986c7ee	Revert "nvme/rdma: Correct qpair disconnect process" This reverts commit `eb09178a59`. Reason for revert: This caused a degradation for adminq. For adminq, ctrlr_delete_io_qpair() is not called until ctrlr is destructed. So necessary delete operations are not done for adminq. Reverting the patch is practical for now. Change-Id: Ib55ff81dfe97ee1e2c83876912e851c61f20e354 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10878 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-01-26 08:09:15 +00:00
paul luse	c501d2b37c	idxd: fix issue w/multiple WQ config Found via inspection during spec review of latest HW. We were using the wrong stride for the WQCFG register when configuring but it just so happened to be the right value for the current DSA version. We were mixing up the size of the WQCFG register with the stride value used to configure the next WQCFG register. As they are not contiguous in HW, we need to read another capabilities bit to determine the address of the next WQCFG to configure. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: I14d1ff95e0131fd30121aa955bfbc8c8fb3fc512 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10968 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-01-20 21:39:56 +00:00
paul luse	026f003154	idxd: update structures based on latest public DSA spec Compliant with both current and next gen DSA. Note: some fields in gencap were mapped incorrectly previously, but this did not impact the SPDK driver because the only times those values (max_xfer_shift and max_batch_shift) were used were in asserts. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: I9648184670f661166136e7898d0d8c7e07d8c746 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10966 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-01-20 21:39:56 +00:00
Tomasz Zawadzki	1e080e5e67	lib/vhost: move dev_dirname to rte_vhost_user Creation of sockets is specific to rte_vhost, and so is the functionality responsible for setting the path for them. dev_dirname is renamed to g_vhost_user_dev_dirname and its definition is moved to rte_vhost_user. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I9bae67667b0f6624f2daf3244a048d10e94e553c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10631 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-01-20 19:09:20 +00:00
Tomasz Zawadzki	ef873d21e3	ut/vhost: add rte_vhost_user.c to UT vhost.c contains a lot of functionality that is rte_vhost specific. This series is moving rte_vhost specific functionality to rte_vhost_user.c. UT for vhost didn't make a distinction between the two. So starting with this patch, rte_vhost_user.c is now included in the UT, only stubbing out rte_vhost functions. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I0d5f62ad47d1261bbb44c0aa23400d94ece4564e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10743 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-01-20 19:09:20 +00:00
GangCao	6b7e9d0af2	Lib/iSCSI: add the LUN Resize support From SAM-4, section 5.13 (Sense Data): “When a command terminates with a CHECK CONDITION status, sense data shall be returned in the same I_T_L_Q nexus transaction (see 3.1.50) as the CHECK CONDITION status. After the sense data is returned, it shall be cleared except when it is associated with a unit attention condition and the UA_INTLCK_CTRL field in the Control mode page (see SPC-4) contains 10b or 11b.” SPDK does not set UA_INTLCK_CTRL to 10b or 11b, so we set the unit attention condition immediately against a single IO or Admin IO after reporting it via a CHECK CONDITION. Once the failed IO is received at the iSCSI initiator side, it will be retried. In the case of a resize operation, if there is no IO from the iSCSI initiator side, reporting of the unit attention condition will be delayed until the first IO is received at the iSCSI target side. Meanwhile, we clear the resizing (newly added) flag on our SCSI LUN structure after the first time we report the resize unit attention condition. The kernel initiator won’t actually resize the corresponding block device automatically. It will report a uevent, and then you can set up udev rules to trigger a rescan. SPDK iSCSI initiator will automatically report the LUN size change. Change-Id: Ifc85b8d4d3fbea13e76fb5d1faf1ac6c8f662e6c Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11086 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-01-20 07:56:23 +00:00
Ben Walker	86bb0df191	idxd: Bump batch size to 32 Increase the batch size and with it the effective queue depth per channel to 512. Change-Id: Ide665e92d47ee753c141f34dd6a8bc4d040fe8db Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11031 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>	2022-01-20 07:54:55 +00:00
Changpeng Liu	b3cd421ffd	nvmf/vfio-user: implement device quiesce APIs libvfio-user will call quiesce callback when there are memory region add/remove and device state change requests from client, and in the quiesce callback, we will pause the subsystem so that it's safe to do everything after it, then after quiesce callback, we will resume the subsystem. The quiesce callback is also used in live migration, each device state change will quiesce the device first. Change-Id: I3a6a0320ad76c6b2d1d65c754b9f79cce5c9c683 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10620 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-20 00:13:42 +00:00
Konrad Sztyber	a7d61bef5a	nvme: guard admin qpair error injection queue Admin commands can be sent and polled from any thread, which also means that the error injection queue on the admin qpair can be accessed from multiple threads. Therefore, any modifications to that queue should be done under the ctrlr lock. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ib1ed194405cb5b93f65a007b9749fd4433dc367d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11099 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2022-01-19 09:05:36 +00:00
Changpeng Liu	f63c0899a2	nvmf: add reset/shutdown timeout process There is an error case where the block device doesn't complete outstanding IOs during the controller reset or shutdown, so the NVMf library will wait until all the IOs are returned from the backend. Here we add a timeout timer; when the time expires, we will try to reset the block device which holds the outstanding IOs. Fix #2194. Change-Id: I8d0746335e1f20a09e6a9ea87730551808a898d1 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9909 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Dong Yi <dongx.yi@intel.com>	2022-01-19 09:04:50 +00:00
GangCao	7b67a696da	UT/NVMe: Fix compilation warnings Fix warning: missing braces around initializer This issue is seen with gcc (GCC) 4.8.5 20150623. Warning like below: nvme_tcp_ut.c:243:9: warning: (near initialization for ‘ctrlr.ns’) [-Wmissing-braces] nvme_tcp_ut.c: In function ‘test_nvme_tcp_req_init’: nvme_tcp_ut.c:525:9: warning: missing braces around initializer [-Wmissing-braces] struct spdk_nvme_ctrlr ctrlr = {0}; ^ nvme_tcp_ut.c:525:9: warning: (near initialization for ‘ctrlr.ns’) [-Wmissing-braces] And more information from below link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119 Change-Id: I88b5b3908d5d0daa9383e47a1ed53288f342ca3b Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11137 Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2022-01-19 09:04:27 +00:00
Shuhei Matsumoto	3185df9057	ut/bdev_nvme: Manage adminq's state and return -ENXIO if adminq is disconnected Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I81d4a8ce5c487449ab634bcd4f984d6867febf35 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10949 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-01-19 08:44:09 +00:00
Shuhei Matsumoto	49b8d1f33a	ut/bdev_nvme: Delete qpair after unwinding context from process_completions() This is the same effort as the last patch. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I94ef08abdbb2bd2e07d0cd1e552c5d05c805233e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10817 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-01-19 08:44:09 +00:00
Shuhei Matsumoto	5485f55dc1	ut/bdev_nvme: Separate disconnected and connected qpair in poll_group More precise stubs for spdk_nvme_poll_group are critically important to verify upcoming changes. Add a flag is_failed to struct spdk_nvme_qpair separately from is_connected. This is used to inject error to a connection. Replace a single list qpairs by two lists, connected_qpairs and disconnected_qpairs for struct spdk_nvme_poll_group. Then utilize these to manage qpair in poll group. spdk_nvme_ctrlr_reconnect_io_qpair() is not used in the NVMe bdev module now. Remove the corresponding stub. Adjust polling count accordingly. Change-Id: I4d867c56ae518276813f6f96d23a5f6933364fd4 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10816 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-01-19 08:44:09 +00:00
Shuhei Matsumoto	728e3721a4	nvme_rdma: Remove a guard for recursive calls from poll_group_disconnect_qpair() nvme_poll_group_disconnect_qpair() is called only by a single place now. We do not need the flag poll_group_disconnect_in_progress any more. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I8f9c0f14baa8fcb9b0637635a5bb3d34a8b11af5 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10673 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-19 08:44:09 +00:00
Shuhei Matsumoto	7ae79a38a5	nvme: Limit spdk_nvme_poll_group_remove() to use only for disconnected qpairs Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I3c06c41664ee757423641474141439f9c32fc0b6 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10671 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Monica Kenguva <monica.kenguva@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-19 08:44:09 +00:00
Shuhei Matsumoto	e021cc0147	nvme: Swap ctrlr_disconnect_qpair() and poll_group_remove() in nvme_ctrlr_free_io_qpair() nvme_ctrlr_disconnect_qpair() calls nvme_poll_group_disconnect_qpair() if the qpair uses a poll group, and nvme_poll_group_disconnect_qpair() calls nvme_ctrlr_disconnect_qpair() if the state of the qpair is not DISCONNECTING. This relationship made the code very complex. A few patches starting from this patch simplifies disconnect and free qpair operations. This patch swaps the ordering of nvme_ctrlr_disconnect_qpair() and spdk_nvme_poll_group_remove() in spdk_nvme_ctrlr_free_io_qpair(). This ensures the qpair is disconnected when spdk_nvme_ctrlr_free_io_qpair() calls spdk_nvme_poll_group_remove(). This enables us to limit spdk_nvme_poll_group_remove() to be available only for disconnected qpairs. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I0601a74f953a2efc4f177a51a4450baea33533d4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10670 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-19 08:44:09 +00:00
Shuhei Matsumoto	80e81273e2	bdev/nvme: Do not use ctrlr for I/O submission if reconnect failed repeatedly If ctrlr_loss_timeout_sec is set to -1, reconnect is tried repeatedly indefinitely, and I/Os continue to be queued. This patch adds another option fast_io_fail_timeout_sec, a flag fast_io_fail_timedout to nvme_ctrlr. If the time fast_io_fail_timeout_sec passed after starting reset, set fast_io_fail_timedout to true not to use the path for I/O submission. fast_io_fail_timeout_sec is initialized to zero as same as ctrlr_loss_timeout_sec and reconnect_delay_sec. The name of the parameter follows the famous DM-multipath, its fast_io_fail_tmo. Change-Id: Ib870cf8e2fd29300c47f1df69617776f4e67bd8c Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10301 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-17 14:25:15 +00:00
Shuhei Matsumoto	ae4e54fdc3	bdev/nvme: Retry reconnecting ctrlr after seconds if reset failed Previously reconnect retry was not controlled and was repeated indefinitely. This patch adds two options, ctrlr_loss_timeout_sec and reconnect_delay_sec, to nvme_ctrlr and adds reset_start_tsc, reconnect_is_delayed, and reconnect_delay_timer to nvme_ctrlr to control reconnect retry. Both of ctrlr_loss_timeout_sec and reconnect_delay_sec are initialized to zero. This means reconnect is not throttled as we did before this patch. A few more changes are added. Change nvme_io_path_is_failed() to return false if reset is throttled even if nvme_ctrlr is resetting or is to be reconnected. spdk_nvme_ctrlr_reconnect_poll_async() may continue returning -EAGAIN infinitely. To check out such exceptional case, use ctrlr_loss_timeout_sec. Not only ctrlr reset but also non-multipath ctrlr failover is controlled. So we need to include path failover into ctrlr reconnect. When the active path is removed and switched to one of the alternative paths, if ctrlr reconnect is scheduled, connecting to the alternative path is left to the scheduled reconnect. If reset or reconnect ctrlr is failed and the retry is scheduled, switch the active path to one of alternative paths. Restore unit test cases removed in the previous patches. Change-Id: Idec636c4eced39eb47ff4ef6fde72d6fd9fe4f85 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10128 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>	2022-01-17 14:25:15 +00:00
Shuhei Matsumoto	962c4c3800	bdev/nvme: Fix a degradation that I/O gets queued infinitely We noticed the difference between the SPDK 21.10 and the latest master in a test. The simplified scenario is as follows: 1. Start SPDK NVMe-oF target 2. Run bdevperf for the target with -f parameter to suppress exit on failure. 3. Kill the target after I/O started. With the SPDK 21.10, bdevperf retries failed I/Os and exits after the test time is over. With the latest SPDK master, bdevperf hangs and does not exit even after the test time is over. The cause was as follows: reset ctrlr is repeated very quickly (once per 10ms by default) and hence I/Os were queued infinitely because nvme_io_path_is_failed() returned false if nvme_ctrlr is resetting. We should queue I/O when nvme_ctrlr is resetting only if reset is throttled and fail-fast for the repeated failures is supported. Hence in this patch, fix the degradation and remove the related unit test cases. Reported-by: Evgeniy Kochetov <evgeniik@nvidia.com> Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I4047d42dc44488a05264c6a841d101a7c371358b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11062 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-01-17 14:25:15 +00:00
Ahriben Gonzalez	0345729e00	nvme: Add metadata support to io commands Adding metadata support for io commands. Currently metadata is ignored even if present in the cmd struct. Making metadata address readable/writable depending on data transfer bits. Adding extra unit test to make sure metadata fields are populated. Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com> Change-Id: I1d01974a6b2831c82b43e94073065d235eea429a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10854 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-01-14 11:10:13 +00:00
Ben Walker	517b557226	nvme: Do not track a separate active namespace list We only populate active namespaces into the main namespace tree, so we don't need a separate list of active namespaces too. Change-Id: Iaf194f806cc1d9672f5567cff3dffafff3165069 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10034 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-01-14 08:35:10 +00:00
Ben Walker	e7602c158f	nvme: Hold namespaces in an RB_TREE Since this is now sparsely populated, a tree is a better choice. Change-Id: Ie66d913fa1d298de56a7d22ef55f0adf7f8803b8 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10031 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-01-14 08:35:10 +00:00
Ben Walker	b4dace738e	nvme: Do not allocate inactive namespace objects Some subsystems report a very large maximum value for the number of namespaces, but in essentially every case the subsystem is sparsely populated with active namespaces. To save memory, don't allocate objects for the inactive ones. Change-Id: I4cbeb5a7a898d3c685f4a3a9ec4c2ce45efffb92 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9898 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-01-14 08:35:10 +00:00
Ben Walker	1cfae16563	accel: Use vectored crc32 operations instead of chaining Chaining may be faster, but this is really an implementation detail of the idxd driver. Push the decision on how to implement a vectored crc down into the individual drivers and eliminate it from the generic framework. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Iedbdc5a6dbd3f7d1674d0a83f6827588f4b6b2fb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10291 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-01-12 08:20:39 +00:00
Konrad Sztyber	6631c2a8aa	nvmf/tcp: initialize zcopy phase in nvmf_tcp_req_get Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ia74148fb36733deaf7b2f833ac0247859311a805 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10794 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-01-12 08:20:11 +00:00
Konrad Sztyber	a50a70ecdf	nvmf: abort outstanding zcopy reqs in qpair disconnect Zero-copy requests are kept on the outstanding queue for the whole duration of the request - from the initial zcopy_start submission to the completion of zcopy_end. This means that there's a period in which a request doesn't wait for a completion from the bdev layer, but is still on the outstanding queue (after zcopy_start callback, before zcopy_end submit). If a qpair gets disconnected while a request is in this state, we need to manually force its completion, as otherwise it might hang indefinitely (e.g. waiting for host data). Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I53731b8e363b725efa564ca3c7d89b46f5fb2a24 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10793 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-01-12 08:20:11 +00:00
Konrad Sztyber	974a32b72e	nvmf: resume queued zcopy requests The zero-copy requests can also be queued when a subsystem is paused, so we need to properly resume and submit them by using zcopy_start. Since only requests that haven't received the zero-copy buffer (i.e. before zcopy_start was called) can be queued, we don't need to bother with checking zcopy_phase. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ie629688f6961eb2ae05741df496720b91be4d80d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10792 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-12 08:20:11 +00:00
Shuhei Matsumoto	521a9bb22c	bdev/nvme: Fix race between failover and add secondary trid We sort secondary trids to avoid using disconnected trids for failover. However the sort had a bug. This bug was found by running test/nvmf/host/multipath.sh in a loop. Verify the fix by adding unit test. Fixes #2300 Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com> Change-Id: I22b0ede4d2ef98b786c3e0d1f5337a2d568ba56d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10921 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-01-10 22:18:46 +00:00
Jim Harris	b68f2eeb0b	bdev_nvme: add bdev_nvme_start_discovery RPC This patch adds the framework for a discovery service in the bdev/nvme module. Users can specify an IP/port of a discovery service. The bdev/nvme module will connect to a discovery controller, get the discovery log page, and then register for AERs. It will connect to each subsystem specified in the initial log page. AER completions will trigger fetching the log page again, at which point new subsystems will be connected to, or removed subsystems will be detached. This patch does the following: * Adds the new start_discovery RPC * Connects to the discovery controller * Gets the discovery log page * Registers for AERs * Detach from discovery controllers at shutdown Subsequent patches in this series will: * Connect to subsystems listed in discovery log page * Detach from subsystems that were listed in earlier discovery log pages but subsequently removed * Add a stop_discovery RPC Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I54bfa896a48c5619676f156b5ea9f2d1f886c72f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10694 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-01-10 15:23:39 +00:00
Konrad Sztyber	7a374fbc0b	nvmf: make zcopy_end void Since spdk_bdev_zcopy_end() cannot really fail (it only fails if we pass a bad bdev_io), we can simplify the nvmf zcopy_end functions by making them void and always expect asynchronous completion. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I6e88ac28aba13acadea88489ac0dd20d1f52f999 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10790 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-01-06 18:53:42 +00:00
Konrad Sztyber	92d7df1f47	nvmf: use spdk_nvmf_request_exec to submit zcopy_start Since this path now supports sending zero-copy, use it for zcopy_start. Additionally, it makes it possible to make zcopy_start void, as it reports all errors asynchronously via request_complete(), and remove some of the duplicated error checks. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I41f43ce1651432d9a7d74e3680d4a3f780128a1d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10789 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-01-06 18:53:42 +00:00
Konrad Sztyber	686b9984b3	nvmf: return async/complete status in bdev zcopy operations Additionally, the NVMe completion status is now updated and the IOs are queued if the bdev layer doesn't have enough IO descriptors. It makes the zcopy operations behave similarly to the other IO operations. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I455ae781e32aa6e60d144d2c91f109bd8be46664 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10787 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-01-06 18:53:42 +00:00
Konrad Sztyber	0e09df57dd	nvmf: rename zcopy operations to zcopy_(start\|end) It makes their names consistent with the bdev API. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I314051f0980b46959d6560aa25885f13b4c28f2a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10786 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-06 18:53:42 +00:00
Konrad Sztyber	f65099d378	nvmf: remove zcopy check in spdk_nvmf_request_exec It will make it possible to submit zero-copy requests through spdk_nvmf_request_exec(). Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ibc14fe77cd477b11ed55d1350a7486caaad81add Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10783 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-01-06 18:53:42 +00:00
Konrad Sztyber	7d23ac8657	nvmf: remove zcopy phase checks from IO functions The code should never reach these functions for requests using zero-copy. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: If9f30e05a43b340a982604d5b985242d63ce252b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10782 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-06 18:53:42 +00:00
Konrad Sztyber	aa1d039836	nvmf: zero-copy enable flag in transport opts It makes it possible for the user to specify whether a transport should try to use zero-copy to execute requests when possible. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I40a92b0d7a6707f4c9292795f380846acb227200 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10780 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-01-06 18:53:42 +00:00
Changpeng Liu	2a6c2c289c	nvmf: support static CNTLID The SPDK NVMf subsystem supports the dynamic controller model; for transports other than fabrics, users should use the static controller model. Change-Id: I364ea61a71b04d51932fd9e0e16f401a383ff67c Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10149 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-01-06 01:20:32 +00:00
Alexey Marchuk	3c4a68cafc	nvme: Do not create IO qpair during ctrlr initialization If nvme ctrlr is resetting or initializing, free_io_qids bitmap is already freed or not created yet. In that case an attempt to create IO qpair leads to segmentation fault. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I6a97bf81d5a568db20d23b3f88cf01e994ba42e3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10827 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>	2021-12-27 08:43:03 +00:00
Alexey Marchuk	eb09178a59	nvme/rdma: Correct qpair disconnect process In current implementation RDMA qpair is destroyed right after disconnect. That is not graceful qpair shutdown process since there can be requests submitted to HW and we may receive completions for already destroyed/freed qpair. To avoid this, only disconnect qpair in ctrlr_disconnect_qpair transport callback, all other resources will be released in ctrlr_delete_io_qpair cb. This patch is useful when nvme poll groups are used since in that case we use shared CQ, if the disconnected qpair has WRs submitted to HW then qpair's destruction will be deferred to poll group. When nvme poll groups are not used, this patch doesn't change anything, in that case destruction flow is still ungraceful. However since CQ is destroyed immediately after qpair, we shouldn't receive any requests which point to released resources. A correct solution for non-poll group case requires async disconnect API which may lead to significant rework. There is a bug when Soft Roce is used - we may receive a completion with "normal" status when qpair is already disconnected and all nvme requests are aborted. Added a workaround for it. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I0680d9ef9aaa8737d7a6d1454cd70a384bb8efac Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10327 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-12-23 08:44:40 +00:00
GangCao	10f32b9f19	lib/blob: do not assume realloc(NULL, 0) returns a not-NULL value There is a situation where num_extent_pages is zero and the original pointer is also NULL; realloc() could then return a non-NULL pointer. Related UT has been added and updated. 1) In the default allocation (num_clusters == 0), the extent_pages is not allocated as expected. 2) In the thin provisioning allocation (num_clusters != 0), the extent_pages will be allocated if extent_table is used. More related information as below: The crux of the problem is that according to POSIX: realloc: "If ptr is NULL, then the call is equivalent to malloc(size)" malloc: "If size is 0, then malloc returns either NULL or a unique pointer value that can later be successfully passed to free" blobstore was relying on realloc(NULL, 0) always returning a unique pointer value, and not NULL. This is not portable behavior. Change-Id: Ibc28d9696f15a3c0e2aa6bb2371dc23576c28954 Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10470 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-12-20 18:14:06 +00:00
Ben Walker	fca4262987	nvme: Remove nvme_ns_update In the one place this was called, we can call nvme_ns_construct instead. There's no harm in re-fetching the identify pages. Change-Id: I91292ff9650bdc7edd5588a05837b671dcac1922 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10102 Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-12-20 08:49:41 +00:00
Peng Lian	4c1757ffb9	nvmf: update discovery log when removing hostnqn In NVMF Revision spec 1.1a, discovery log should be updated when removing hostnqn of subsystem. Update unit test to check the discovery log when removing hostnqn and destroying subsystem. Signed-off-by: Peng Lian <peng.lian@smartx.com> Change-Id: I51c597a2493295a677a7aa68e4f13a887f7e1140 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10668 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-12-16 08:52:20 +00:00
Anil Veerabhadrappa	68f0c6160a	ut/fc : fix fc_ls_ut compilation failure This regression was introduced when 'accept' was removed from spdk_nvmf_transport_ops structure. Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com> Change-Id: I5d880791db258a97a1861dbd841e97a7c068ce12 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10676 Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-12-16 08:43:39 +00:00
Changpeng Liu	723adbaf32	UT/vfio-user: fix clang-12 compilation error Add missed STUBs. Change-Id: I20989bf4ea66720d62f8ecc9668bb8f74e459666 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10638 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-12-15 04:32:05 +00:00
Jacek Kalwas	43022da379	nvmf: remove accept poller from generic layer Not every transport requires accept poller - transport specific layer can have its own policy and way of handling new connection. APIs to notify generic layer are already in place - spdk_nvmf_poll_group_add - spdk_nvmf_tgt_new_qpair Having accept poller removed should simplify interrupt mode impl in transport specific layer. Fixes issue #1876 Change-Id: Ia6cac0c2da67a298e88956734c50fb6e6b7521f1 Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7268 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-12-14 13:18:33 +00:00
Jim Harris	59f3cdacb1	nvmf: don't always update discovery log when adding hosts If a subsystem has no listeners, then there is no need to update the discovery log when adding a host, or setting a subsystem to allow all hosts. This eliminates some unnecessary discovery log update notifications, especially when setting 'allow any hosts' on a subsystem immediately after it is created (and before it has any listeners). Update unit test to check that adding a host to a subsystem without listeners does not rev the genctr. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I63dab5df564269e574bb925890088f52063aa378 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10546 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-12-10 17:32:18 +00:00
Jim Harris	3867f83dea	test/nvmf: add local var for hostnqn string Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ia967512bfcc5d7b1df15b6f6b5c132f21d601dce Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10563 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-12-10 17:32:18 +00:00
Jim Harris	9ac2cf7ff0	nvmf: don't update discovery log on subsystem create/delete The discovery log isn't updated when a subsystem is created or deleted, it's only updated when a listener for a subsystem is added or removed. So remove the nvmf_update_discovery_log() in the subsystem create and delete paths. They just generate extra AER completions that potentially cause the host to do unneeded work. Note that if a subsystem is deleted with active listeners, the subsystem delete path will remove each of the listeners before deleting the subsystem itself. So the discovery log will still get updated when those listeners are removed. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Id01bbfa3b24d3e1279a614a2fd60be41387a03b1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10545 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-12-10 17:32:18 +00:00
paul luse	fbb24d0ebe	lib/accel: remove batching from the framework and plug-in modules Batching will be made available for DSA specifically through the new idxd_perf tool. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: Ic51d9ad3692074805b1ffa705cea8be35737c778 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9846 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Monica Kenguva <monica.kenguva@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-12-08 16:35:40 +00:00
Shuhei Matsumoto	215518069a	bdev/nvme: nvme_ctrlr_create() gets prchk_flags from nvme_async_probe_ctx Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Id3deca8e0aba23299347a6aee6f0f44ee683556e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10555 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2021-12-08 08:31:24 +00:00
Shuhei Matsumoto	696ad465d7	bdev/nvme: Remove the failover_in_progress flag from struct nvme_ctrlr The failover_in_progress flag is used to decide the return value of bdev_nvme_failover(). bdev_nvme_delete() calls bdev_nvme_failover() with remove=true to remove nvme_ctrlr->active_path_id. However bdev_nvme_failover() returns zero if nvme_ctrlr->failover_in_progress is true. bdev_nvme_failover() may return zero even if it does not remove nvme_ctrlr->active_path_id. The following will be better. bdev_nvme_failover() returns -EBUSY if nvme_ctrlr->resetting is true, and the caller repeats calling bdev_nvme_failover() until the target trid becomes alternative path or bdev_nvme_failover() returns zero. To do that, the failover_in_progress flag is not necessary any more. Removing the failover_in_progress will also simplify the following patches to unify ctrlr reset and failover. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I57ab944beb1d06ea4def144c81c69705860de35f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10441 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-12-08 08:31:24 +00:00
Shuhei Matsumoto	7cc66c0ab1	bdev/nvme: Check if ns can be shared when configuring multipath We had not checked the bit 0 of the Namespace Multipath I/O and Namespace Sharing Capabilities (NMIC) field in the Identify Namespace data structure. If the bit 0 of the NMIC is zero, it is likely that namespaces are not identical. We should check if the value of the NMIC first, and do it in this patch. Additionally, it is not usual if the bit 0 of the CMIC and the bit 0 of the NMIC do not match. So in unit tests rename the parameter multi_ctrlr by multipath for ut_attach_ctrlr() and use it for the value of the NMIC. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I6aa7cbcc99be2507dbf18930f7b585a9ea7d0f90 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10380 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-12-08 08:31:24 +00:00
Shuhei Matsumoto	8afa746b4d	bdev/nvme: Use new APIs in a reset ctrlr sequence Replace the spdk_nvme_ctrlr_reset_async() and spdk_nvme_reset_poll_async() calls by the spdk_nvme_ctrlr_disconnect(), spdk_nvme_ctrlr_reconnect_async(), and spdk_nvme_ctrlr_reconnect_poll_async() calls in a reset ctrlr sequence. spdk_nvme_ctrlr_disconnect() can fail if ctrlr is already resetting or removed. But both cases are not possible. reset is controlled and the callback to the hot remove is called when the ctrlr is hot removed. So we assume spdk_nvme_ctrlr_disconnect() always succeed. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I1299e198597b2a2110f80b9a868e2dae015682ee Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10092 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-12-08 08:31:24 +00:00
Changpeng Liu	632c8d5613	nvme: make get INTEL log pages can be executed asynchronously Also we don't treat exceptions when getting INTEL log pages as a fatal error, the initialization will still continue. Change-Id: Ic2fd2be510fde2679c1546482934d0a180266936 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10341 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-12-06 23:17:07 +00:00
Evgeniy Kochetov	1fd2af0150	nvmf/ctrlr_bdev: Set DNR bit in status for failed NVMe passthru When NVMe passthru command (IO or admin) fails on submission (e.g. it is not supported), set DNR bit in completion status field. There is no sense in retrying the command in this case. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I55960c128bd9fc31f6defef0b9832259a71684b1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8578 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-12-03 08:13:52 +00:00
Evgeniy Kochetov	d03b31c61f	nvmf/ctrlr_bdev: Fix status code for failed admin passthru command If NVMe admin passthru command is not supported by underlying bdev, set status code in NVMe completion to INVALID_OPCODE. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I29c4e1f8263b76b27c199cfd2d9b2474432ec70b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10517 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-12-03 08:13:52 +00:00
Evgeniy Kochetov	a9593c7981	bdev: Fail nvme passthru command if not supported by bdev The originally detected problem is that SPDK NVMf target fails command with invalid opcode with status code INTERNAL_DEVICE_ERROR instead of INVALID_OPCODE. All unknown commands on IO queue are passed to underlying block device layer as NVME_IO type. It is not checked if this type of commands is supported and, when command fails, INTERNAL_DEVICE_ERROR is set as status code. If command fails on submission, status code is set to INVALID_OPCODE which is more relevant. This patch adds check if command type is supported to bdev_nvme_*_passthru functions. If not supported, it is failed with ENOTSUP. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I4d7f7639da17dd3b1dc3eee7eb1b4a4f876117a2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8567 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2021-12-03 08:13:52 +00:00
Josh Soref	c9c7c281f8	spelling: test Part of #2256 * achieve * additionally * against * aliases * already * another * arguments * between * capabilities * comparison * compatibility * configuration * continuing * controlq * cpumask * default * depends * dereferenced * discussed * dissect * driver * environment * everything * excluded * existing * expectation * failed * fails * following * functions * hugepages * identifiers * implicitly * in_capsule * increment * initialization * initiator * integrity * iteration * latencies * libraries * management * namespace * negotiated * negotiation * nonexistent * number * occur * occurred * occurring * offsetting * operations * outstanding * overwhelmed * parameter * parameters * partition * preempts * provisioned * responded * segment * skipped * struct * subsystem * success * successfully * sufficiently * this * threshold * transfer * transferred * unchanged * unexpected * unregistered * useless * utility * value * variable * workload Change-Id: I21ca7dab4ef575b5767e50aaeabc34314ab13396 Signed-off-by: Josh Soref <jsoref@gmail.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10409 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-12-03 08:13:22 +00:00
Jim Harris	7e68d0baca	nvme: configure AER for discovery controllers Move the CONFIGURE_AER state before SET_KEEP_ALIVE to make sure that we run the CONFIGURE_AER state for discovery controllers. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ia4e24f6507c43e3fece06b9161ff8e0b8fa0e97d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10332 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-12-02 04:02:29 +00:00

2733 Commits