ivampiresp/Spdk - Spdk - Leaflow Developers

Author	SHA1	Message	Date
Alexey Marchuk	b0f4249c59	nvme/rdma: Add async set/get registers Now controller initialization with RDMA transport is fully async Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I26e857740d3137d0b0e987facc81fc5f6ef81f2b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10756 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-04-22 09:44:57 +00:00
Shuhei Matsumoto	dbe7e74cee	nvme: Change nvme_qpair_abort_queued_reqs() to set SC_ABORTED_SQ_DELETION Transport specific qpair_abort_reqs() set SC to SC_ABORTED_SQ_DELETION. However, nvme_qpair_abort_queued_reqs() set SC to SC_ABORTED_BY_REQUEST even if its call is not requested by the upper layer. Change nvme_qpair_abort_queued_reqs() to set SC to SC_ABORTED_SQ_DELETION for consistency. nvme_qpair_abort_queued_reqs() is used to abort queued requests that were sent while adminq was connecting. SC_ABORTED_SQ_DELETION will not be so bad even for the case. This change is required for the NVMe bdev module to be resilient for I/O error. The NVMe bdev module does not retry I/O if SC is SC_ABORTED_BY_REQUEST. SC is set to SC_INTERNAL_DEVICE_ERROR if a request is failed to submit to qpair by a generic qpair layer. We can change it to SC_ABORTED_SQ_DELETION as well but we keep this for now. SC_INTERNAL_DEVICE_ERROR is also retriable for the NVMe bdev module. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I7d8d5e97b222fe9275afc4fed024c1654c9579a2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12121 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-22 09:44:57 +00:00
zhangduan	31db7b139b	nvme_tcp: set transport_ack_timeout to ack_timeout The value of ack_timeout is calculated according to the formula 2^(transport_ack_timeout) msec. Signed-off-by: zhangduan <zhangd28@chinatelecom.cn> Change-Id: I5a938635d70693ddd405fa5907555bb745b4df0f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12215 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-20 08:21:42 +00:00
Konrad Sztyber	aa21240574	nvme/pcie: increase min admin queue size to 256 Now that IO qpairs can be created asynchronously, we need to make sure that all the create IO CQ/SQ commands can be executed simultaneously. It is pretty common to create multiple IO qpairs at the same time, e.g. adding an NVMe bdev to an nvmf subsystem will create an IO qpair on each poll group. In that case, if the number of cores exceeds the size of the admin queue (actually it can be even lower due to outstanding AERs), we might run out of nvme_requests on the admin queue. The chosen minimum value for the admin queue size, 256, should be enough to cover most cases. Fixes #2465 Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I55c59aef64f3fdb33f7b4824d3e9beb403602633 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12270 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-19 08:18:34 +00:00
Shuhei Matsumoto	2c13441ba8	nvme_rdma: Destroy qpair after qpair is actually disconnected The RDMA transport can disconnect qpair asynchronously now. Previously, we tried to release the resource of the qpair after disconnected. However it did not work because it was done when deleting the qpair. The admin qpair was not deleted in a ctrlr reset sequence. This patch tries to satisfy the same aim again but by a different way. Previously, we released the resource of the qpair before starting actual disconnection process. This patch release the resource of the qpair after the qpair is actually disconnected. The related patches are: `b9518a5540` `eb09178a59` Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Id6a814895a35b1589b781a91744ef872b42aaa69 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11783 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	4b73223542	nvme_rdma: Wait until lingering qpair becomes quiet before completing disconnection The code to handle the lingering qpair when deleting it was really complicated. The RDMA transport can connect or disconnect qpair asynchronously. Then we can include the code to handle the lingering qpair into the code to disconnect qpair now. If the disconnected qpair is still busy, defer completion of the disconnection until qpair becomes idle. If poll group is not used, we can complete disconnection immediately because cq is already destroyed. The related data and unit test cases are not necessary anymore. So delete them in this patch. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ic8f81143fcad0714ac9b7db862313aa8094eeefb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11778 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	20cf90801e	nvme_rdma: Handle stale connection asynchronously Include delayed disconnect/connect retries with finite times into the state machine of asynchronous qpair connection. We do not need to call back to the common transport layer but we need to do the following, clear rqpair->cq before starting disconnection if qpair uses poll group, and clear qpair->transport_failure_reason after disconnected. Additionally locate the new state STALE_CONN before INITIALIZING because cq is not ready to use for admin qpair when the state is STALE_CONN. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ibc779a2b772be9506ffd8226d5f64d6d12102ff2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11690 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	77c4657140	nvme_rdma: Factor out destroying rdma qpair operation Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I18e166a726cca69f13e7c5818eba57f478726286 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11689 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	aa36c18196	nvme_rdma: Pass callback to ctrlr_disconnect_qpair() via a parameter Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I06cbb9739286d1928ad9fc07de3715a449914d75 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11688 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	75d38a301d	nvme: poll_group_process_completions() returns -ENXIO if any qpair failed TCP transport already does it but was not documented clearly. RDMA and PCIe transports follow it and document it clearly. Then we can check each qpair's state if spdk_nvme_poll_group_process_completions() returns -ENXIO before disconnected_qpair_cb() is called. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I2afe920cfd06c374251fccc1c205948fb498dd33 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11328 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	9717b0c3df	nvme_rdma: Connect and disconnect qpair asynchronously Add three states, INITIALIZING, EXITING, and EXITED to the rqpair state. Add async parameter to nvme_rdma_ctrlr_create_qpair() and set it to opts->async_mode for I/O qpair and true for admin qpair. Replace all nvme_rdma_process_event() calls by nvme_rdma_process_event_start() calls. nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to INITIALIZING when starting to process CM events. nvme_rdma_ctrlr_connect_qpair_poll() calls nvme_rdma_process_event_poll() with ctrlr->ctrlr_lock if qpair is not admin qpair. nvme_rdma_ctrlr_disconnect_qpair() returns if qpair->async is true or qpair->poll_group is not NULL before polling CM events, or polls CM events until completion otherwise. Add comments to clarify why we do like this. nvme_rdma_poll_group_process_completions() does not process submission for any qpair which is still connecting. Change-Id: Ie04c3408785124f2919eaaba7b2bd68f8da452c9 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11442 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Changpeng Liu	c47b7b0276	nvme/vfio-user: use API to setup BAR0 doorbells We can use lib/vfio-user API to setup BAR0 doorbells, existing implementation is redundant. Change-Id: Ib880d167c84c6b8482bf1a35559a34c939f6a02d Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12211 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: John Levon <levon@movementarian.org> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-12 07:24:22 +00:00
Tomasz Zawadzki	6301f8915d	lib/sock: provide a hint to picking optimal poll group The process of matching qpair to poll group is split into two distinct parts that occur on different threads. See spdk_nvmf_tgt_new_qpair(). This results in a race condition for TCP between spdk_sock_map_lookup() and spdk_sock_map_insert(), which are called in spdk_nvmf_get_optimal_poll_group() and spdk_nvmf_poll_group_add() respectively. Fixes #2113 This patch picks a hint from nvmf_tcp for next poll group, which is then passed down to spdk_sock_map_lookup(). When matching placement_id exists, but does not have a poll group assigned - the hint will be used. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I4abde2bc9c39225c9f5dd7c3654fa2639bb0a27f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10271 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-01 12:41:26 +00:00
Shuhei Matsumoto	0a61427ecc	nvme_rdma: Start qpair after resolving address and route when poll group is used Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I0b0f314c98368247582f2dfcaf69f78e24d715f9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11366 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	531c1b0f04	nvme_rdma: Make nvme_rdma_process_event() asynchronous Separate nvme_rdma_process_event() into nvme_rdma_process_event_start() and nvme_rdma_process_event_poll(). Use nvme_rdma_process_event_start() and nvme_rdma_process_event_poll() in nvme_rdma_process_event() to ensure compatibility. Change-Id: Idc960fab2540efec612dcf22f156acabd2e2874e Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10594 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	791ee7deb4	nvme_rdma: nvme_rdma_process_events() returns negated errno It will be convenient for the following patches to return negated errno directly. Change-Id: Ic80181b2ee449946dd60ad0c97a325fd48b92231 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10990 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	cf7f253302	nvme_rdma: Add callback to nvme_rdma_process_event() Change-Id: I66aa89dc54d5aaedbe2f06239cbf04aeeb2c739e Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11359 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	bcf0845727	nvme_rdma: Make CM event operations callback functions Change-Id: I9f2551a07187400dd9ef624348cd465e64557e1b Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11138 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	e5927c02e9	nvme_rdma: Remove cm_channel param from process_event() nvme_rdma_poll_events() gets the cm_channel pointer itself. Before calling nvme_rdma_process_event(), we check that the rctrlr is valid. Hence we do not have to pass the cm_channel pointer to nvme_rdma_process_event() via a parameter. This simplifies the code and makes the following patches a little easier. Change-Id: I03f095833469c5b64592264d63a592106d49e13b Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11167 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	29974dc882	nvme_rdma: Make fabric_qpair_connect() asynchronous Replace nvme_fabric_qpair_connect() by nvme_fabric_qpair_connect_async() and nvme_fabric_qpair_connect_poll(). The following is a detail. Define state of the nvme_rdma_qpair and each rqpair holds it. Initialize rqpair->state by INVALID at nvme_rdma_ctrlr_create_qpair(). _nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to FABRIC_CONNECT_SEND instead of calling nvme_fabric_qpair_connect(). Then the new function nvme_rdma_ctrlr_connect_qpair_poll() calls nvme_fabric_qpair_connect_async() at FABRIC_CONNECT_SEND and nvme_fabric_qpair_connect_poll() until it returns 0 at FABRIC_CONNECT_POLL. nvme_rdma_qpair_process_completions() or nvme_rdma_poll_group_process_completions() calls nvme_rdma_ctrlr_connect_qpair_poll() if qpair->state is CONNECTING. This pattern follows the TCP transport. Change-Id: I411f4fa8071cb5ea27581f3820eba9b02c731e4c Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11334 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-01 08:28:45 +00:00
Evgeniy Kochetov	a2d4ddb3b1	nvme: Prioritize user provided trstring for transport lookup This patch fixes the issue with custom nvme transport. It is possible to register custom nvme transport with arbitrary name but it is not usable because 'spdk_nvme_trid_populate_transport' call in probe function will always set trstring to 'CUSTOM' and transport lookup will fail. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I83fd24dd8732ac0a21e22435e0acff20ab0e7521 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9557 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-31 10:31:20 +00:00
Jim Harris	7039639063	nvme: read full discovery page after reading header Some targets report they support log page offset, but then fail GET_LOG_PAGE commands that specify a non-zero offset, or report the wrong number of discovery entries when reading more than the discovery log page header but not the entire log page. So just revert to reading the entire discovery log page, after we've read the header to know how big the log page will be. This means that when we read the log page initially (without the individual entries), we need to save off the genctr, since it will get overwritten when we read the log page again. We can just store this in the discovery context, and compare it to the genctr that we read with the whole log page. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I34929253312fed9924db58904a051f3979283730 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11478 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-03-28 17:10:04 +00:00
Alexey Marchuk	94494579ce	nvme_rdma: Update reporting of RDMA responder resources responder_resources parameter of rdma cm tells remote side how many outstanding RDMA_READ or atomic operations local side can handle. Previously it was adjusted on queue depth but that was not correct since these parameters do not depend on each other. Even with qdepth=1 remote side may send several RDMA_READ operations per 1 IO request. With this change we report responder_resources equal to the maximum supported by RDMA device. Linux kernel nvme rdma driver reports this value in the same way. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I77e5c2ead6269da44c32a75a9188429f50d32ae4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11698 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-25 08:18:37 +00:00
Krzysztof Karas	a9a55513e5	nvme_ctrlr.c: Add error logs Add NVME_CTRLR_ERRLOGs to nvme_ctrlr_process_init(). The main goal is to help with debugging #2201 issue. Change-Id: I1ae6a9b30d6124dfe25eb7912402c37d476b0d4c Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10627 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-03-24 09:57:17 +00:00
Shuhei Matsumoto	6a89f75ec7	nvme_rdma: Remove handling stale connect The feature will be redesigned and restored in the following patches. For the NVMe bdev module, it can reconnect by itself without relying on the feature. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I2d9c0437f7ad8412ad8cf40d11e574723b735bee Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11440 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	0c77cf90bf	nvme_rdma: Consolidate fail_qpair() calls into a single place For nvme_rdma_qpair_process_completions(), consolidate the operations to call nvme_rdma_fail_qpair() and return -ENXIO into a single place. Besides, shorten pointer references for nvme_rdma_qpair_process_completions() and nvme_rdma_poll_group_process_completions(). These will make the following patches a little easier. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Iaf72cfca0b5b3ba223d86e267da8069d43a15292 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11439 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	df7c2a2253	nvme: Call ctrlr_disconnect_done() after qpair_process_completions() returns -ENXIO Add a new flag is_disconnecting to struct spdk_nvme_ctrlr. Separate calling nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done() by using the flag is_disconnecting. Additionally, change nvme_ctrlr_fail() to skip setting ctrlr->is_failed to true if ctrlr->is_disconnecting is true. Change-Id: Ie2c74ba41f120662a30f6198751d07005d23abcf Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11000 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	8926303b59	nvme: Caller polls qpair until disconnected if async connect failed nvme_transport_ctrlr_connect_qpair() calls nvme_transport_ctrlr_disconnect_qpair() if failed. If async qpair disconnect is supported, even when connect qpair failed, nvme_transport_ctrlr_connect_qpair() may complete asynchronously later. The cases that qpair->async is set to true are I/O qpair for the NVMe bdev module and admin qpair. example/nvme/perf and example/nvme/reconnect use I/O qpair but both set qpair->async to false. For the NVMe bdev module, I/O qpair is connected when creating I/O channel or resetting ctrlr. If spdk_nvme_ctrlr_connect_io_qpair() returns 0 for a I/O qpair, the qpair is in a poll group and is polled by spdk_nvme_poll_group_process_completions() and a disconnected callback is called to the qpair. Hence we do not need to add additional polling for I/O qpair in the NVMe bdev module. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I6e0aadcfd98e5cb77b362ef1a79e0eca2985f36e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11112 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	9d1063d732	nvme: qpair_process_completions() processes if qpair is disconnecting If poll group is not used and if qpair is disconnecting, spdk_nvme_qpair_process_completions() has to poll qpair until it is actually disconnected even if ctrlr is failed. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I6da84f1e35780d21480fbe6f07e76af3048a777b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11018 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	21322e01dd	nvme: Separate spdk_nvme_ctrlr_disconnect() into disconnect and after it Separate spdk_nvme_ctrlr_disconnect() into nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done() to call nvme_ctrlr_disconnect_done() after adminq is actually disconnected when disconnecting adminq asynchronously. The following patches will add a new flag is_disconnecting to struct spdk_nvme_ctrlr and prevent us from setting ctrlr->is_failed to true between nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done(). By this patch, nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done() are executed in the same context. So it is not possible to set ctrlr->is_failed to true between nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done(). Hence nvme_ctrlr_disconnect_done() does not have to clear ctrlr->is_failed again. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I18b5b68f37e27b54782691823edae9738c26faa1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10999 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	4a73675dbf	nvme: Remove deprecated spdk_nvme_ctrlr_reset_async() and _reset_poll_async() Change spdk_nvme_ctrlr_reset() to use spdk_nvme_ctrlr_disconnect(), spdk_nvme_ctrlr_reconnect_async(), and spdk_nvme_ctrlr_reconnect_poll_async(). Then remove the deprecated spdk_nvme_ctrlr_reset_async() and spdk_nvme_ctrlr_reset_poll_async(). These changes simplify the following patches to make spdk_nvme_ctrlr_disconnect() asynchronous. Change-Id: Ia71e8e0ad5b2dff42b7423634f66de47863926e2 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10913 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	cfe11bd1db	nvme: Factor out operations done after disconnect qpair completes This is a preparation to make nvme_transport_ctrlr_disconnect_qpair() asynchronous. For nvme_transport_ctrlr_disconnect_qpair(), factor out operations after returning from transport's specific ctrlr_disconnect_qpair() into a helper function nvme_transport_ctrlr_disconnect_qpair_done(). Then move nvme_transport_ctrlr_disconnect_qpair_done() into the end of the transport specific ctrlr_disconnect_qpair(). Additionally remove the operation to overwrite the qpair state to DISCONNECTED from nvme_transport_connect_qpair_fail() because this is duplicated and nvme_transport_ctrlr_disconnect_qpair() is responsible to make the qpair disconnected even after it completes asynchronously. Change-Id: I9c8faa7039d306d3e31a8f51826755ce8840a8aa Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10851 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	1285481917	nvme: Free I/O qpair now even if it is in poll group completion spdk_nvme_poll_group has followed spdk_nvme_qpair about how to process I/O qpair deletion inside of a completion context. spdk_nvme_qpair_process_completions() accesses qpair after returning from nvme_transport_qpair_process_completions(). So this is reasonable. On the other hand, if spdk_nvme_poll_group_process_completions() can execute spdk_nvme_ctrlr_free_io_qpair() inside of a completion context, the target qpair is ensured to be deleted after returning from spdk_nvme_ctrlr_free_io_qpair(). Then the target qpair is not accessed anymore in spdk_nvme_poll_group_process_completions(). Remove two variables, in_completion_context and num_qpairs_to_delete, of spdk_nvme_transport_poll_group and the related code. This change is really necessary to support the following case. In the NVMe bdev module, a nvme_qpair has a qpair and a poll_group channel. disconnected_qpair_cb calls spdk_nvme_ctrlr_free_io_qpair() for the qpair and spdk_put_io_channel() to the poll_group_channel. spdk_nvme_ctrlr_free_io_qpair() is executed after unwinding stack but spdk_put_io_channel() is executed now. The callback to spdk_put_io_channel() calls spdk_nvme_poll_group_destroy(). However, spdk_nvme_ctrlr_free_io_qpair() is not executed. Hence spdk_nvme_poll_group_destroy() fails. Update the corresponding stub in unit test together. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Icd1f1daf049c6c7ffb28790fe87989a1060f8952 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11496 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-03-15 09:05:09 +00:00
Shuhei Matsumoto	486f46e867	nvme_rdma: Call disconnected_qpair_cb when qpair is in disconnected_qpairs list We want to call disconnected_qpairs_cb only if qpair is actually disconnected. When we disconnect qpair asynchronously, for qpairs in the group->disconnected_qpairs list, we want to poll them until actually disconnected and then call disconnected_qpairs_cb for them. As a preparation, call disconnected_qpair_cb only for qpairs which is in the group->disconnected_qpairs list. For TCP and PCIe transports, disconnecting qpair will continue to be synchronous for now. So we change only RDMA transport. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ifaf6157e1e02fa13f52a66409c9e60fc814d71dd Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11495 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-15 09:05:09 +00:00
Shuhei Matsumoto	999f0362ab	nvme_tcp: Remove qpair from group->needs_poll when removing it from poll_group spdk_sock_close() may not complete synchronously. Then the following scenario is possible. ctrlr_disconnected_qpair() calls spdk_sock_close(). Then any request may complete and call _pdu_write_done(). qpair is still associated with a poll_group and is inserted into the group->needs_poll list. After ctrlr_disconnected_qpair() completes, disconnected_qpair_cb() is called. disconnected_qpair_cb() calls spdk_nvme_ctrlr_free_io_qpair(). spdk_nvme_ctrlr_free_io_qpair() calls spdk_nvme_poll_group_remove(). spdk_nvme_poll_group() removes qpair only from the group->disconnected_qpairs list. Even after qpair->poll_group is nullified, qpair is still in the group->needs_poll list. Then spdk_nvme_poll_group_process_completions() polls all qpairs in the group->needs_poll list and hits the assert. Fixes the issue #2390 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I6ede8bd3b7b1a57a34ac61436159975ab6253fbe Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11882 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-03-14 08:44:03 +00:00
John Levon	c4f7ddd2c7	lib/nvme: report shadow doorbell update stats Currently shadow doorbell updates are not counted; add statistics for those, and rename the other statistic for clarity. Signed-off-by: John Levon <john.levon@nutanix.com> Change-Id: I211a77902e38265c99b15862034c6d022dc582a0 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11844 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-10 09:49:25 +00:00
John Levon	9e239b1c1e	lib/nvme: remove needless check nvme_pcie_qpair_update_mmio_required() is only called for QPs with shadow doorbells enabled, so we don't need an additional branch here. Signed-off-by: John Levon <john.levon@nutanix.com> Change-Id: Idf65f92deb0c6f93019892ef5a02bc28ba4c0f8c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11843 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-10 09:49:25 +00:00
John Levon	8071e0f1dc	lib/nvme: drop unused argument nvme_pcie_qpair_update_mmio_required() qpair argument is unused. Signed-off-by: John Levon <john.levon@nutanix.com> Change-Id: I0bd6897eb8e6a06f211cc599ab6045409bb43641 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11842 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-10 09:49:25 +00:00
John Levon	ba4ffda671	lib/nvme: correct typo in transport stats "doorbell" not "doobell" Signed-off-by: John Levon <john.levon@nutanix.com> Change-Id: I9261559576e72a09b63fbc984ae0ec2a2572eb2c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11841 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-03-09 08:03:42 +00:00
Jim Harris	12d522515f	nvme: simplify spdk_nvme_transport_id_populate_trstring Note that this also works around a false positive in gcc-11 of type -Wstringop-overread. Fixes issue #2391. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ib5301b9ef9fa3ead2a1a2318655533a8cfba03fe Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11709 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-02-28 11:07:05 +00:00
Evgeniy Kochetov	5c80b1e5ab	nvme/rdma: Limit max_sges by command capsule size According to the NVMe over Fabrics spec, the number of SGLs supported by the controller is reported in MSDBD. It is also implicitly limited by the command capsule size (IOCCSZ), since SGLs are passed in the capsule. This patch adjusts max_sges to the capsule size if required. The adjustment to MSDBD is also moved to the transport layer because it is a fabrics-specific parameter and is not valid for the PCIe transport. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I44918eb949345c61242ca50a524d21d04b6ac058 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11669 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-02-25 08:18:32 +00:00
Shuhei Matsumoto	7594030409	nvme: Set dnr to zero for abort_reqs() including a fix of degradation The patch nvme: Set dnr to zero for nvme_qpair_abort_reqs() `1b3172f726` did the change stated in the title. However, Revert "nvme/rdma: Correct qpair disconnect process" `c8f986c7ee` destroyed it for RDMA transport. Additionally, we had still set DNR to 1 in nvme_qpair_init(). This patch fixes both. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Iee60ac24aa7e04cce0f394014c9d9afc9d2b56ec Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11644 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-02-24 14:56:03 +00:00
Jim Harris	635d0cbe75	nvme: allocate extra request for fabrics connect With async connect, we need to avoid the case where the initiator is sending the icreq, and meanwhile the application submits enough I/O such that the request objects are exhausted, leaving none for the FABRICS/CONNECT command that we need to send after the icreq is done. So allocate an extra request, and then use it when sending the FABRICS/CONNECT command, rather than trying to pull one from the qpair's STAILQ. Fixes issue #2371. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: If42a3fbb3fd9d863ee48cf5cae75a9ba1754c349 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11515 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-14 15:29:39 +00:00
Jim Harris	a97200ad45	nvme: optimize struct spdk_nvme_qpair packing Group fields such that those not used in the I/O path are at the end of the structure. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I43eca1faacd29a5bf34be6ee644191d865cd42a9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11514 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-14 15:29:39 +00:00
Jim Harris	56618eacb9	nvme: add NVME_INIT_REQUEST macro This macro will be used in an upcoming patch that needs to construct an nvme_request structure outside of the standard nvme_allocate() routines. Examined x86 optimized assembly with this patch, and there is no change. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I0f6b8500e06b56edc33f437f351536cf857d13d3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11513 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-14 15:29:39 +00:00
Evgeniy Kochetov	834e3c5a0e	nvme: Fix submission queue overflow SPDK can submit more commands to remote NVMf target than allowed by negotiated queue size. SPDK submits up to SQSIZE commands, but only SQSIZE-1 are allowed. Here is a relevant quote from NVMe over Fabrics rev.1.1a ch.2.4.1 “Submission Queue Flow Control Negotiation”: If SQ flow control is disabled, then the host should limit the number of outstanding commands for a queue pair to be less than the size of the Submission Queue. If the controller detects that the number of outstanding commands for a queue pair is greater than or equal to the size of the Submission Queue, then the controller shall: a) stop processing commands and set the Controller Fatal Status (CSTS.CFS) bit to ‘1’ (refer to section 10.5 in the NVMe Base specification); and b) terminate the NVMe Transport connection and end the association between the host and the controller. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: Ifbcf5d51911fc4ddcea1f7cde3135571648606f3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11413 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-02-10 15:22:08 +00:00
Evgeniy Kochetov	486426529d	nvme/rdma: Remove queue depth adjustment to crqsize According to NVMe over Fabrics specification (rev.1.1a) HSQSIZE sent in RDMA_CM_REQUEST private data (ch.7.3.6.4) shall be the same as SQSIZE later sent in Connect command (ch.3.3). SPDK NVMe RDMA initiator adjusts SQSIZE to CRQSIZE received from target in RDMA_CM_ACCEPT private data. Target is allowed to send CRQSIZE < HSQSIZE if RNR retries are used. So, it is possible that SQSIZE sent by SPDK will be lower than previously sent HSQSIZE. There are targets validating this match and they reject connection from SPDK. Linux kernel NVMe initiator doesn't perform such adjustments and connects well to such targets. This patch aligns SPDK behavior with specification and Linux kernel implementation. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I01968d1c07d284396fa5939932d85841351d7a45 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11350 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-02-10 15:22:08 +00:00
Jaylyn Ren	3e937f07eb	test/accel&rdma: Fix unittest_accel and unittest_nvme_rdma failure Running unittest_accel and unittest_nvme_rdma with valgrind reports errors about an uninitialised value created by a stack allocation. Change-Id: I4b48b472cc7c189cbcaf8ca772830a23118e7e17 Signed-off-by: Jaylyn Ren <jaylyn.ren@arm.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10559 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-02-09 22:22:04 +00:00
Tomasz Zawadzki	047c067c05	so_ver: increase all major versions To allow SO_MINOR updates on LTS for the whole year it is supported, the major version for all components needs to be increased. This prevents a scenario where two releases exist with matching versions but conflicting ABI. For example, the next SPDK release adds an API call, increasing the minor version, and then LTS needs just a subset of those additions. Increasing the major SO version after LTS allows future releases to update versions as needed, while still allowing LTS to increase the minor version separately. Disabled the test for increasing SO version without ABI change, as that is the goal of this patch. This check shall be removed with the SPDK 22.05 release. This patch: - increases SO_VER by 1 for all components - resets SO_MINOR to 0 for all components - removes suppressions for ABI tests Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Id1a5358882dc496faa5b0b5c9a63b326c378c551 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11361 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-01-31 15:29:56 +00:00
Shuhei Matsumoto	fc48cf8681	nvme_rdma: Check only if Soft RoCE receive normal completion after disconnect We saw this unexpected behavior on the current SPDK master. Add the check to clarify that this behavior occurs only when we use Soft RoCE. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I3a5eaa9064a0601c65139e7868898545926d0dbf Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11225 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com>	2022-01-26 08:09:15 +00:00

1603 Commits