ivampiresp/Spdk - Spdk - Leaflow Developers

Author	SHA1	Message	Date
MengjinWu	e4569bd421	test/nvme_tcp: Correct the psh_len in nvme_tcp unittest The psh len is not the same as the header len. Add an assert in nvme_tcp.c to prevent this from happening again. Signed-off-by: MengjinWu <mengjin.wu@intel.com> Change-Id: Ibc250752bedf3da8994f79c51fb01577a222d364 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14521 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-09-20 20:29:40 +00:00
MengjinWu	0b7f5a57ac	nvme/tcp: remove unnecessary if check in nvme_tcp_read_pdu This "if" is of no use here. The state machine has the "NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_CH" state means the pdu does not receive enough length of header. Signed-off-by: MengjinWu <mengjin.wu@intel.com> Change-Id: Id50943f77b570fd337e2bb4e3b45281018d159e4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14504 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-09-20 20:29:40 +00:00
Aleksey Marchuk	c66b68e94e	nvme/rdma: Inline nvme_rdma_calloc/free These functions used to allocate resources using calloc/spdk_zmalloc depending on the g_nvme_hooks pointer. Later these functions were refactored to always use spdk_zmalloc, so they became simple wrappers of spdk_zmalloc and spdk_free. There is no sense to use them, call spdk memory API directly. Signed-off-by: Aleksey Marchuk <alexeymar@nvidia.com> Change-Id: I3b514b20e2128beb5d2397881d3de00111a8a3bc Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14429 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-09-20 20:27:52 +00:00
Aleksey Marchuk	77aef307fd	nvme/rdma: Don't reg MRs for cmds and rsps Since now cmds and rsps buffers are allocated from huge pages, there are already registered MR for this memory. In that way we can avoid registering 2 additional MRs per qpair, just perform memory translation to get lkey. Signed-off-by: Aleksey Marchuk <alexeymar@nvidia.com> Change-Id: I2cb39a15e5d224698c293ac18af00a909840eaa8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14428 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-09-20 20:27:52 +00:00
MengjinWu	48312019c8	nvme/tcp: Remove duplicate code in nvme_tcp_read_pdu Signed-off-by: MengjinWu <mengjin.wu@intel.com> Change-Id: I63f51ecba2b4d40579d2592d2c85a7aefdacf7e7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14503 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-09-15 19:25:02 +00:00
MengjinWu	31fc5f196f	nvme/tcp: simplify state change function The state change function does not need to use a switch to do some work. Do the memset in the state machine. Signed-off-by: MengjinWu <mengjin.wu@intel.com> Change-Id: Ie66454d8f31860f403171f20858a6b4a24e3c76f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14502 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>	2022-09-15 19:25:02 +00:00
Boris Glimcher	35f7f0ce1e	nvme/tcp: Allow to choose SSL socket implementation Adding `psk` field to `spdk_nvme_ctrlr_opts` Adding `psk` parameter to `bdev_nvme_attach_controller` RPC Change-Id: Ie6f0d8b04ce472e6153934e985c026acded6cdfc Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14046 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-09-14 07:44:53 +00:00
Shuhei Matsumoto	cdf61c2f22	nvme: Polls only the qpair if ctrlr is not fabrics when connecting synchronously For non-fabric controllers, the corresponding I/O qpairs are simply re-enabled at controller reset. This had an issue when I/O qpairs span multiple threads and a poll group is used. spdk_nvme_ctrlr_reconnect_poll_async() calls nvme_transport_ctrlr_connect_qpair() with qpair->async being false. Then nvme_transport_ctrlr_connect_qpair() calls spdk_nvme_poll_group_process_completions() until the qpair is connected. spdk_nvme_poll_group_process_completions() may poll other qpairs. This may cause I/O to complete on a wrong thread. For PCIe controller, spdk_nvme_poll_group_process_completions() calls spdk_nvme_qpair_process_completions() simply for each qpair. Hence change nvme_transport_ctrlr_connect_qpair() to call spdk_nvme_qpair_process_completions() if the controller is non-fabrics. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ieb270c2fb154124021ef6d25577b817d05e5ca9e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14295 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-09-05 12:50:00 +00:00
Shuhei Matsumoto	0e4b13dc53	nvme_rdma: Destroy qpair after it is disconnected and drained By the previous patches, a qpair is destroyed after it is actually disconnected. But after the qpair is destroyed, it is checked if drained by using rqpair->current_num_sends and rqpair->current_num_recvs. However, if the qpair is the last of a poller of a poll group, CQ is destroyed before checking if the qpair is drained. If CQ is destroyed, at least rqpair->current_num_recvs is not updated, and we may get one second timeout. This should be avoided. Hence, destroy the qpair after it is disconnected and drained. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ibd6c83e8a3e7b6e11e9b45cee42669da6d42a621 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14278 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-09-05 12:49:11 +00:00
Shuhei Matsumoto	1d58eb038b	nvme_rdma: Release poller from poll group when qpair is actually disconnected If the being disconnected qpair is the last of a poller of a poll group, CQ is destroyed and the poller is released before the qpair is actually disconnected. This patch destroy CQ and release the poller after the qpair is actually disconnected. One exception is when spdk_nvme_ctrlr_free_io_qpair() is called to a connected qpair. In this case, the qpair is removed from a poll group before the qpair is actually disconnected. In this case, destroy CQ and release the poller when the qpair is removed from the poll group. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Idf266bbb6dbb40f04ae6313db724fabf80865763 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14253 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-09-05 12:49:11 +00:00
Shuhei Matsumoto	80d75fda06	nvme_rdma: Clean up releasing poller from poll group We have two cases to call nvme_rdma_poll_group_put_poller(). For consistency, make the two cases the same sequence. This will make the next patch easier. The next patch will release poller from poll group when qpair is actually disconnected as possible as we can. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I4178113d5277240e287e83a57e97cf32fd0f7457 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14252 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-09-05 12:49:11 +00:00
Jim Harris	b90d7b5b43	nvme: add admin queue size quirk for Hyper-V Hyper-V NVMe SSD controllers require admin queue size to be even multiples of a page. Add quirk to adjust the admin queue size if user overrides the default value to something other than an even multiple. As part of this change, set the quirks earlier when constructing a pcie controller, so that the quirks value can be used in the generic nvme_ctrlr_construct() function. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I417cd3cdc7e3ba512ec412f4876b0e0b7432341c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14220 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-09-01 08:31:46 +00:00
yidong0635	b813f998ea	nvme_pcie_common: Move group right before using. Better not to cache a value especially for there's an error return. Signed-off-by: yidong0635 <dongx.yi@intel.com> Change-Id: I3b243a66f4db9af34bc2ea01bafdac33004be128 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13650 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-09-01 08:26:34 +00:00
Jim Harris	3d59045a2a	nvme: remove incorrect comment about spdk_nvme_ctrlr structs This was correct back when we only supported PCIe, but doesn't in the newfangled world of fabrics and vfio-user. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I565edd2dab1eff862844585df8c25da508e4816d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14136 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-08-30 16:20:23 +00:00
Shuhei Matsumoto	4a6f858872	nvme_rdma: Set REUSEADDR to reuse source address among multiple CM IDs When we specify source address for admin and I/O qpairs, rdma_resolve_addr() succeeded only for admin qpair and failed for following all I/O qpairs because rdma_resolve_addr() returned -EADDRINUSE. To reuse source address among multiple qpairs, set the REUSEADDR option for each CM ID before executing rdma_resolve_addr() if source address is specified. We may miss something. Even if rdma_set_option() fails, execute rdma_resolve_addr(). Fixes issue #2604 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: If03f82d4499cf83c0e428a62e91c9d9e6aad28e0 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14229 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com>	2022-08-29 11:41:17 +00:00
Jim Harris	4300c62167	nvme: add spdk_nvme_ctrlr_disable_read_changed_ns_list_log_page() Commit `a119799b` ("test/nvme/aer: remove duplicated changed NS list log") changed the nvme driver to read the CHANGED_NS_LIST log page before calling the application's AER callback (previously it would read it after). Commit `b801af090` ("nvme: add disable_read_changed_ns_list_log_page") added a new ctrlr_opts member to allow the application to tell the driver to not read this log page, and will read the log page itself instead to clear the AEN. But we cannot add this option to the 22.01 LTS branch since it breaks the ABI. So adding this API here, which can then be backported manually to the 22.01 branch for LTS users that require it. Restoring the old behavior is not correct for applications that want to consume the CHANGED_NS_LIST log page contents itself to know which namespaces have changed. Even if the driver reads the log page after the application, that read could happen during a small window between when a namespace change event has occurred and the AEN has been sent to the host. The only safe way for the application to consume CHANGED_NS_LIST log page contents itself is to make sure the driver never issues such a log page request itself. Fixes issue #2647. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iaeffe23dc7817c0c94441a36ed4d6f64a1f15a4e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14134 Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-08-25 07:31:44 +00:00
Jim Harris	e36f0d363e	nvme/pcie, nvme/tcp: add cb_arg context tracepoint argument This allows mapping an nvme_request back to the nvme_bdev_io. This requires bumping up the max number of arguments per tracepoint. 5 was previously chosen as max since it exactly fit in 64 bytes (1 cacheline) when all arguments were stored as uint64_t, but now that we support uint32_t arguments we can afford extra arguments when some of them are uint32_t. I've bumped it to 8 so we can avoid having to touch this value multiple times if we find some cases where we need 7 or 8 args. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ie2ef5e59d10549860b47542e68c1c34efa63047f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13995 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-08-19 11:06:31 +00:00
Jim Harris	0f068506ca	nvme: complete register_operations in the correct process In multi-process, we need to make sure we don't complete a register_operation in the wrong process. So save the pid in the nvme_register_completion structure when it is inserted into the STAILQ, then only complete operations where the pid matches. Fixes issue #2630. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I58c995237db486fecdd89d95e9e7a64379d0b0e5 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13940 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-08-18 10:09:55 +00:00
Jim Harris	b801af090a	nvme: add disable_read_changed_ns_list_log_page Similar to the disable_read_ana_log_page ctrlr_opt, this enables the application to tell the NVMe driver to not read the CHANGED_NS_LIST log page in response to a NS_ATTR_CHANGED AEN, and will do the read itself. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ie447734187d4a4cb95ceef6e0131b640b8ba5984 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14088 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2022-08-18 10:08:40 +00:00
Jim Harris	c50cb569de	include: add STATIC_ASSERTS for opts structures with size member Various opts structures in SPDK have a size member, to enable ABI compatibility should fields be added in the future. But this requires the structures to be packed, otherwise for example a structure may be padded at the end, and a new field added may just consume some of that padding. So add STATIC_ASSERTS for the current sizes in this patch. Upcoming patches will make the structures packed and add in reserved fields to fill in holes. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I9107d01d7b533f8542385a3538894bcd9f8c465d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14086 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Community-CI: Mellanox Build Bot	2022-08-18 10:08:40 +00:00
Shuhei Matsumoto	e93ba047ac	nvme: Restore complete_abort_queued_reqs() call into process_completions() spdk_nvme_qpair_process_completions() had called always _nvme_qpair_complete_abort_queued_reqs() at its end. However, the call was accidentally removed by a commit `59c8bb527b` to fix an issue. By this removal, aborting request was not completed for some error cases. Fix the degradation by restoring the call. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I0099eb7a008f823e1282576504423cdc248911d7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14045 Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2022-08-17 07:17:17 +00:00
Ziv Hirsch	eda407a6f0	nvme: add support for verify command Signed-off-by: Ziv Hirsch <zivhirsch13@gmail.com> Change-Id: Ic9859d5078d9568bb28eefcf8fb70a7fc222ee15 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13928 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-08-16 10:25:01 +00:00
LiadOz	5c3360ce1f	nvme/nvme_tcp: Check for timeout when socket connection fails Fixes #2614 Signed-off-by: LiadOz <liadozil@gmail.com> Change-Id: Ie4942d52b1af42ed859338fc59f3e29dcd59e68c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13891 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Dong Yi <dongx.yi@intel.com>	2022-08-16 10:23:26 +00:00
Jim Harris	a6b7e1839d	nvme/tcp: add trace points for cmd submit/complete Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iad56e7a96cf0210bcf54825c8bcc39af9366b72c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13992 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>	2022-08-16 10:23:10 +00:00
Jim Harris	9396cb9a94	nvme/tcp: simplify outstanding_reqs handling Avoid putting a new req on the outstanding_reqs TAILQ until we know it can be initialized successfully. This avoids adding to the TAILQ only to remove it just after. This also simplifies the outstanding_reqs TAILQ handling, since reqs are now only inserted and removed in one place each. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I5ccc41c14abd541ffcf2a602246e0671386840c7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13991 Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-08-16 10:23:10 +00:00
Jim Harris	b0396da090	nvme/pcie: rename trace object to NVME_PCIE_REQ We were using "TR" for "tracker" previously, but we are tracing the nvme_requests, not nvme_trackers, so use the right names for the trace object to avoid confusion. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ia3886d74b162138c2cdbe0017224d9494f74966c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13990 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-08-16 10:23:10 +00:00
Jim Harris	97661e86b7	nvme/pcie: add cpl status to PCIE_COMPLETE trace event Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I51e87f0f23b84956f96ab2efc62ad99a8d74cd4e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13989 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-08-16 10:23:10 +00:00
Jim Harris	7b05b29d48	nvme/pcie: use 4-byte trace arguments where possible Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I24c3fd545cadc403ac1f3589c6242a08a7a2f517 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14000 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-08-16 10:23:10 +00:00
Shuhei Matsumoto	227d83e2fa	nvme: Use spdk_nvme_ctrlr_is_fabrics() to update ioccsz ioccsz is specific for fabrics. spdk_nvme_ctrlr_is_fabrics() returns true for custom fabrics transport. Hence we can use spdk_nvme_ctrlr_is_fabrics() safely in nvme_ctrlr_update_nvmf_ioccsz(). Before this change, in the unit tests, ctrlr->trid.trtype was set to zero at initialization. After this change, spdk_nvme_ctrlr_is_fabrics() should return false for most cases. SPDK_NVME_TRANSPORT_PCIE did not work. Hence, initialize ctrlr->trid.trtype by SPDK_NVME_TRANSPORT_CUSTOM_FABRICS instead. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I4bedcab4a9f2876c1c9463ff10ad0966754f1713 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13948 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-08-12 08:59:52 +00:00
Shuhei Matsumoto	cd65512d08	nvme_rdma: Fix assertion for rqpair->current_num_sends/recvs assert() in nvme_rdma_queue_recv_wr() was wrong and assert() in nvme_rdma_cq_process_completions() was missing. This patch fixes both. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Signed-off-by: Denis Nagorny <denisn@nvidia.com> Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: Ied057d75dbfd9e54ce3c3671355b9ec3acad7ff5 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13597 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-08-12 08:59:43 +00:00
Shuhei Matsumoto	41bb31a36d	nvme_rdma: Replace rdma_dereg_mr() by ibv_dereg_mr() rdma_reg_msgs() was replaced by ibv_reg_mr() recently to support persistent PD per RDMA device. The difference between rdma_dereg_mr() and ibv_dereg_mr() is only return value and errno. For consistency, replace rdma_dereg_mr() by ibv_dereg_mr(). Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I55e0743690e74f9510863bfa122a75d0632dce4e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13949 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-08-12 08:59:43 +00:00
Shuhei Matsumoto	d75daea532	nvme_rdma: Use persistent protection domain for qpair Get a PD for the device from the PD pool managed by the RDMA provider when creating a QP, and put the PD when destroying the QP. By this change, PD is managed completely by the RDMA provider or the hooks. nvme_rdma_ctrlr::pd was added a long time ago but is not referenced anywhere. Remove nvme_rdma_ctrlr::pd for cleanup and clarification. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: If8dc8ad011eed70149012128bd1b33f1a8b7b90b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13770 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-08-12 08:59:43 +00:00
Shuhei Matsumoto	a26d74173e	nvme: Increase major SO version An earlier commit added ctrlr_ready into struct spdk_nvme_transport_ops. However, the major SO version was not increased. Fixes: `3dd0bc9e` (nvme: Add transport controller ready step) Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Id903634f9aaf5bdaa62fd30e92a4fb39a985b86f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13981 Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-08-11 19:16:32 +00:00
Evgeniy Kochetov	3dd0bc9e09	nvme: Add transport controller ready step This step allows custom transports to perform extra actions or checks at controller initialization and fail initialization if required. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: Ic7cadae5398a35903917ceace3828f4371be63a3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12631 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-08-04 07:29:03 +00:00
Shuhei Matsumoto	4f2f1aa9c5	nvme_rdma: Use pd of rdma_qp instead of default pd of cm_id This is another preparation to create and use ibv_context and pd. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Signed-off-by: Denis Nagorny <denisn@nvidia.com> Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: Id594fa1ccb2daf535b1aaaef0a397bda2ec98578 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13710 Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-08-02 07:39:41 +00:00
Shuhei Matsumoto	a3a51453b8	nvme_rdma: Pass pd instead of cm_id to nvme_rdma_reg_mr() The following patches will create and use ibv_context and pd explicitly instead of using default ibv_context and pd created by rdmacm. As a preparation, pass pd instead of cm_id to nvme_rdma_reg_mr(). Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Signed-off-by: Denis Nagorny <denisn@nvidia.com> Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: Ifdcd18ed363b8ba4a23a920bf3559237e38821c6 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13599 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-08-02 07:39:41 +00:00
Konrad Sztyber	a818564374	nvme: check CSTS.CFS when initializing ctrlrs If Controller Fatal Status (CFS) bit is set, there's no point in waiting for CSTS.RDY and the only way to move forward with the initialization is to perform a controller reset. This fixes issues with test/nvme/sw_hotplug.sh when running under qemu. It seems that during that test, qemu marks the emulated NVMe drives as fatal, so if we didn't check CSTS.CFS, the initialization would time out. Fixes #2201. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I97712debc80c3dd6199545d393c0f340f29d33b2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13820 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Michal Berger <michal.berger@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-08-02 07:37:04 +00:00
Changpeng Liu	673c8a65e1	nvme: remove `nvme_ctrlr_init_ana_log_page` function The function `nvme_ctrlr_init_ana_log_page` is exactly same with `nvme_ctrlr_update_ana_log_page`, so remove it. Change-Id: I1ad51635f47cf95cfa6de217e3b9144885c3b74e Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13652 Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-07-28 07:07:31 +00:00
Evgeniy Kochetov	3b26e2c594	nvme/rdma: Create poller and CQ on demand Original implementation creates pollers and CQs for all discovered devices at poll group creation. Device (ibv_context) that has no references, i.e. has no QPs, may be removed from the system and ibv_context may be closed by rdma_cm. In this case we will have a CQ that refers to closed ibv_context and it may crash in ibv_poll_cq. With this patch pollers are created on demand when we create the first QP for a device. When there are no more QPs on the poller, we destroy the poller. This also helps to avoid polling CQs that don't have any QPs attached. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I46dd2c8b9b2902168dba24e139c904f51bd1b101 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13692 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-07-22 07:27:22 +00:00
Changpeng Liu	c88345ab3d	nvme: apply `nvme_pcie_poll_group_get_stats` to vfio-user Both PCIE and VFIO-USER can use the same APIs to get IO queue pair statistic data, so merge them here. Change-Id: Iadf9ead2bd5abaf11d2ef5d1884acb67369f85bb Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13538 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-07-22 06:43:35 +00:00
Changpeng Liu	dbecab8da0	nvme/pcie: make `nvme_pcie_ctrlr_delete_io_qpair` call trace multi-process safe When a secondary process exits without deleting its allocated IO queue pair, a new secondary process will clean up the previously allocated queue pair, and a segmentation fault will happen because `stat` inside the IO queue pair data structure can't be accessed in this cleanup process. Fixes issue #2565. Change-Id: I01a037642683901941b5268ac20d17b78b6c6350 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13537 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-07-21 08:11:50 +00:00
GangCao	0b92da6c48	NVMe/TCP: explicitly initialize the cpl structure To fix the Klocwork issues. Change-Id: Ib9e490cd3f2140a1c2f86300979efd604054b972 Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13695 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-07-18 10:16:29 +00:00
Alexey Marchuk	3512714b3f	nvme_fabrics: Lock mutex when processing set/get regs It is possible to get/set registers from any thread. During regs processing we poll the admin qpair to get a completion. At the same time, another thread can also poll the admin qpair, and that can lead to undefined behavior. This patch fixes an issue when bdev_nvme is configured with io_timeout. If the remote target becomes unresponsive (e.g. due to link down), an IO timeout occurs and bdev_nvme tries to get csts registers in timeout_cb. At the same time another thread can process adminq, so we may have 2 simultaneous adminq polls. If the admin qpair is disconnecting at that time (RDMA transport) we may destroy resources twice from different threads. We don't see a problem with the set_regs function, but it won't be redundant to lock the mutex in set_regs as well. Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com> Change-Id: I7ec3984d25d0249061005533d13b22315b44ddf2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13687 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-07-15 16:06:54 +00:00
Changpeng Liu	ac31590b37	nvme: make `spdk_nvme_ctrlr_free_io_qpair` multi-process safe In the multi-process case, a process may call `spdk_nvme_ctrlr_free_io_qpair` on a foreign I/O qpair (i.e. one that this process did not create) when that qpairs process exits unexpectedly. The variable `qpair->poll_group` isn't multi-process safe, we can't use it in `spdk_nvme_ctrlr_free_io_qpair` and related transport poll group APIs. Change-Id: Ic13a6a2c7d760477be5be5a56a45caa2b5518717 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13573 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-07-11 07:41:09 +00:00
Jim Harris	a6704e454c	nvme: put rdma req in nvme_rdma_req_complete All of the callers immediately put the req right after the nvme_rdma_req_complete call, so just move the put into that function instead. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ic370cf689850924e0c902a6071af8b3a7ed58c0b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13527 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	e415bf0033	nvme: add cmd/cpl printing for rdma errors This follows similar logic in the pcie and tcp completion paths, including omitting error messages when aborting aers by adding a print_on_error parameter to the completion function. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Id558d0af2cdd705dfb60abb842bd567a0949ccce Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13525 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	05dce1ee78	nvme: don't try to enable intel log pages on fabrics ctrlrs By default, the SPDK nvmf target reports vid==INTEL, which results in the SPDK nvme driver trying to enable Intel vendor-specific log page. Fix this by trying to enable those log pages only for PCIE transport controllers. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I78ebf365d4fa6295d1f610697266c3ead765988d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13524 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	988ce2ecaa	nvme: use assert for INTEL_VID check on log pages We can only get to this code path if the controller has vid==INTEL, so make that more clear by changing the check to an assert. Remove unit test that calls nvme_ctrlr_construct_intel_support_log_page_list() for a controller that is not VID==INTEL - this is no longer valid. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I3b58451bc95992bf641e7452f0ac4c2bac9fe31c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13523 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	4a24f581d6	nvme: add cmd/cpl printing for tcp errors This follows similar logic in the pcie completion path, including omitting error messages when aborting aers by adding a print_on_error parameter to the completion function. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I96df72280bb8fcbee3847fdc27f38e14a1bf3251 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13522 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	21d15cb043	nvme: cache values in nvme_tcp_req_complete nvme_tcp_req_complete_safe caches values on the request, so that we can free the request before completing it. This allows the recently completed req to get reused in full queue depth workloads, if the callback function submits a new I/O. So do this nvme_tcp_req_complete as well, to make all of the completion paths identical. The paths that were calling nvme_tcp_req_complete previously are all non-fast-path, so the extra overhead is not important. This allows us to call nvme_tcp_req_complete from nvme_tcp_req_complete_safe to reduce code duplication, so do that in this patch as well. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I876cea5ea20aba8ccc57d179e63546a463a87b35 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13521 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Jim Harris	d1179a5801	nvme: put req in nvme_tcp_req_complete All callers of nvme_tcp_req_complete call nvme_tcp_req_put immediately afterwards, so move this call into nvme_tcp_req_complete. This will help enable some improvements in later patches. Note that nvme_tcp_req_complete_safe has this same functionality open coded right now, but that will get changed in the next patch. It calls nvme_tcp_req_put immediately after the TAILQ_REMOVE, so do that in nvme_tcp_req_complete as well. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I368122bc49a7f0772e3011e5427e3c43618380eb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13520 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-07-04 07:23:13 +00:00
Shuhei Matsumoto	4be6d30438	nvme: Add ctrlr_abort_queued_aborts() into qpair_abort_all_queued_reqs() nvme_qpair_abort_all_queued_reqs() aborts error injections, queued requests, aborting queued requests, and outstanding requests. (Aborting outstanding requests depends on transports.) However, it did not abort queued aborts. Include nvme_ctrlr_abort_queued_aborts() into nvme_qpair_abort_all_queued_reqs() so that it does what the name of the function indicates. nvme_ctrlr_abort_queued_aborts() has been called in a few cases, but we do not care about the duplication. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I19102cc6603a72ce5c398a7947cb4d606b692991 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12849 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot	2022-06-30 07:51:23 +00:00
Ben Walker	8dd1cd2104	check_format: For C files only, fix return type breaks In SPDK, declarations have the return type on the same line. Definitions have the return type on a separate line. Astyle has an option for enforcing this. Unfortunately, it seems to have two bugs: 1) It doesn't work correctly at all on C++ files. 2) It often fails on functions that return enums, or long type names Deal with 1) by adjusting the check_format.sh script to only tell astyle to fix return type line breaks for C files and not C++. Deal with 2) by adding a few typedefs to work around the problem. Change-Id: Idf28281466cab8411ce252d5f02ab384166790c6 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13437 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>	2022-06-27 09:33:48 +00:00
Shuhei Matsumoto	ceaa4ee0f7	nvme: Increment ctrlr->outstanding_aborts when aborting req in ctrlr->queued_aborts We had not incremented ctrlr->outstanding_aborts when aborting a request in the ctrlr->queued_aborts, and ctrlr->outstanding_aborts became negative. Fix the bug in this patch. Additionally add assert to check if ctrlr->outstanding_aborts is not negative. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I58090286f070ba854bdea87f0f8ecb7810890338 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13452 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-06-24 07:22:36 +00:00
Sebastian Brzezinka	14ecc7787d	nvme: Complete pending register operations first Fully asynchronous ctrlr detach (`b6ecc3729`) introduce a register operation state machine that waits for operation to complete. When controller failed to initialize, `nvme_ctrlr_fail` set qpair state to `DISCONNECTED` immediately, causing qpair process completions to never complete register operations therefore prevent async detach exit. Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Change-Id: I205c5157b8ea7b4535f98ff4052414310e421446 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12858 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-06-20 10:00:17 +00:00
Richael Zhuang	4295661eb8	nvme_tcp: fix bug about qpair stuck in CONNECTING state When running perf test, sometimes after CONNECT req's resp was received and processed, the qpair still failed to change from state CONNECTING to CONNECTED. For when it goes to nvme_fabric_qpair_connect_poll -> nvme_wait_for_completion_robust_lock_timeout_poll to process the CONNECT req's resp, the req may have not been finished in sock_check_zcopy, although its resp has been received and processed, which means the tcp_req->ordering.bits.send_ack is still 0 and the status->done still is false. And after the req is completed in sock_check_zcopy, we need to poll this qpair again to make the state enter CONNECTED. And if icreq's resp received and processed before nvme_tcp_send_icreq_complete is called by _sock_check_zcopy, the qpair will be stuck in CONNECTING and it never proceed to send the CONNECT req. We also need to put it in pgroup->needs_poll to fix it. I can reproduce this bug with the following configuration. target: 16NVMe SSD, running on 20 cores; initiator: randread test using nvme perf with 32 cpu cores and zerocopy enabled. The error doesn't always occur. CONNECT failure is about 1 failure in ten with the following log. And icreq failure is less frequent with only target side's "keep alive timeout" log. Error reported in initiator side: Initialization complete. Launching workers. [2022-05-23 14:51:07.286794] nvme_qpair.c: 760:spdk_nvme_qpair_process_completions: ERROR: CQ transport error -6 (No such device or address) on qpair id 2 ERROR: unable to connect I/O qpair. 
ERROR: init_ns_worker_ctx() failed And target side shows: Disconnecting host from subsystem nqn.2016-06.io.spdk:cnode2 due to keep alive timeout Change-Id: Id72c2ffd615ab73c5fc67d36c3ff8b730cebcef7 Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12975 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-06-14 09:18:04 +00:00
Jim Harris	488570ebd4	Replace most BSD 3-clause license text with SPDX identifier. Many open source projects have moved to using SPDX identifiers to specify license information, reducing the amount of boilerplate code in every source file. This patch replaces the bulk of SPDK .c, .cpp and Makefiles with the BSD-3-Clause identifier. Almost all of these files share the exact same license text, and this patch only modifies the files that contain the most common license text. There can be slight variations because the third clause contains company names - most say "Intel Corporation", but there are instances for Nvidia, Samsung, Eideticom and even "the copyright holder". Used a bash script to automate replacement of the license text with SPDX identifier which is checked into scripts/spdx.sh. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iaa88ab5e92ea471691dc298cfe41ebfb5d169780 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12904 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: <qun.wan@intel.com>	2022-06-09 07:35:12 +00:00
Heinrich Schuchardt	72b5626d33	nvme/pcie: memory barrier for RISC-V Play it safe and add the same memory barrier in nvme_pcie_qpair_process_completions() as for ppc64. Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Change-Id: I7079b4769d30106387ef4549495a72b7fea6a77a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12879 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-06-06 07:34:27 +00:00
MengjinWu	bb33310aa0	nvmf: remove XOR in nvme_tcp_pdu_calc_data_digest Prepare for the later patch, and make the later patch code clean Signed-off-by: MengjinWu <mengjin.wu@intel.com> Change-Id: I12b175c86a5245f38dc76fe2d3918ec4b30a475a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12830 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>	2022-06-02 08:16:38 +00:00
Konrad Sztyber	1f3bd08fa0	nvme/tcp: check tcp_req for NULL in pdu_payload_handle For a C2HTermReq PDU, there's no associated tcp_req, so we need to check it for NULL before dereferencing it. Also, while here, moved some of the assignments to the declarations to reduce the number of boilerplate lines. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Iac05ef0ba605e2f40d0026ad1b131c28d29f7314 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12845 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-06-01 08:56:58 +00:00
Jim Harris	64df311eba	nvme: add KEYED_DATA_BLOCK to sgl_types This SGL type was missed in the original commit that added the pretty printing. Fixes: `4d9ab1e9a1` ("nvme: pretty print dptr") Reported-by: Ramanjaneya Burugula <burugula@gmail.com> Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ibc655db4e65009071f39f55f691c94a094cea0bc Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12705 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-05-25 07:43:03 +00:00
Or Gerlitz	9b5dabff7f	nvme/rdma: Always use spdk allocation scheme Use the conventional huge-pages based spdk allocation scheme for the initiator data-structures unconditionally. Change-Id: I5baee7614e3ac9b5497b3d771dfddfbaa7fdf65b Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12687 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-05-25 07:42:47 +00:00
Shuhei Matsumoto	51e897c42e	nvme: Abort queued requests even if they are children of a large I/O An iterator function nvme_request_add_abort() covers not only a small I/O request but also children of a large I/O. However, nvme_qpair_abort_queued_reqs_with_cbarg() did not check the latter. Check if cmd_cb_arg matches not only req->cb_arg but also req->parent_cb_arg. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I015e29b0a8f58920b9a13081330a94f9dd976a45 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12557 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-05-20 09:19:07 +00:00
Shuhei Matsumoto	09c7c76876	nvme: Set I/O qpairs to failed only if reset is synchronous For PCIe transport, we need to stop any activity of the controller before deleting I/O qpair resource in a controller reset sequence. However, we set I/O qpairs to failed before disabling a controller. In the NVMe bdev module, this caused disconnected qpair callback to delete I/O qpairs before disabling the controller. Hence, change the code slightly to set I/O qpairs to failed only if reset is synchronous to keep backward compatibility. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ica71aad0a1dabce45616dfdfff5f11b07131bbd1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12736 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-05-20 09:17:28 +00:00
Shuhei Matsumoto	64454afb7c	nvme: disconnect() sets and reconnect_async() clears prepare_for_reset The following patches swap the ordering of destroying I/O qpairs and disconnecting a controller for PCIe transport. prepare_for_reset is a flag for PCIe transport. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I3009de9fea089fc93ecf87adba42e85c9a77e715 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12582 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-05-19 08:23:57 +00:00
Shuhei Matsumoto	736b9da034	nvme: Do Controller Level Reset when disconnecting adminq for PCIe As described in the previous patches, we need to delete all I/O SQ/CQs before aborting trackers when disconnecting a controller. The following patches reorder the operations. This patch changes adminq disconnection to initiate a Controller Level Reset and adminq completion processes it if ctrlr->is_disconnecting is true. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I64f06bae2ce8a9127124029fd042db0028198e3c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12560 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>	2022-05-19 08:23:57 +00:00
Ben Walker	813756e75e	nvme: Do not abort transport commands when disconnecting a qpair Make this a transport-level decision instead. TCP and RDMA do want to abort, but PCIe cannot because these commands may still be receiving DMA operations from the device. Change-Id: I305acddc3819c903eb3217e8f710d4216d0b3931 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11509 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>	2022-05-19 08:23:57 +00:00
Shuhei Matsumoto	bdc9fa832d	nvme: Add helper functions to do a Controller Level Reset (Set CC.EN to 0) Previously, we did not do any Controller Level Reset when disconnecting the admin qpair. However, for PCIe transport, we need to stop any activity of the controller, i.e., delete all I/O SQ and CQs before nvme_transport_ctrlr_disconnect_qpair_done() calls nvme_transport_qpair_abort_reqs() (i.e., nvme_pcie_qpair_abort_trackers()). Otherwise, some corruption may occur because completed I/Os may still be in progress on the NVMe device. Not to change any public API, nvme_pcie_ctrlr_disconnect_qpair() is a convenient place to initiate a Controller Level Reset because it is called from spdk_nvme_ctrlr_disconnect(). Then nvme_pcie_qpair_process_completions() can process it until completion. However, necessary functions are not accessible from PCIe transport. This patch adds two helper functions and guards us from some undesirable behaviors because it was not assumed that nvme_ctrlr_process_init() is called from the completion context and ends in the middle of transition. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I3d986e94ba71b83beeff7e75cf92033b5fa6f075 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12559 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-05-19 08:23:57 +00:00
Alexey Marchuk	622ceb7f07	nvme/rdma: Use rdma qpair as cm_id context It simplifies code and removes cast of nvme_qpair to rdma_qpair Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I363246cf9d8c9cbafd48b26facdb5cc37fdd8e67 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12701 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-05-18 00:34:29 +00:00
Alexey Marchuk	1003e28623	nvme/rdma: Fix qpair destroy/disconnect race When a qpair is attached to a poll group, the disconnect process is async: we wait for the DISCONNECTED event from rdmacm to destroy rdma resources. However, the user (nvme_perf) can destroy the qpair immediately, so memory allocated for the qpair is freed but rdma resources are still allocated. That means that we may receive an rdmacm event (DISCONNECTED) for the destroyed qpair, which leads to use-after-free. To fix this problem, add a check for internal qpair state when the qpair is destroyed; if disconnect is not finished, then we forcefully destroy rdma resources. Fixes issue #2515 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reported-by: Or Gerlitz <ogerlitz@nvidia.com> Change-Id: I7bfa53c9f6fe6ed787323a8941f1f2db17ea0c20 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12700 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-05-18 00:34:29 +00:00
Alexey Marchuk	007fb1d3cb	nvme: Fix keyed/unkeyed SGL nvme cmd dump Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I0a08518b5c30455a17158aa440715515d0c066fc Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12133 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-05-17 20:11:43 +00:00
Shuhei Matsumoto	5e5423de93	nvme: Add DISABLED to ctrlr's state to show completion of Controller Level Reset In the following patches, nvme_ctrlr_process_init() will be used to disable the controller when disconnecting the admin qpair for PCIe transport. In this case, we will have to exit nvme_ctrlr_process_init() after CSTS.RDY is 0. However, spdk_nvme_ctrlr_reset() and spdk_nvme_ctrlr_reconnect_poll_async() have to continue nvme_ctrlr_process_init() until the controller becomes ready. To differentiate stop and continue clearly, add a new state NVME_CTRLR_STATE_DISABLED to enum nvme_ctrlr_state. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ic0a5fb7114d4eeb1cefec28bc404184768fb0a96 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12613 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-05-12 07:28:02 +00:00
Changpeng Liu	4e241cba01	nvme/quirks: don't use SGL for Huawei SSDs We see reports that Huawei SSDs can't handle hardware SGL properly, it requires additional alignment, so add a quirk here to force Huawei SSDs use PRP instead. Fix #2489. Change-Id: I20a57e754bc6ff8666d681191994818f2192decc Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12405 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: wanghailiang <hailiangx.e.wang@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Dong Yi <dongx.yi@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-05-02 20:00:35 +00:00
Alex Michon	f89cf818c0	nvme/pcie: Fix doorbell delay with fuse operations When sending the first part of a fuse command, we set the first_fused_submitted flag so that we don't ring the doorbell immediately. When the second part is sent, we ring the doorbell for both commands. However, this doesn't work well when we use the option to delay ringing the doorbell. We send both parts, then later when we try to ring the doorbell, we don't because of the first_fused_submitted flag from the first command. Replace this mechanism by keeping track of the last submitted fuse. Change-Id: Ia4ac9b3ce9c319ee4c7e42f86eadda93dac85fca Signed-off-by: Alex Michon <amichon@kalrayinc.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12182 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>	2022-04-27 07:36:20 +00:00
Alexey Marchuk	b0f4249c59	nvme/rdma: Add async set/get registers Now controller initialization with RDMA transport is fully async Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I26e857740d3137d0b0e987facc81fc5f6ef81f2b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10756 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-04-22 09:44:57 +00:00
Shuhei Matsumoto	dbe7e74cee	nvme: Change nvme_qpair_abort_queued_reqs() to set SC_ABORTED_SQ_DELETION Transport specific qpair_abort_reqs() set SC to SC_ABORTED_SQ_DELETION. However, nvme_qpair_abort_queued_reqs() set SC to SC_ABORTED_BY_REQUEST even if its call is not requested by the upper layer. Change nvme_qpair_abort_queued_reqs() to set SC to SC_ABORTED_SQ_DELETION for consistency. nvme_qpair_abort_queued_reqs() is used to abort queued requests that were sent while adminq was connecting. SC_ABORTED_SQ_DELETION will not be so bad even for the case. This change is required for the NVMe bdev module to be resilient for I/O error. The NVMe bdev module does not retry I/O if SC is SC_ABORTED_BY_REQUEST. SC is set to SC_INTERNAL_DEVICE_ERROR if a request is failed to submit to qpair by a generic qpair layer. We can change it to SC_ABORTED_SQ_DELETION as well but we keep this for now. SC_INTERNAL_DEVICE_ERROR is also retriable for the NVMe bdev module. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I7d8d5e97b222fe9275afc4fed024c1654c9579a2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12121 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-22 09:44:57 +00:00
zhangduan	31db7b139b	nvme_tcp: set transport_ack_timeout to ack_timeout The value of ack_timeout is calculated according to the formula 2^(transport_ack_timeout) msec. Signed-off-by: zhangduan <zhangd28@chinatelecom.cn> Change-Id: I5a938635d70693ddd405fa5907555bb745b4df0f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12215 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-20 08:21:42 +00:00
Konrad Sztyber	aa21240574	nvme/pcie: increase min admin queue size to 256 Now that IO qpairs can be created asynchronously, we need to make sure that all the create IO CQ/SQ commands can be executed simultaneously. It is pretty common to create multiple IO qpairs at the same time, e.g. adding an NVMe bdev to an nvmf subsystem will create an IO qpair on each poll group. In that case, if the number of cores exceeds the size of the admin queue (actually it can be even lower due to outstanding AERs), we might run out of nvme_requests on the admin queue. The chosen minimum value for the admin queue size, 256, should be enough to cover most cases. Fixes #2465 Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I55c59aef64f3fdb33f7b4824d3e9beb403602633 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12270 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-19 08:18:34 +00:00
Shuhei Matsumoto	2c13441ba8	nvme_rdma: Destroy qpair after qpair is actually disconnected The RDMA transport can disconnect qpair asynchronously now. Previously, we tried to release the resource of the qpair after it was disconnected. However it did not work because it was done when deleting the qpair. The admin qpair was not deleted in a ctrlr reset sequence. This patch tries to satisfy the same aim again but in a different way. Previously, we released the resource of the qpair before starting the actual disconnection process. This patch releases the resource of the qpair after the qpair is actually disconnected. The related patches are: `b9518a5540` `eb09178a59` Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Id6a814895a35b1589b781a91744ef872b42aaa69 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11783 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	4b73223542	nvme_rdma: Wait until lingering qpair becomes quiet before completing disconnection The code to handle the lingering qpair when deleting it was really complicated. The RDMA transport can connect or disconnect qpair asynchronously. Then we can include the code to handle the lingering qpair into the code to disconnect qpair now. If the disconnected qpair is still busy, defer completion of the disconnection until qpair becomes idle. If poll group is not used, we can complete disconnection immediately because cq is already destroyed. The related data and unit test cases are not necessary anymore. So delete them in this patch. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ic8f81143fcad0714ac9b7db862313aa8094eeefb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11778 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	20cf90801e	nvme_rdma: Handle stale connection asynchronously Include delayed disconnect/connect retries with finite times into the state machine of asynchronous qpair connection. We do not need to call back to the common transport layer but we need to do the following, clear rqpair->cq before starting disconnection if qpair uses poll group, and clear qpair->transport_failure_reason after disconnected. Additionally locate the new state STALE_CONN before INITIALIZING because cq is not ready to use for admin qpair when the state is STALE_CONN. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ibc779a2b772be9506ffd8226d5f64d6d12102ff2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11690 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	77c4657140	nvme_rdma: Factor out destroying rdma qpair operation Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I18e166a726cca69f13e7c5818eba57f478726286 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11689 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	aa36c18196	nvme_rdma: Pass callback to ctrlr_disconnect_qpair() via a parameter Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I06cbb9739286d1928ad9fc07de3715a449914d75 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11688 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	75d38a301d	nvme: poll_group_process_completions() returns -ENXIO if any qpair failed TCP transport already does it but was not documented clearly. RDMA and PCIe transports follow it and document it clearly. Then we can check each qpair's state if spdk_nvme_poll_group_process_completions() returns -ENXIO before disconnected_qpair_cb() is called. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I2afe920cfd06c374251fccc1c205948fb498dd33 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11328 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	9717b0c3df	nvme_rdma: Connect and disconnect qpair asynchronously Add three states, INITIALIZING, EXITING, and EXITED to the rqpair state. Add async parameter to nvme_rdma_ctrlr_create_qpair() and set it to opts->async_mode for I/O qpair and true for admin qpair. Replace all nvme_rdma_process_event() calls by nvme_rdma_process_event_start() calls. nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to INITIALIZING when starting to process CM events. nvme_rdma_ctrlr_connect_qpair_poll() calls nvme_rdma_process_event_poll() with ctrlr->ctrlr_lock if qpair is not admin qpair. nvme_rdma_ctrlr_disconnect_qpair() returns if qpair->async is true or qpair->poll_group is not NULL before polling CM events, or polls CM events until completion otherwise. Add comments to clarify why we do like this. nvme_rdma_poll_group_process_completions() does not process submission for any qpair which is still connecting. Change-Id: Ie04c3408785124f2919eaaba7b2bd68f8da452c9 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11442 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Changpeng Liu	c47b7b0276	nvme/vfio-user: use API to setup BAR0 doorbells We can use lib/vfio-user API to setup BAR0 doorbells, existing implementation is redundant. Change-Id: Ib880d167c84c6b8482bf1a35559a34c939f6a02d Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12211 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: John Levon <levon@movementarian.org> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-12 07:24:22 +00:00
Tomasz Zawadzki	6301f8915d	lib/sock: provide a hint to picking optimal poll group The process of matching qpair to poll group is split into two distinct parts that occur on different threads. See spdk_nvmf_tgt_new_qpair(). This results in a race condition for TCP between spdk_sock_map_lookup() and spdk_sock_map_insert(), which are called in spdk_nvmf_get_optimal_poll_group() and spdk_nvmf_poll_group_add() respectively. Fixes #2113 This patch picks a hint from nvmf_tcp for next poll group, which is then passed down to spdk_sock_map_lookup(). When matching placement_id exists, but does not have a poll group assigned - the hint will be used. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I4abde2bc9c39225c9f5dd7c3654fa2639bb0a27f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10271 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-01 12:41:26 +00:00
Shuhei Matsumoto	0a61427ecc	nvme_rdma: Start qpair after resolving address and route when poll group is used Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I0b0f314c98368247582f2dfcaf69f78e24d715f9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11366 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	531c1b0f04	nvme_rdma: Make nvme_rdma_process_event() asynchronous Separate nvme_rdma_process_event() into nvme_rdma_process_event_start() and nvme_rdma_process_event_poll(). Use nvme_rdma_process_event_start() and nvme_rdma_process_event_poll() in nvme_rdma_process_event() to ensure compatibility. Change-Id: Idc960fab2540efec612dcf22f156acabd2e2874e Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10594 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	791ee7deb4	nvme_rdma: nvme_rdma_process_events() returns negated errno It will be convenient for the following patches to return negated errno directly. Change-Id: Ic80181b2ee449946dd60ad0c97a325fd48b92231 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10990 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	cf7f253302	nvme_rdma: Add callback to nvme_rdma_process_event() Change-Id: I66aa89dc54d5aaedbe2f06239cbf04aeeb2c739e Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11359 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	bcf0845727	nvme_rdma: Make CM event operations callback functions Change-Id: I9f2551a07187400dd9ef624348cd465e64557e1b Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11138 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	e5927c02e9	nvme_rdma: Remove cm_channel param from process_event() nvme_rdma_poll_events() gets the cm_channel pointer itself. Before calling nvme_rdma_process_event(), we check that the rctrlr is valid. Hence we do not have to pass the cm_channel pointer to nvme_rdma_process_event() via a parameter. This simplifies the code and makes the following patches a little easier. Change-Id: I03f095833469c5b64592264d63a592106d49e13b Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11167 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:28:45 +00:00
Shuhei Matsumoto	29974dc882	nvme_rdma: Make fabric_qpair_connect() asynchronous Replace nvme_fabric_qpair_connect() by nvme_fabric_qpair_connect_async() and nvme_fabric_qpair_connect_poll(). The following is a detail. Define state of the nvme_rdma_qpair and each rqpair holds it. Initialize rqpair->state by INVALID at nvme_rdma_ctrlr_create_qpair(). _nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to FABRIC_CONNECT_SEND instead of calling nvme_fabric_qpair_connect(). Then the new function nvme_rdma_ctrlr_connect_qpair_poll() calls nvme_fabric_qpair_connect_async() at FABRIC_CONNECT_SEND and nvme_fabric_qpair_connect_poll() until it returns 0 at FABRIC_CONNECT_POLL. nvme_rdma_qpair_process_completions() or nvme_rdma_poll_group_process_completions() calls nvme_rdma_ctrlr_connect_qpair_poll() if qpair->state is CONNECTING. This pattern follows the TCP transport. Change-Id: I411f4fa8071cb5ea27581f3820eba9b02c731e4c Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11334 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-01 08:28:45 +00:00
Evgeniy Kochetov	a2d4ddb3b1	nvme: Prioritize user provided trstring for transport lookup This patch fixes the issue with custom nvme transport. It is possible to register custom nvme transport with arbitrary name but it is not usable because 'spdk_nvme_trid_populate_transport' call in probe function will always set trstring to 'CUSTOM' and transport lookup will fail. Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I83fd24dd8732ac0a21e22435e0acff20ab0e7521 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9557 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-31 10:31:20 +00:00
Jim Harris	7039639063	nvme: read full discovery page after reading header Some targets report they support log page offset, but then fail GET_LOG_PAGE commands that specify a non-zero offset, or report the wrong number of discovery entries when reading more than the discovery log page header but not the entire log page. So just revert to reading the entire discovery log page, after we've read the header to know how big the log page will be. This means that when we read the log page initially (without the individual entries), we need to save off the genctr, since it will get overwritten when we read the log page again. We can just store this in the discovery context, and compare it to the genctr that we read with the whole log page. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I34929253312fed9924db58904a051f3979283730 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11478 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-03-28 17:10:04 +00:00
Alexey Marchuk	94494579ce	nvme_rdma: Update reporting of RDMA responder resources responder_resources parameter of rdma cm tells remote side how many outstanding RDMA_READ or atomic operations local side can handle. Previously it was adjusted on queue depth but that was not correct since these parameters do not depend on each other. Even with qdepth=1 remote side may send several RDMA_READ operations per 1 IO request. With this change we report responder_resources equal to the maximum supported by RDMA device. Linux kernel nvme rdma driver reports this value in the same way. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I77e5c2ead6269da44c32a75a9188429f50d32ae4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11698 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-03-25 08:18:37 +00:00
Krzysztof Karas	a9a55513e5	nvme_ctrlr.c: Add error logs Add NVME_CTRLR_ERRLOGs to nvme_ctrlr_process_init(). The main goal is to help with debugging #2201 issue. Change-Id: I1ae6a9b30d6124dfe25eb7912402c37d476b0d4c Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10627 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-03-24 09:57:17 +00:00
Shuhei Matsumoto	6a89f75ec7	nvme_rdma: Remove handling stale connect The feature will be redesigned and restored in the following patches. For the NVMe bdev module, it can reconnect by itself without relying on the feature. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I2d9c0437f7ad8412ad8cf40d11e574723b735bee Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11440 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-03-21 10:49:11 +00:00
Shuhei Matsumoto	0c77cf90bf	nvme_rdma: Consolidate fail_qpair() calls into a single place For nvme_rdma_qpair_process_completions(), consolidate the operations to call nvme_rdma_fail_qpair() and return -ENXIO into a single place. Besides, shorten pointer references for nvme_rdma_qpair_process_completions() and nvme_rdma_poll_group_process_completions(). These will make the following patches a little easier. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Iaf72cfca0b5b3ba223d86e267da8069d43a15292 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11439 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-03-21 10:49:11 +00:00

1727 Commits