ivampiresp/Spdk - Spdk - Leaflow Developers

Author	SHA1	Message	Date
Jim Harris	d6f6ffd274	nvme: add NVME_CTRLR_STATE_CONNECT_ADMINQ Connect the adminq as part of controller initialization instead of controller construction. We never actually 'connected' the adminq for PCIe or vfio-user transports, since it's a nop. But their connect_qpair transport ops function is also a nop for the adminq, so it's fine to generically connect the adminq across all transports. Note that we cannot read registers (cc or csts) during controller initialization now until after the adminq has been connected since reading fabrics registers depends on a connected adminq. This gets special cased for now, but eventually reading cc and csts will need to be part of the state machine itself to make it asynchronous. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ia5566d7c549d78d24b94ea253df51e697da6237f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8079 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2021-06-01 07:43:12 +00:00
Ziye Yang	252430a053	nvme_tcp: Correctly handle the data digest err According to NVMe-oF 1.1 spec, it is not a fatal error. So according to Figure 126 in NVMe Base specification, we should return "Transient Transport Error". Change-Id: I601304ae2bb24508882fb1ec8c7e53ec587ab515 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7795 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2021-05-31 07:15:16 +00:00
Jim Harris	f5ba8a5ef5	nvme: add NVME_CTRLR_STATE_READ_CAP Read CAP (Capabilities) register as part of controller initialization instead of controller construction. For now, still read CAP in the pcie and vfio-user controller construction, since they need the dstrd (doorbell stride) to construct the admin queue. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I000fe880f2ec0d6de1d565c883d7ea0ae1ac2c81 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8078 Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2021-05-28 08:14:06 +00:00
Jim Harris	df01076f70	nvme: add NVME_CTRLR_STATE_READ_VS Read VS (Version) register as part of controller initialization instead of controller construction. This prepares for upcoming changes to make controller attach fully asynchronous. Since reading fabrics registers is an asynchronous operation, it will be easier to read the VS register as part of controller initialization which operates as an asynchronous state machine. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I771386dbdf5902633e0d9f91b3b20be98f26fdc3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8076 Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2021-05-28 08:14:06 +00:00
Ziye Yang	2250abaeca	nvme/tcp: Rename send_pdu to pdu in tcp_req. We will reuse send_pdu for another purpose in the next patch. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Iee5166131b70a25bc13aaa847bfc9066231f31a9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8028 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Mellanox Build Bot	2021-05-26 09:20:15 +00:00
Ziye Yang	9776b89444	nvme/tcp: Fix the bug when doing offloading. For an nvme/tcp connection, we use the synchronous path if the qpair is not fully connected. Without this check, we get stuck here. This patch fixes the issue. Change-Id: I72815bf5b4c0b31c4866bc1b9034b0e42b81d3f1 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8025 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Community-CI: Mellanox Build Bot	2021-05-26 09:20:15 +00:00
Ziye Yang	00b0dc6624	nvme/tcp: Do not offload header crc32c calculation if header digest is enabled. The header is very small, so there is little value in offloading this calculation to hardware. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Iaa82f39312df7eef3282325a33677ea41ab735ab Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8011 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-05-25 07:12:43 +00:00
Ziye Yang	bcbccf8bb5	lib/nvme_tcp: Refactor the code to generate _nvme_tcp_pdu_payload_handle The purpose is to prepare for implementing the async crc32 calculation in a future patch. Change-Id: Ia75f28154c49f08b527d48c63b9da79a6bdfede8 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7794 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-05-10 11:12:57 +00:00
Ziye Yang	82e4bfd346	nvme/tcp: Change the type of recv_pdu to pointer. This prepares for using the hardware offloading engine in the accel framework. Also, some fields in nvme_tcp_pdu need to be DMA addressable. Change-Id: I75325e2cd7ff25fe938bea0ac9489a5027e3e0e9 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7770 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2021-05-07 11:41:24 +00:00
Ben Walker	42b47742de	nvme/tcp: Only flush socket if not part of poll group If the qpair is part of a poll group, the socket will get flushed as part of polling that group already. We only need to explicitly flush to handle the case where the qpair is not in a poll group. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ib2a510b6d26d1622950437d81e0a40f6b15d6b54 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7049 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot	2021-04-19 12:54:24 +00:00
Ben Walker	6b86039fd9	nvme/tcp: Ensure qpair is polled when it gets a writev_async completion There was a fix for this that went into the posix layer, but the underlying problem is the logic in the nvme/tcp transport. Attempt to fix that instead. Change-Id: I04dd850bb201641d441c8c1f88c7bb8ba1d09e58 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6751 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-19 12:54:24 +00:00
Ziye Yang	a620cd198f	nvme/tcp: Fix the zero copy enablement issue. Remove the polling group check, because at this point the qpair has not yet been added to a polling group. If we do not remove it, we will never enable the zcopy feature for I/O qpairs. The sock implementation already handles zero copy for a socket that is not in a polling group (see the posix_sock_flush function), so removing this check fixes the issue. Reported by: Aleksey Marchuk <alexeymar@mellanox.com> Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I969936c4b6c7f13cbfa4d6eb479010c53f3e384a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7056 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot	2021-03-26 08:22:53 +00:00
Ziye Yang	f1f4f7d3bc	nvme/tcp: Use the async manner to send pdu when crc32c enabled. This patch refactors the PDU sending logic to be asynchronous so that, if the group contains the accel engine, we can use it. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I2d669c0a3255d7a8898441e406906add2f3a3556 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6759 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-03-18 14:42:35 +00:00
Jim Harris	6156777bd4	nvme: assert if user tries to delete NULL tcp qpair It is invalid to try to delete a NULL qpair, so do not check for it in nvme_tcp_ctrlr_delete_io_qpair and return an error when NULL. Just change it to an assert instead. This makes it consistent with pcie and rdma. While here, add an assert in rdma as well. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ic2f76deecb21b78749dac85e33fb1fa0d14a1239 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6917 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: <dongx.yi@intel.com>	2021-03-18 14:41:44 +00:00
Changpeng Liu	2f579469b6	nvme/tcp: pass correct parameter to nvme_tcp_qpair_send_h2c_term_req_complete Previously the callback parameter for this function was NULL, which causes a segmentation fault, so pass the correct parameter here. Fix #1817 Change-Id: Ie768b7bf4a72862d16a44742ab3032803d0939a2 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6690 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: <dongx.yi@intel.com> Community-CI: Mellanox Build Bot	2021-03-05 08:33:18 +00:00
Ziye Yang	579a678a51	nvme/tcp: Move sock creation into nvme_tcp_ctrlr_create_qpair function. Purpose: To get the optimal group, we need the socket information. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I17b048a402fbf002307dd225f64b20a9f876d642 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3324 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Mellanox Build Bot	2021-02-25 10:26:08 +00:00
Ziye Yang	5206698e77	nvme/tcp: Add the implementation to get the optimal polling group Add the real support in nvme tcp transport. Change-Id: I2aa9b0284d6fe009925e67f602a055e787f77987 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5734 Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2021-02-04 08:30:54 +00:00
yidong0635	73487b15d8	nvme/nvme_tcp: Remove unnecessary returns. There is no need for these returns at the end of void functions, so remove them. Signed-off-by: yidong0635 <dongx.yi@intel.com> Change-Id: I8889745f3ef82af513d03259a77a33c1f4f536cb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6015 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2021-01-22 08:16:17 +00:00
Alexey Marchuk	74542bae77	tcp: Rename readv_offset to rw_offset in nvme_tcp_pdu In the next patch this member will be used to track both read and write offsets Change-Id: I852125ff35257f9821ddf4a641d96afb29ebf0a0 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5924 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-01-21 09:55:53 +00:00
Alexey Marchuk	d296fcd8d9	nvme_tcp: Fix icreq/icresp handling with zcopy enabled. There is a problem with TCP zcopy enabled: 1. TCP initiator sends icreq and starts polling a qpair. Polling of the qpair actively calls the nvme_tcp_read_pdu function. 2. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_CH state, it reads 8 bytes of common PDU header. It determines the type of the PDU and finds the size of PDU_PSH header. 3. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH state. It should read 120 bytes of icresp PDU. The number of bytes which needs to be read is pdu->psh_len - pdu->psh_valid_bytes. qpair receives 120 bytes (the full PDU) and calls nvme_tcp_pdu_psh_handle -> nvme_tcp_icresp_handle. Here we check that we haven't yet received buffer reclaim notification and simply return from this function. At the same time we continue to poll the qpair. 4. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH state and tries to read data from a socket again. The number of bytes is pdu->psh_len - pdu->psh_valid_bytes. But now pdu->psh_len == pdu->psh_valid_bytes, so we call nvme_tcp_read_data with zero length. readv with zero length is commonly used to check errors on the socket, but in our case there are no errors and readv returns 0. 5. nvme_tcp_read_data treats zero as an error and returns NVME_TCP_CONNECTION_FATAL. Fix is to handle icresp, but leave qpair in INITIALIZING state until we receive acknowledgement for icreqsend_ack. We also move qpair to NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY recv_state so recv_pdu will be zeroed and qpair will try to read a common PDU header. But since it is not initialized yet, it won't receive anything from the target. Fixes issue #1633 Change-Id: I22cedefe530a8ac3b51495988ed6265d8fad15bb Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4969 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-10-30 09:05:35 +00:00
Alexey Marchuk	85fa43241b	nvmf/tcp: Support ICD for fabric/admin commands According to the SPEC we should support up to 8192 bytes of ICD for admin and fabric commands. Transport configuration parameter in_capsule_data_size is applied to all qpair types - admin and IO. Also we allocate resources when we get a connection request, so we don't know qpair type at this moment. Create a list of buffer in TCP poll group to support ICD up to 8192 bytes when configuration ICD is less than this value. The number of elements in this pool is hardcoded, it is planned to add a new configuration parameter later. Fixes issue #1569 Change-Id: I8589e3e2ea95d515f5503c6de7c1ee40aaf7b6da Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4754 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-10-27 08:40:12 +00:00
Alexey Marchuk	c72a16431a	nvme/tcp: Fix check of completion number during icresp handling The current approach checks "rc == 0". It worked before adding polling of poll group since a single qpair should return 1 completion for its own icreq, while a poll group can return several completions for all qpairs attached to this poll group (but, e.g., not for the qpair that is waiting for the completion). Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I60d05d8d6640e4e2bbaf3cd533d2f5a3637adea1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4768 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-10-21 20:45:13 +00:00
Tomasz Zawadzki	2172c432cf	log: simplify SPDK_LOG_REGISTER_COMPONENT This patch removes the string from register component. Removed are all instances in libs or hardcoded in apps. Starting with this patch literal passed to register, serves as name for the flag. All instances of SPDK_LOG_* were replaced with just * in lowercase. No actual name change for flags occur in this patch. Affected are SPDK_LOG_REGISTER_COMPONENT() and SPDK_*LOG() macros. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I002b232fde57ecf9c6777726b181fc0341f1bb17 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4495 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Mellanox Build Bot Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI	2020-10-14 08:00:35 +00:00
Alexey Marchuk	3a2148213f	nvme/tcp: Enable zcopy send when qpair is attached to poll group We can receive buffer reclaim notifications only when a qpair is attached to a poll group (so qpair's socket is connected to a socket poll group). The previous assumption that we enable zcopy only for IO qpairs was wrong since IO qpair might not use poll groups too (e.g. abort application). Fixes issue #1607 Change-Id: I67329d755d81da6606e65eddfeceb20839346d87 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4476 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-10-06 09:35:31 +00:00
Jim Harris	1deb6b9e6b	nvme: disable zero copy for client TCP sockets This seems to be causing some CI test failures. So disable zero copy in all cases for now for client sockets. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iceea09fe65fb90c7df15f500878a473f1ad4152c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4473 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-09-30 00:21:26 +00:00
Alexey Marchuk	86865969ff	sock/posix: Enable send zero copy for client sockets In NVME TCP initiator zero copy is enabled for IO qpairs and disabled for admin qpairs Change-Id: Ibdf521dccde9b95ec5dd15a5eb2baed8fcf8b88e Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4211 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	a910bc647d	nvme/tcp: Calculate requests completed asynchronously A preparation step for enabling zero copy in NVMEoF TCP initiator. With zero copy enabled, some requests might be completed outside of the "process_completions" call and we should take them into account to return the correct number of completions. Change-Id: Iba7973f6da815645bbfad0334619d46b66379226 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4209 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	2ceff364e5	nvme/tcp: Add synchronization for icreq send ack and icresp A preparation step for enabling zero copy in NVMEoF TCP initiator. We should wait for both events to occur before continuing qpair initialization. Add a new bit to nvme_tcp_qpair::flags to track receiving of icreq ack since icreq is sent without tcp_req and there is no way to apply existing synchronization mechanisms. Move tcp qpair to initializing state if we receive icresp before icreq ack; this state will be checked during handling of icreq ack to continue qpair initialization. Change-Id: I7f1ec710d49fb1322eb0a7f133190220b9f585ab Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4207 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	bc36528cda	nvme/tcp: Process poll_group when waiting for icresp A preparation step for enabling zero copy in NVMEoF TCP initiator. Since nvme_tcp_qpair_process_completions doesn't process poll group, we can't get async notification from the kernel. 1. Add a qpair to poll group before we send icreq in order to be able to process buffer reclaim notification. 2. Check if qpair is connected to a poll group and call nvme_tcp_poll_group_process_completions instead of nvme_tcp_qpair_process_completions when waiting for icresp 3. Add processing of poll group to nvme_wait_for_completion_timeout and nvme_wait_for_completion_robust_lock since they are used to process FABRIC_CONNECT command Change-Id: I38d2d9496bca8d0cd72e44883df2df802e31a87d Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4208 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	a85579d8ef	nvme/tcp: Refactor header/data digest using bitfields Currently header/data digest are bool members of nvme_tcp_qpair structure. Change the type of these members to bitfield; reserved bits will be used in the next patches to support zero copy. Change-Id: If0659bf2445901e45fe0816af5f4fca5f494b154 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4206 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	7388e54de4	nvme/tcp: Complete request when all acks are received A preparation step for enabling zero copy in NVMEoF TCP initiator. Make sure that we complete a request (call user's callback) when all acknowledgements are received. For write operation - when we received send cmd ack, h2c ack and response from target. For read operation - when we received send cmd ack and c2h completed Since we can receive send ack after resp command, store nvme completion received in resp command in a new field added to tcp_req structure Change-Id: Id10d506a346738c7a641a979e1c8f86bc07465a4 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4204 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	2d4af0c174	nvme/tcp: Add synchronization for subsequent R2T requests. A preparation step for enabling zero copy in NVMEoF TCP initiator. Some NVMEoF TCP targets can send several R2T requests. We should check that we finished the previous H2C (received buffer reclaim notification from kernel) before sending the next H2C. This patch adds a new ordering bit indicating the described case and 2 fields to nvme_tcp_req to store the values from the last R2T request which will be applied when send ack is received. Change-Id: Iaa5ad846712ca18a8382680baa02413c18c4eb37 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4203 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-09-29 09:35:47 +00:00
Ben Walker	85ddcf6f8d	nvme/tcp: Clean up error message Fix some spelling and make the message clearer Change-Id: Ib291542a9735d6409db84f16c530e78567123f67 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4249 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2020-09-22 11:40:58 +00:00
Alexey Marchuk	e7c92b2426	nvme/tcp: Rename r2t_recv, set this flag when send_ack is 0 Rename ordering bit r2t_recv to h2c_send_waiting_ack, that is more descriptive name. Change-Id: I6d6143ff4c1cccc74e11226b7974706808092f9a Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4202 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-09-16 07:58:59 +00:00
Alexey Marchuk	dc88d13129	nvme/tcp: Move tcp_req ordering bits to union This makes it easier to zerofy ordering bits. Change-Id: If5696bfedfff1bf75e41c1449eac7fccb469e98b Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4201 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-09-16 07:58:59 +00:00
Ziye Yang	d4d2e317b5	nvme/tcp: Make the return value consistent. We should make nvme_tcp_ctrlr_connect_qpair always return negative value if this function fails. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I457e704e39d7a3acd298fd48e89e8ea51e2ed4ad Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3809 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-08-24 07:37:13 +00:00
Ziye Yang	0d3cc15a62	nvme/tcp: Correct the in-capsule data usage According to page 35 of the recent NVMe-oF spec (NVMe-over-Fabrics-1.1-2019.10.22-Ratified), ioccsz is used to restrict the in-capsule data size of I/O commands, so do not restrict NVMe-oF fabrics commands or admin commands. We accidentally triggered a bug in the kernel, making it core dump, because we did not send the fabrics command with in-capsule data, although the kernel itself has bugs. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I869a2c8ab7b9c2ac1e5cc5b603920662591c2c64 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3837 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-08-20 09:26:06 +00:00
Ziye Yang	7bac9b06f1	nvme TCP: Make the control related pdu also allocated from the SPDK DMA memory Purpose: To make the pdu management consistent with other PDUs, then we can easily adapt our code into some hardware offloading solution. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Ic4a2847fd1b6cacda4cbaa52ff12c338f0394805 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3588 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-08-04 18:28:08 +00:00
Ziye Yang	1da44e0604	nvme_tcp: Move the default buffer factor size in nvme_tcp.h 1. Change the default factor from 4 to 8, which can improve performance. 2. Change the base buffer size in nvme_tcp.c: we should not use sizeof(struct spdk_nvme_tcp_cmd), which is 72 bytes. Normally the initiator mostly receives C2H and R2T PDUs, so sizing the buffer with sizeof(struct spdk_nvme_tcp_c2h_data_hdr) is enough. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I384f4cb026cb8d83e75b639f7256ee8cb8ed1df1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3283 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: John Kariuki <John.K.Kariuki@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-07-22 12:21:07 +00:00
Alexey Marchuk	e137881e4e	nvme/tcp: Insert free req at the head of the list lifo model is more cache friendly Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Id937ab0c1b8b4ce121136144c7d6013bbe5eb963 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3282 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2020-07-13 08:40:31 +00:00
Ziye Yang	4c9aad0299	nvme/tcp: Report the free entries if sending_ack is set Previously we fixed the same issue in commit `cb98b2ab3e`, but we forgot to fix it here. We need to update this location as well; otherwise we will still face the issue described in commit `cb98b2ab3e`. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I3660dbb6e97c92ea4cb347cfce4bf23c6dfe97ab Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3242 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-07-09 07:23:19 +00:00
Shuhei Matsumoto	f2bd635ecf	lib/nvme: Add qpair_iterate_requests() to iterate the common operation among transports To abort requests whose cb_arg matches, add child abort request greedily. Iterating all outstanding requests is unique for each transport but adding child abort is common among transports, and adding child abort is replaceable by other operations. Hence add qpair_iterate_requests() function to the function pointer table of transport, and pass the operation done in the iteration by a parameter of it. In each transport, the implementation of qpair_iterate_requests() uses TAILQ_FOREACH_SAFE() for potential future use cases. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Ic70d1bf2613fce2566eade26335ceed731f66a89 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2038 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>	2020-07-08 07:54:01 +00:00
Shuhei Matsumoto	ad69e739e1	nvme/tcp: Dequeue request from outstanding list before calling completion Each request has a callback context as cb_arg, and the callback invoked by nvme_complete_request() for the completed request may reuse the context for a new request. On the other hand, the TCP transport dequeues tcp_req from tqpair->outstanding_reqs after calling nvme_complete_request() for the request pointed to by tcp_req. Hence, while nvme_complete_request() is executing, tqpair->outstanding_reqs may hold two requests that have the same callback context: the completed request and the newly submitted request. The upcoming patch will search all requests whose cb_arg matches to abort them. In the above case, the search may find two requests by mistake. To avoid such an error, move dequeueing tcp_req from tqpair->outstanding_reqs before calling nvme_complete_request(). The one exception is the case where only nvme_tcp_req_put() is called. In that case, remove tcp_req from tqpair->outstanding_reqs before calling nvme_tcp_req_put(). Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I5f2ac292c60431ac1e27b8657db92b220860a0a8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2865 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-07-08 07:54:01 +00:00
Shuhei Matsumoto	e060285ea6	nvme/tcp: Change nvme_tcp_req_complete() to take tcp_req instead of req Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Ida0ee76015821d7db54b273d14383a245a18047b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3058 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-07-08 07:54:01 +00:00
Ziye Yang	449dee3563	nvme/tcp: Fix the sending conflict between cmd and h2c pdu. We may handle the received R2T data PDU before the send_cmd callback function is called, due to the out-of-order execution of the lower-layer uring socket interface. So we need to fix this issue; otherwise the sending_pdu will be filled with the wrong data, which causes the issue shown in https://github.com/spdk/spdk/issues/1473 Fixes #1473 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Idac1ad65761695f3a655b85003861c1d1f4f3875 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3215 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-07-07 07:31:53 +00:00
Ziye Yang	cb98b2ab3e	nvme/tcp: Report that we have free entries if send_ack is set. Without this patch, we hit the following error when compiled with (--with-uring --enable-debug) while testing a big I/O size such as 256KB: "nvme_qpair.c: 474:nvme_qpair_resubmit_requests: ERROR: Unable to resubmit as many requests as we completed" The reason is that the nvme_tcp_request structure is not freed yet if send_ack is not set, so there are no free entries when other requests are submitted again. This patch mitigates the issue. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I4c7616fbd3c82a883b4e9facd257a1a4f66e876d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3123 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-07-01 07:51:17 +00:00
Ziye Yang	ceb07eb8f4	nvme/tcp: Fix send_cb and recv pdu function contention when there is R2T. When using the uring socket, we see the following assert: nvme_tcp.c:1018: nvme_tcp_capsule_resp_hdr_handle: Assertion `tcp_req->state == NVME_TCP_REQ_ACTIVE' failed. Detailed info is in https://ci.spdk.io/results/autotest-per-patch/builds/19205/archive/nvmf-tcp-vg-autotest/build.log We hit this issue because there is also an execution-ordering problem between the "sending callback function" and the "pdu receiving function". We did not find it in physical machine testing, but found it on a vagrant machine in CI. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I5eb241d564c0fc42ce0601b7c85999a2550f0de3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3046 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-06-29 09:18:13 +00:00
Ziye Yang	2ac8d3ff5e	nvme/tcp: Allocate send_pdu with DMA allocated memory. Purpose: This will later let us leverage uring acceleration through io_uring_prep_write_fixed. To use the registered buffers feature of io_uring, we can currently register all of the huge page memory, so if we allocate send_pdus in DMA memory, we can leverage that feature. Change-Id: Id0ba5f7fe43202027c0378e9cbe74d861aad21e5 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3002 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-06-24 08:22:17 +00:00
Ziye Yang	3a1f5364d2	nvme/tcp: Fix nvme_tcp_req free conflict between cmd sending and incoming pdu receiving This patch tries to solve the out-of-order callback handling between cmd sending and incoming pdu handling. Normally, the cmd callback is called before the next PDU is received from the target if the application works synchronously. With the uring implementation, after sending the cmd to the target, we may have the following scenario: (1) First, receive the incoming pdu (e.g., CapsuleResp pdu, C2hdata pdu) due to the group polling read event. (2) Second, execute the callback function related to NVMe command sending. This means that the data from the initiator has really been sent to the target, and the target has received it and sent data back to the initiator, but the uring io_uring_cqe event has not been handled yet. Thus, if we execute (1) first, it cleans up the data structures related to the nvme_tcp_req, and the nvme_tcp_req is reused for another purpose. This then causes wrong behaviour like the following: "Rand Write test failed at QD=128 because fio hangs with the following error: nvme_tcp.c: 971:nvme_tcp_capsule_resp_hdr_handle: ERROR: no tcp_req is found with cid=66 for tqpair=0x7f23d8001710". This patch addresses the issue. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I5043aaa8adf5033d93dedac15f633f0850e0b9f2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2818 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-06-22 07:47:31 +00:00
Seth Howell	1a9c19a954	lib/nvme: remove spdk prefix from internal headers. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Iccde5860b83217163428ff504cba87a1cf209720 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2444 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2020-06-01 13:07:30 +00:00
Seth Howell	a3f72b2e5a	lib: net, notify, nvme, rocksdb remove spdk_ prefix. remove only the spdk_ prefix from static functions in the above libraries. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: I59ce032c3312fa73f30c133fd62e603c1eee2859 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2365 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-05-21 09:19:00 +00:00
Seth Howell	5d0718528d	nvme: implement epoll in the tcp transport. Change-Id: I6672361baca4969f23259c19b73ed9dbe2f436bd Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/885 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-04-24 19:38:00 +00:00
Seth Howell	fe5e1db68e	nvme/tcp: add naive implementation of poll_group api This implementation simply loops over qpairs calling process_completions. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Ia1f59c13444703e00c6b769d378874f48b9ef03e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/627 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-04-24 16:36:03 +00:00
Ziye Yang	94345a0a1a	nvme: Add the priority field in struct spdk_nvme_transport_id Purpose: To set the priority of the NVMe-oF connection, especially for TCP connections. For example, a transport ID previously looked like: trtype:TCP adrfam:IPv4 traddr:10.67.110.181 trsvcid:4420 With the change, it could be: trtype:TCP adrfam:IPv4 traddr:10.67.110.181 trsvcid:4420 priority:2 The priority is optional. We chose to change spdk_nvme_transport_id rather than spdk_nvme_ctrlr_opts, because the opts in spdk_nvme_ctrlr_opts apply to every nvme ctrlr, which lacks flexibility. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Change-Id: I1ba364c714a95f2dbeab2b3fcc832b0222b48a15 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1875 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-04-24 15:53:34 +00:00
Seth Howell	fc86e792e4	lib/nvme: switch poll group to use connect/disconnect semantics. This makes more sense within the context of the nvme driver and helps us avoid the awkward situation of getting a failed_qp callback on a qpair that simply hasn't been connected. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Ibac83c87c514ddcf7bd360af10fab462ae011112 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1734 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-04-22 19:06:26 +00:00
Seth Howell	6189c0ceb7	lib/nvme: abort all requests when disconnecting a qpair. By aborting all requests from every qpair when it is disconnected, we can completely avoid having to abort requests when we enable the qpair since nothing will be left enabled. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Iba3bd866405dd182b72285def0843c9809f6500e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1788 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-04-22 19:06:26 +00:00
Seth Howell	6338af34fc	lib/nvme: handle qpair state in transport layer. The state should be changed and checked by the transport layer. All transports should follow the same list of steps when disconnecting/reconnecting. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: If2647624345f2c70f78a20bba4e2206d2762f120 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1853 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-04-22 19:06:26 +00:00
Seth Howell	e1c9185005	lib/nvme: always call the transport disconnect function. The qpair states should be maintained at the generic level. Always going through the transport disconnect function is one step in that direction. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: I019b2b4a14fe192eff5293f918d633dde2c5400a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1851 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-04-22 19:06:26 +00:00
Seth Howell	9649ee09fa	lib/nvme: rename NVME_QPAIR_DISABLED This variable really indicates when a qpair is no longer connected. So NVME_QPAIR_DISCONNECTED is actually much more accurate. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Ia480d94f795bb0d8f5b4eff9f2857d6fe8ea1b34 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1850 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-04-22 19:06:26 +00:00
Ben Walker	0accbe8a37	nvme/tcp: Properly size the receive buffer Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I38e6e2f532597cb5e359879680edfc2172157c2f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1635 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-04-08 06:42:55 +00:00
Seth Howell	c998c6c69e	nvme: add API for qpair poll groups. This API will allow us to simplify the polling mechanism for qpairs on a single thread. It also will pave the way for doing transport specific aggregation of qpair polling to increase performance. The generic implementation is included. The transport specific calls have yet to be implemented. Change-Id: If07b4170b2be61e4690847c993ec3bde9560b0f0 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/579 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-04-07 08:38:40 +00:00
Seth Howell	3b99ee9929	lib/nvme: move connect directly into alloc_io_qpair. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Iadbada599764c7a2f4cdd4848a81a2fa39a89b46 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1120 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-03-17 08:23:50 +00:00
Ben Walker	ea65bf612d	Revert "nvme/tcp: Change hdr in nvme_tcp_pdu to pointer" This reverts commit `ea5ad0b286`. This code is moving from the nvmf target to the posix sock layer in this series. Change-Id: I333bdf325848e726ab82a9e6916e1bbdcd34009c Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/446 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-03-17 08:23:07 +00:00
Jacek Kalwas	daa8f941e4	nvme: extend ctrlr opts with admin queue size Align rdma and tcp to respect opts. Reduce default number of entries for admin queue so it becomes memory optimization. Linux driver by default creates admin queue with 32 depth, there is no good reason to enlarge that queue by default within SPDK NVMe driver. Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com> Change-Id: I97ceea8f350c52313021a63190fb0980f604c48e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1110 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2020-03-12 09:04:18 +00:00
Ziye Yang	9ba4bb22fe	lib/nvme_tcp: get the max_sges from the nvme ctrlr. Add the error print if there is still remaining_size in order to provide more meaningful debug info. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I7b15c9c9a630ea7ecb2d3191b73c9c99f7febf31 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1189 Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-03-11 02:25:12 +00:00
Seth Howell	f146bbe42d	lib/nvme: move common connect code into transport shim This gets rid of some duplicate lines of code. Change-Id: I24d4864921f6030672f3640b33f88f37a9e8175a Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1136 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-03-06 10:29:21 +00:00
Changpeng Liu	8d6f48fbf8	nvme: set transport string before the probe based on transport type Users may only set the transport type, but for the actual probe process, the trstring field is mandatory, so set the trstring based on transport type at first. Also remove unnecessary spdk_nvme_trid_populate_transport() call from each transport module. Fix #1228. Change-Id: I2378065945cf725df4b1997293a737c101969e69 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1001 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-02-26 09:26:09 +00:00
Alexey Marchuk	33204a4354	nvme/tcp: Align local variables types Some of variables have types which don't match their usage in code Change-Id: Ic2bd5fd6561c70143dde436ce9cddc0be4d3b0d0 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/521 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-02-17 10:06:30 +00:00
Alexey Marchuk	c3ba9127d0	nvme: Store NVMe-oF ioccsz and icdoff in ctrlr structure This avoids calculating ioccsz bytes on each request and removes access to "cold" ctrlr structures in the data path. Add a UT to check the validity of the calculation. Change-Id: I55ceff99eb924156155e69a20f587a4f92b83f0b Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/519 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-02-17 10:06:30 +00:00
Ben Walker	5ac51a3214	nvme: Make ctrlr_alloc_cmb_io_buffer optional for transports If the transport doesn't define one, don't call it. Change-Id: I8b83132f9fc0accbd4faa8fa0fc17a6bd11e543e Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/783 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-02-17 10:06:20 +00:00
Ben Walker	7dbe0e7c61	nvme: Remove nvme_transport_get_ctrlr_registers Wasn't used. Change-Id: I9812e24540f6d86f47d39091ea5fd9b7880b4413 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/735 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-02-12 12:07:16 +00:00
Ben Walker	f5bc2cbe86	nvme: No longer DECLARE_TRANSPORT(tcp) With the transport plugin system, this isn't used anymore. Change-Id: Ib81c73f262d44edb6c937ca0056ac027b1e1ca75 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/712 Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-02-12 12:07:16 +00:00
Alexey Marchuk	9727aa281f	tcp: refactor of header/data digest support check Some functions performed incorrect header/data digest support check, align it with NVMEoF spec. Use a table to check if PDU supports digest depending on its type. Change-Id: I6170dd19ace017f37fda0a923f604732799460b9 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483375 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-02-04 18:18:49 +00:00
Ben Walker	d0f4a51fdc	sock/posix: Block recursive calls to spdk_sock_flush Don't allow calling spdk_sock_flush while the socket is closed. Change-Id: I9020a49ab8906b0f343e3f48f8b96bd38308ab17 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483148 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-30 10:22:20 +00:00
Or Gerlitz	8e8a5f7c28	nvme/tcp: Use writev_async for sending data on sockets Amortize the writev syscall cost by using the writev_async socket API. This allows the socket layer to batch writes into one system call and also apply further optimizations such as posix's MSG_ZEROCOPY when they are available. As part of doing so we remove the error return in the socket layer writev_async implementation for sockets that don't have a poll group. Doing so eliminates the send queue processing. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Change-Id: I5432ae322afaff7b96c22269fc06b75f9ae60b81 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475420 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-22 13:53:09 +00:00
Ziye Yang	0bfaaace8f	sock: Add impl_name parameter in spdk_sock_listen/connect. Purpose: With this patch, (1) we can use different sock implementations together in one application, and (2) for one IP address managed by the kernel, we can use different methods to listen/connect, e.g., posix or uring. With this patch, we can designate a specific sock implementation if impl_name is not NULL and is valid. Otherwise, if impl_name is NULL, spdk_sock_listen/connect will try the sock implementations in the list in order. Without this patch, the app will always use the same type of sock implementation if the order is fixed. For example, if we have posix and uring together, the first one will always be uring. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Ic49563f5025085471d356798e522ff7ab748f586 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478140 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-16 09:11:32 +00:00
Seth Howell	738b9569f0	lib/nvme: remove extra function calls in tcp transport. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: I031cb5263598d09fb4956873c35d74ec3173fe63 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478875 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-16 09:10:38 +00:00
Seth Howell	f6cf92a31f	lib/nvme: make transport.c use fn tables. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: Ida58785784b4ed50393e1d43a9cd902de74a2eaa Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478873 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-16 09:10:38 +00:00
Seth Howell	e4eef6975c	lib/nvme: add function tables for all transports. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: I4e7af1c42a19346f4abcb17910a41f8104a2de1b Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478871 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-16 09:10:38 +00:00
Seth Howell	7ed0904b9b	lib/nvme: update trid struct with trstring. The trtype should be stored as both an enum and string. This is intended to help pave the way for pluggable NVMe-oF transports. Signed-off-by: Seth Howell <seth.howell@intel.com> Change-Id: I6af658d7a17c405e191ff401b80ab704c65497e7 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478744 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>	2020-01-16 09:10:38 +00:00
Ziye Yang	0e3dbd9a60	nvme/tcp: Add a timeout for construct connection. Purpose: To avoid the hang if there is no response from the target. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Ib68a9e4c1a28436af2b2ae65891de04067e3dc7d Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477121 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-12-19 11:06:23 +00:00
Seth Howell	24bca2eadd	nvme: add an enum for why a qpair disconnected Change-Id: I1a9517d9673051615942c873416505704740691a Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475805 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-12-09 13:55:41 +00:00
Seth Howell	3911922005	nvme: remove redundant transport_qp_is_failed checks The qpair state transport_qpair_is_failed is actually equivalent to NVME_QPAIR_IS_CONNECTED in the qpair state machine. There are a couple of places where we check against transport_qp_is_failed and then immediately check to see if we are in the connected state. If we are failed, or we are not in the connected state we return the same value to the calling function. Since the checks for transport_qpair_is_failed are not necessary, they can be removed. As a result, there is no need to keep track of it and it can be removed from the qpair structure. Change-Id: I4aef5d20eb267bfd6118e5d1d088df05574d9ffd Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475802 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-12-09 13:55:41 +00:00
Seth Howell	f6646fd9fa	nvme/tcp: detect cq errors. We should alert the upper layer when the qpair becomes unusable due to qpair errors. Change-Id: Icdee3b55a14441a60111f3bd7a44dceef93bbb09 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/474095 Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-11-15 18:27:27 +00:00
Ben Walker	83ffb2075e	nvme/tcp: Rename pdu->ctx to pdu->req This is always the request pointer, so rename it for clarity. Change-Id: Ifbda7db7787c65f0deb190a1e94f0676b2c0d99a Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470530 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>	2019-11-01 17:56:16 +00:00
Seth Howell	6035f73d7b	nvme_fabrics: move ctrlr_scan to common code. This function is identical between the two transports. Change-Id: If50b781259f224eb2c21de7da14564e6ce487650 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471778 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-10-22 21:14:22 +00:00
Seth Howell	fa9f668a8b	nvme: call the generic qpair_connect fn from all transports. This wasn't being done in the previous case which meant that I/O qpairs were not being moved to the connecting state when connecting for the first time. However, to prepare the way for a coherent state machine for nvme qpairs, we need to ensure that all qpairs go through the same states. Change-Id: I3cfe799a003acd926b24c107ab1461a96239c1bb Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471753 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-10-22 21:14:22 +00:00
Seth Howell	c2df8f6d84	nvme: unify ctrlr_scan function between rdma & tcp These functions are functionally equivalent. Just unify the way they wait for completions so that they are completely identical and we can merge them into a common function. Change-Id: Id5d734b6ae613b3ac828d89853d986cdadfb211a Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471936 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-10-22 21:14:22 +00:00
Seth Howell	2c68fef058	nvme: move queued request resubmit to generic layer We were already passing up from each transport the number of completions done during the transport specific call. So just use that return code and batch all of the submissions together at one time in the generic code. This change and subsequent moves of code from the transport layer to the generic layer are aimed at making reset handling at the generic NVMe layer simpler. Change-Id: I028aea86d76352363ffffe661deec2215bc9c450 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469757 Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-10-07 15:05:00 +00:00
Seth Howell	7630daa204	nvme: move queueing requests to the generic layer The tailq and the requests all belong to the generic layer, might as well put the queueing code there for better encapsulation. Change-Id: Id5f08f798121b50a21044cfc61856999c50ca227 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469758 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-09-30 21:17:47 +00:00
Ben Walker	647afdec44	Revert "nvme: small code cleanup for nvme_transport_ctrlr_scan" This reverts commit `6129e78d26`. When the initiator sends the discovery log page, if the log page exceeds the size of its data buffer, it will break it up into multiple log page commands with appropriate offsets. However, supporting offsets in log pages is an optional feature in NVMe and reported by the EDLP bit in the identify data. This commit changed the discovery process to no longer send an identify command prior to doing the discovery log page command, so the values in the identify data are always 0. If the discovery log page exceeds the size of the data buffer (4k), it will then fail to send the second log page with an offset because it believes the controller does not support the feature. Revert this change to fix it. An identify should always be sent as part of the discovery process. A test case is included in a follow-up patch that demonstrates the bug. Reported-by: Zahra Khatami <zahra.k.khatami@oracle.com> Reported-by: Akshay Shah <akshay.shah@oracle.com> Change-Id: Iefd512a7521e0fea90541b3eb547671cfa816ea6 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466819 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2019-09-09 21:52:07 +00:00
Ziye Yang	24eb7a84b0	nvme/tcp: fix the iov vector count. Since we use pdu->data_iovcnt to build the iov in nvme_tcp_build_iovs, an outgoing pdu has a maximal iov count of 2 + pdu->data_iovcnt, so we change the comparison. This makes sure that we can handle all the data owned by one pdu. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I2b9258cc5716d706c0fa38af609726c439708768 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467207 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>	2019-09-09 02:08:31 +00:00
Ziye Yang	ea5ad0b286	nvme/tcp: Change hdr in nvme_tcp_pdu to pointer Purpose: Prepare for further optimization on the target side when receiving pdu headers, where we expect to use zero copy. Change-Id: Iae7f9106844736d7160d39d0af1f5941084422ec Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465380 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>	2019-08-28 15:38:02 +00:00
Ziye Yang	73d9cef8c5	nvmf/tcp: add nvme_tcp_pdu_cal_psh function. Purpose: 1 Do not calculate the psh_len every time. 2 Small fix: for ch_valid_bytes and psh_valid_bytes, we do not need to use uint32_t. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I9b643da4b0ebabdfe50f30e9e0a738fe95beb159 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464253 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-08-07 01:46:54 +00:00
Tomasz Zawadzki	8df52a0f4a	lib/nvme_tcp: assert tcp_req->req before it is dereferenced The value of tcp_req->req was asserted after it was already dereferenced. This patch fixes that. Change-Id: I5eb01e88be09d41fb8e632c49d5a7ccf2315788f Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462508 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-07-24 18:09:33 +00:00
Andrey Kuzmin	fa6bfa80af	Nvme: check spdk_nvme_qpair_process_completions return value. nvme_tcp_qpair_process_completions returns -1 on socket I/O error. Unless the caller checks this return value (which spdk_nvme_wait_for_completion_robust_lock currently doesn't), on connection loss or any other fatal connection error spdk_nvme_wait_for_completion will never exit the completion check loop. Change-Id: I92bb349beb071db312e6c31b84db2a7b51ec486c Signed-off-by: Andrey Kuzmin <akuzmin@jetstreamsoft.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460657 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-07-09 00:27:54 +00:00
Shuhei Matsumoto	8b539eb553	nvme: Set appropriate value to max_xfer_size and max_sge SPDK NVMe-oF initiator driver could not transfer IO whose size is more than 128KiB even if NVMe-oF target allows IO whose size is more than 128KiB both for RDMA and TCP transport. Some use cases need to transfer IO larger than 128KiB. For RDMA transport, max_mr_size by ibv_query_device of RDMA devices indicates the maximum size of a single memory region and is independent from the actual I/O size, and is very likely to be larger than 2 MiB which is the granularity we currently register memory regions. Actually some RDMA NICs return UINT64_MAX for max_mr_size by ibv_query_device. Hence use UINT32_MAX and let the generic layer use the controller data to moderate this value. On the other hand, for TCP transport, there is no limit for maximum IO size and hence use UINT32_MAX. Besides, for RDMA transport, max_sges should be the minimum of max_sge got by querying RDMA devices and NVME_RDMA_MAX_SGL_DESCRIPTORS. Hence do this change together in this patch. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Idc813afd3e525bf5f370c0fcd2623f9c146a5528 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459218 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2019-07-05 06:35:41 +00:00
Shuhei Matsumoto	3ff1ff004e	nvme/tcp: Minor cleanups for SGL operations Using naming rules consistent with other related libraries is helpful to ensure the quality as verified by this patch series. This patch changes a few parts to use iov and iovcnt for SGL operations. Besides, the name of an array points to the head of the array and is constant, so copying the name of an array to another pointer is not necessary and can be removed. Change-Id: I2324f28126b3088098c1c767cf6c060f22c175c3 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455629 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com>	2019-07-04 08:58:40 +00:00
Shuhei Matsumoto	3184884f9d	nvmf/tcp: Properly handle multiple iovecs in processing H2C and C2H NVMe/TCP target had assumed the size of each iovec was io_unit_size. Using nvme_tcp_pdu_set_data_buf() instead removes the assumption and supports any alignment transparently. Hence this patch moves nvme_tcp_pdu_set_data_buf() to include/spdk_internal/nvme_tcp.h and replaces the current code to use it. Besides, this patch simplifies spdk_nvmf_tcp_calc_c2h_data_pdu_num() because the sum of iov_len of iovecs is equal to the variable length now. We cannot separate code movement (lib/nvme/nvme_tcp.c to include/spdk_internal/nvme_tcp.h) and code replacement (lib/nvmf/tcp.c) because the moved functions are static and the compiler gives a warning if they are not referenced in lib/nvmf/tcp.c. The next patch will add UT code. Change-Id: Iaece5639c6d9a41bd35ee4eb2b75220682dcecd1 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455625 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2019-07-04 08:58:40 +00:00
Shuhei Matsumoto	f62d5ccbe6	nvme/tcp: Properly handle multiple iovecs in nvme_tcp_pdu_set_data_buf nvme_tcp_pdu_set_data_buf() has been used to process C2H and H2C for NVMe/TCP initiator. In this case, NVMe/TCP cuts out the part of the input data buffer and transfers the part, and repeats these cut and transfers until the whole data buffer is transferred. NVMe/TCP uses two SGLs, and use one to parse from the offset datao to datao + datal and another to append from the offset 0 to datal. However, the current nvme_tcp_pdu_set_data_buf() had used data_length as not data length of this transfer but total length of the whole transfers by mistake. Recently DIF library updated to properly handle very similar cases, and so this patch takes DIF library as a reference and corrects the implementation. The next patch will add UT code to verify the bug will be fixed. The code size is pretty large and so UT code is separated. Change-Id: Ibeed4de182b8b8740566e874e2757280dc21f9e8 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455623 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com>	2019-07-01 08:28:20 +00:00

195 Commits