ivampiresp/Spdk - Spdk - Leaflow Developers

Author	SHA1	Message	Date
Ben Walker	ea0aaf5e85	nvme: Transports now set qpair state to NVME_QPAIR_CONNECTED inside .ctrlr_connect_qpair Previously this was assumed to be a synchronous process so the generic layer transport code updated the state after .ctrlr_connect_qpair returned. In preparation for making this support asynchronous mode, shift that responsibility down into the individual transports. While none of the transports actually do this asynchronously, insert a busy wait in nvme_transport_ctrlr_connect_qpair to wait for the qpair to exit from the CONNECTING state. None of the upper layer code can actually correctly handle a transport doing this asynchronously, so the busy wait will cover that. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I3c1a5c115264ffcb87e549765d891d796e0c81fe Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8909 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Monica Kenguva <monica.kenguva@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-07-28 07:04:00 +00:00
Ziye Yang	9ab0ffcce2	nvme_tcp: Add data pdu crc32c offloading in receiving side by Accel framework. For receiving the pdu, we add the crc32c offloading by Accel framework. Because the header over which the header digest is calculated is too small, we do not offload the header digest. Change-Id: If2c827a3a4e9d19f0b6d5aa8d89b0823925bd860 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7734 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2021-06-15 08:34:58 +00:00
Jim Harris	d6f6ffd274	nvme: add NVME_CTRLR_STATE_CONNECT_ADMINQ Connect the adminq as part of controller initialization instead of controller construction. We never actually 'connected' the adminq for PCIe or vfio-user transports, since its a nop. But their connect_qpair transport ops function is also a nop for the adminq, so it's fine to generically connect the adminq across all transports. Note that we cannot read registers (cc or csts) during controller initialization now until after the adminq has been connected since reading fabrics registers depends on a connected adminq. This gets special cased for now, but eventually reading cc and csts will need to be part of the state machine itself to make it asynchronous. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ia5566d7c549d78d24b94ea253df51e697da6237f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8079 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2021-06-01 07:43:12 +00:00
Ziye Yang	252430a053	nvme_tcp: Correctly handle the data digest err According to NVMe-oF 1.1 spec, it is not a fatal error. So according to Figure 126 in NVMe Base specification, we should return "Transient Transport Error". Change-Id: I601304ae2bb24508882fb1ec8c7e53ec587ab515 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7795 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2021-05-31 07:15:16 +00:00
Jim Harris	f5ba8a5ef5	nvme: add NVME_CTRLR_STATE_READ_CAP Read CAP (Capabilities) register as part of controller initialization instead of controller construction. For now, still read CAP in the pcie and vfio-user controller construction, since they need the drstd (doorbell stride) to construct the admin queue. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I000fe880f2ec0d6de1d565c883d7ea0ae1ac2c81 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8078 Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2021-05-28 08:14:06 +00:00
Jim Harris	df01076f70	nvme: add NVME_CTRLR_STATE_READ_VS Read VS (Version) register as part of controller initialization instead of controller construction. This prepares for upcoming changes to make controller attach fully asynchronous. Since reading fabrics registers is an asynchronous operation, it will be easier to read the VS register as part of controller initialization which operates as an asynchronous state machine. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I771386dbdf5902633e0d9f91b3b20be98f26fdc3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8076 Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2021-05-28 08:14:06 +00:00
Ziye Yang	2250abaeca	nvme/tcp: Rename send_pdu to pdu in tcp_req. Since we will reuse send_pdu for other purposes in the next patch. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Iee5166131b70a25bc13aaa847bfc9066231f31a9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8028 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Mellanox Build Bot	2021-05-26 09:20:15 +00:00
Ziye Yang	9776b89444	nvme/tcp: Fix the bug when doing offloading. For nvme/tcp connection, we use the synced manner if the qpair is not fully connected. Thus without the check, we will get stuck here. This patch fixes this issue. Change-Id: I72815bf5b4c0b31c4866bc1b9034b0e42b81d3f1 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8025 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Community-CI: Mellanox Build Bot	2021-05-26 09:20:15 +00:00
Ziye Yang	00b0dc6624	nvme/tcp: Do not offload header crc32c calculation if header digest is enabled. The header size is very small, which does not have too much value to offload such calculation by hardware. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Iaa82f39312df7eef3282325a33677ea41ab735ab Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8011 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-05-25 07:12:43 +00:00
Ziye Yang	bcbccf8bb5	lib/nvme_tcp: Refactor the code to generate _nvme_tcp_pdu_payload_handle The purpose is to prepare for implementing the async crc32 calculation in a future patch. Change-Id: Ia75f28154c49f08b527d48c63b9da79a6bdfede8 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7794 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-05-10 11:12:57 +00:00
Ziye Yang	82e4bfd346	nvme/tcp: Change the type of recv_pdu to pointer. This is prepared for using the hardware offloading engine in accel framework. And some fields in nvme_tcp_pdu needs to be DMA addressable. Change-Id: I75325e2cd7ff25fe938bea0ac9489a5027e3e0e9 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7770 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2021-05-07 11:41:24 +00:00
Ben Walker	42b47742de	nvme/tcp: Only flush socket if not part of poll group If the qpair is part of a poll group, the socket will get flushed as part of polling that group already. We only need to explicitly flush to handle the case where the qpair is not in a poll group. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ib2a510b6d26d1622950437d81e0a40f6b15d6b54 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7049 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot	2021-04-19 12:54:24 +00:00
Ben Walker	6b86039fd9	nvme/tcp: Ensure qpair is polled when it gets a writev_async completion There was a fix for this that went into the posix layer, but the underlying problem is the logic in the nvme/tcp transport. Attempt to fix that instead. Change-Id: I04dd850bb201641d441c8c1f88c7bb8ba1d09e58 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6751 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-19 12:54:24 +00:00
Ziye Yang	a620cd198f	nvme/tcp: Fix the zero copy enablement issue. Remove the polling group check. Because at this moment, the qpair is not added into a polling group. If we do not remove it, we will never enable zcopy feature for I/O qpair. And in sock implementation, we already fixed the zero copy handling if a socket is not in a polling group. See posix_sock_flush function. So we can fix this issue if we directly remove this check. Reported by: Aleksey Marchuk <alexeymar@mellanox.com> Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I969936c4b6c7f13cbfa4d6eb479010c53f3e384a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7056 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Mellanox Build Bot	2021-03-26 08:22:53 +00:00
Ziye Yang	f1f4f7d3bc	nvme/tcp: Use the async manner to send pdu when crc32c enabled. This patch refactor the pdu sending logic with the async manner, then if the group contains the accel engine, we can use it. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I2d669c0a3255d7a8898441e406906add2f3a3556 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6759 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-03-18 14:42:35 +00:00
Jim Harris	6156777bd4	nvme: assert if user tries to delete NULL tcp qpair It is invalid to try to delete a NULL qpair, so do not check for it in nvme_tcp_ctrlr_delete_io_qpair and return an error when NULL. Just change it to an assert instead. This makes it consistent with pcie and rdma. While here, add an assert in rdma as well. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ic2f76deecb21b78749dac85e33fb1fa0d14a1239 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6917 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: <dongx.yi@intel.com>	2021-03-18 14:41:44 +00:00
Changpeng Liu	2f579469b6	nvme/tcp: pass correct parameter to nvme_tcp_qpair_send_h2c_term_req_complete Previously the callback parameter for this function is NULL, this will cause segment fault, so pass the correct parameter here. Fix #1817 Change-Id: Ie768b7bf4a72862d16a44742ab3032803d0939a2 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6690 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: <dongx.yi@intel.com> Community-CI: Mellanox Build Bot	2021-03-05 08:33:18 +00:00
Ziye Yang	579a678a51	nvme/tcp: Move sock creation into nvme_tcp_ctrlr_create_qpair function. Purpose: To get the optimal group, we need the socket information. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I17b048a402fbf002307dd225f64b20a9f876d642 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3324 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Mellanox Build Bot	2021-02-25 10:26:08 +00:00
Ziye Yang	5206698e77	nvme/tcp: Add the implementation to get the optimal polling group Add the real support in nvme tcp transport. Change-Id: I2aa9b0284d6fe009925e67f602a055e787f77987 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5734 Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2021-02-04 08:30:54 +00:00
yidong0635	73487b15d8	nvme/nvme_tcp: Remove unnecessary returns. No need these returns at the end of void functions. So remove them. Signed-off-by: yidong0635 <dongx.yi@intel.com> Change-Id: I8889745f3ef82af513d03259a77a33c1f4f536cb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6015 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2021-01-22 08:16:17 +00:00
Alexey Marchuk	74542bae77	tcp: Rename readv_offset to rw_offset in nvme_tcp_pdu In the next patch this member will be used to track both read and write offsets Change-Id: I852125ff35257f9821ddf4a641d96afb29ebf0a0 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5924 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-01-21 09:55:53 +00:00
Alexey Marchuk	d296fcd8d9	nvme_tcp: Fix icreq/icresp handling with zcopy enabled. There is a problem with TCP zcopy enabled: 1. TCP initiator sends icreq and start polling a qpair. Polling of qpair actively calls nvme_tcp_read_pdu function 2. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_CH state, it reads 8 bytes of common PDU header. It determines the type of the PDU and finds the size of PDU_PSH header. 3. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH state. It should read 120 bytes of icresp PDU. The number of bytes which needs to be read is pdu->psh_len - pdu->psh_valid_bytes. qpair receives 120 bytes (the full PDU) and calls nvme_tcp_pdu_psh_handle -> nvme_tcp_icresp_handle. Here we check that we haven't yet received buffer reclaim notification and simply return from this function. At the same time we continue to poll the qpair. 4. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH state and tries to read data from a socket again. The number of bytes is pdu->psh_len - pdu->psh_valid_bytes. But now pdu->psh_len == pdu->psh_valid_bytes, so we call nvme_tcp_read_data with zero length. readv with zero length is commonly used to check errors on the socket, but in our case there are no errors and readv returns 0. 5. nvme_tcp_read_data treats zero as error and return NVME_TCP_CONNECTION_FATAL. Fix is to handle icresp, but leave qpair in INITIALIZING state until we receive acknowledgement for icreqsend_ack. We also move qpair to NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY recv_state so recv_pdu will be zerofied and qpair will try to read a common PDU header. But since it is not initialized yet, it won't receive anything from the target. Fixes issue #1633 Change-Id: I22cedefe530a8ac3b51495988ed6265d8fad15bb Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4969 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-10-30 09:05:35 +00:00
Alexey Marchuk	85fa43241b	nvmf/tcp: Support ICD for fabric/admin commands According to the SPEC we should support up to 8192 bytes of ICD for admin and fabric commands. Transport configuration parameter in_capsule_data_size is applied to all qpair types - admin and IO. Also we allocate resources when we get a connection request, so we don't know qpair type at this moment. Create a list of buffer in TCP poll group to support ICD up to 8192 bytes when configuration ICD is less than this value. The number of elements in this pool is hardcoded, it is planned to add a new configuration parameter later. Fixes issue #1569 Change-Id: I8589e3e2ea95d515f5503c6de7c1ee40aaf7b6da Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4754 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-10-27 08:40:12 +00:00
Alexey Marchuk	c72a16431a	nvme/tcp: Fix check of completion number during icresp handling The current approach checks "rc == 0". It worked before adding polling of poll group since a single qpair should return 1 completion for its own icreq while poll group can return several completions for all qpairs attached to this poll group (but, e.g., not for the qpair that is waiting for the completion). Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I60d05d8d6640e4e2bbaf3cd533d2f5a3637adea1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4768 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-10-21 20:45:13 +00:00
Tomasz Zawadzki	2172c432cf	log: simplify SPDK_LOG_REGISTER_COMPONENT This patch removes the string from register component. Removed are all instances in libs or hardcoded in apps. Starting with this patch literal passed to register, serves as name for the flag. All instances of SPDK_LOG_* were replaced with just * in lowercase. No actual name change for flags occur in this patch. Affected are SPDK_LOG_REGISTER_COMPONENT() and SPDK_*LOG() macros. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I002b232fde57ecf9c6777726b181fc0341f1bb17 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4495 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Mellanox Build Bot Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI	2020-10-14 08:00:35 +00:00
Alexey Marchuk	3a2148213f	nvme/tcp: Enable zcopy send when qpair is attached to poll group We can receive buffer reclaim notifications only when a qpair is attached to a poll group (so qpair's socket is connected to a socket poll group). The previous assumption that we enable zcopy only for IO qpairs was wrong since IO qpair might not use poll groups too (e.g. abort application). Fixes issue #1607 Change-Id: I67329d755d81da6606e65eddfeceb20839346d87 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4476 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-10-06 09:35:31 +00:00
Jim Harris	1deb6b9e6b	nvme: disable zero copy for client TCP sockets This seems to be causing some CI test failures. So disable zero copy in all cases for now for client sockets. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iceea09fe65fb90c7df15f500878a473f1ad4152c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4473 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-09-30 00:21:26 +00:00
Alexey Marchuk	86865969ff	sock/posix: Enable send zero copy for client sockets In NVME TCP initiator zero copy is enabled for IO qpairs and disabled for admin qpairs Change-Id: Ibdf521dccde9b95ec5dd15a5eb2baed8fcf8b88e Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4211 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	a910bc647d	nvme/tcp: Calculate requests completed asynchronously A preparation step for enabling zero copy in NVMEoF TCP initiator. With zero copy enabled, some requests might be completed out of "process_completions" call and we should take them into account to return the correct number of completions. Change-Id: Iba7973f6da815645bbfad0334619d46b66379226 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4209 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	2ceff364e5	nvme/tcp: Add synchronization for icreq send ack and icresp A preparation step for enabling zero copy in NVMEoF TCP initiator. We should wait for both events to occur before continue qpair initialization. Add a new bit to nvme_tcp_qpair::flags to track receiving of icreq ack since icreq is sent without tcp_req and there is no way to apply existing synchronization mechanisms. Move tcp qpair to initializing state if we receive icresp before icreq ack, this state will be checked during handling of icreq ack to continue qpair initialization Change-Id: I7f1ec710d49fb1322eb0a7f133190220b9f585ab Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4207 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	bc36528cda	nvme/tcp: Process poll_group when waiting for icresp A preparation step for enabling zero copy in NVMEoF TCP initiator. Since nvme_tcp_qpair_process_completions doesn't process poll group, we can't get async notification from kernel. 1. Add a qpair to poll group before we send icreq in order to be able to process buffer reclaim notification. 2. Check if qpair is connected to a poll group and call nvme_tcp_poll_group_process_completions instead of nvme_tcp_qpair_process_completions when waiting for icresp 3. Add processing of poll group to nvme_wait_for_completion_timeout and nvme_wait_for_completion_robust_lock since they are used to process FABRIC_CONNECT command Change-Id: I38d2d9496bca8d0cd72e44883df2df802e31a87d Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4208 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	a85579d8ef	nvme/tcp: Refactor header/data digest using bitfields Currently host/data digest are bool members of nvme_tcp_qpair structure. Change the type of this members to bitfield, reserved bits will be used in the next patches to support zero copy. Change-Id: If0659bf2445901e45fe0816af5f4fca5f494b154 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4206 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	7388e54de4	nvme/tcp: Complete request when all acks are received A preparation step for enabling zero copy in NVMEoF TCP initiator. Make sure that we complete a request (call user's callback) when all acknowledgements are received. For write operation - when we received send cmd ack, h2c ack and response from target. For read operation - when we received send cmd ack and c2h completed Since we can receive send ack after resp command, store nvme completion received in resp command in a new field added to tcp_req structure Change-Id: Id10d506a346738c7a641a979e1c8f86bc07465a4 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4204 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	2d4af0c174	nvme/tcp: Add synchronization for subsequent R2T requests. A preparation step for enabling zero copy in NVMEoF TCP initiator. Some NVMEoF TCP targets can send several R2T requests. We should check that we finished the previous H2C (received buffer reclaim notification from kernel) before sending the next H2C. This patch adds a new ordering bit indicating the described case and 2 fields to nvme_tcp_req to store the values from the last R2T request which will be applied when send ack is received. Change-Id: Iaa5ad846712ca18a8382680baa02413c18c4eb37 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4203 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-09-29 09:35:47 +00:00
Ben Walker	85ddcf6f8d	nvme/tcp: Clean up error message Fix some spelling and make the message clearer Change-Id: Ib291542a9735d6409db84f16c530e78567123f67 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4249 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2020-09-22 11:40:58 +00:00
Alexey Marchuk	e7c92b2426	nvme/tcp: Rename r2t_recv, set this flag when send_ack is 0 Rename ordering bit r2t_recv to h2c_send_waiting_ack, that is more descriptive name. Change-Id: I6d6143ff4c1cccc74e11226b7974706808092f9a Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4202 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-09-16 07:58:59 +00:00
Alexey Marchuk	dc88d13129	nvme/tcp: Move tcp_req ordering bits to union This makes it easier to zerofy ordering bits. Change-Id: If5696bfedfff1bf75e41c1449eac7fccb469e98b Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4201 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-09-16 07:58:59 +00:00
Ziye Yang	d4d2e317b5	nvme/tcp: Make the return value consistent. We should make nvme_tcp_ctrlr_connect_qpair always return negative value if this function fails. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I457e704e39d7a3acd298fd48e89e8ea51e2ed4ad Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3809 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-08-24 07:37:13 +00:00
Ziye Yang	0d3cc15a62	nvme/tcp: Correct the incapsule data usage According to page 35 in recent NVMe-oF spec (NVMe-over-Fabrics-1.1-2019.10.22-Ratified), ioccsz is used to restrict the incapsule size of I/O command, so do not restrict the NVMe-oF OPC command and also the admin command. We accidentally triggered a bug in kernel since we do not send the fabrics command with the incapsule and make the kernel coredump, though the kernel has bugs. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I869a2c8ab7b9c2ac1e5cc5b603920662591c2c64 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3837 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-08-20 09:26:06 +00:00
Ziye Yang	7bac9b06f1	nvme TCP: Make the control related pdu also allocated from the SPDK DMA memory Purpose: To make the pdu management consistent with other PDUs, then we can easily adapt our code into some hardware offloading solution. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Ic4a2847fd1b6cacda4cbaa52ff12c338f0394805 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3588 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-08-04 18:28:08 +00:00
Ziye Yang	1da44e0604	nvme_tcp: Move the default buffer factor size in nvme_tcp.h 1 Change the default factor from 4 to 8, which can be used to improve the performance. 2 Change the base buffer size in nvme_tcp.c, we should not use sizeof(struct spdk_nvme_tcp_cmd), it is 72 bytes. Normally, the initiator will receive C2h pdus and R2T Pdus by most, so set the size of using sizeof(struct spdk_nvme_tcp_c2h_data_hdr) is enough. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I384f4cb026cb8d83e75b639f7256ee8cb8ed1df1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3283 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: John Kariuki <John.K.Kariuki@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-07-22 12:21:07 +00:00
Alexey Marchuk	e137881e4e	nvme/tcp: Insert free req at the head of the list lifo model is more cache friendly Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Id937ab0c1b8b4ce121136144c7d6013bbe5eb963 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3282 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2020-07-13 08:40:31 +00:00
Ziye Yang	4c9aad0299	nvme/tcp: Report the free entries if sending_ack is set Previously we fixed the same issue in commit `cb98b2ab3e`, but we forgot to fix it here. We also need to update this path; otherwise we will still face the issue described in commit `cb98b2ab3e`. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I3660dbb6e97c92ea4cb347cfce4bf23c6dfe97ab Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3242 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-07-09 07:23:19 +00:00
Shuhei Matsumoto	f2bd635ecf	lib/nvme: Add qpair_iterate_requests() to iterate the common operation among transports To abort requests whose cb_arg matches, add child abort request greedily. Iterating all outstanding requests is unique for each transport but adding child abort is common among transports, and adding child abort is replaceable by other operations. Hence add qpair_iterate_requests() function to the function pointer table of transport, and pass the operation done in the iteration by a parameter of it. In each transport, the implementation of qpair_iterate_requests() uses TAILQ_FOREACH_SAFE() for potential future use cases. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Ic70d1bf2613fce2566eade26335ceed731f66a89 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2038 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>	2020-07-08 07:54:01 +00:00
Shuhei Matsumoto	ad69e739e1	nvme/tcp: Dequeue request from outstanding list before calling completion Each request has a callback context as cb_arg, and the callback to nvme_complete_request() for the completed request may reuse the context for a new request. On the other hand, the TCP transport dequeues tcp_req from tqpair->outstanding_reqs after calling nvme_complete_request() for the request pointed to by tcp_req. Hence, while nvme_complete_request() is executing, tqpair->outstanding_reqs may hold two requests that have the same callback context: the completed request and the newly submitted request. The upcoming patch will search all requests whose cb_arg matches to abort them. In the above case, the search may find two requests by mistake. To avoid such an error, move dequeueing tcp_req from tqpair->outstanding_reqs before calling nvme_complete_request(). One exception is the case where only nvme_tcp_req_put() is called. In that case, remove tcp_req from tqpair->outstanding_reqs before calling nvme_tcp_req_put(). Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I5f2ac292c60431ac1e27b8657db92b220860a0a8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2865 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-07-08 07:54:01 +00:00
Shuhei Matsumoto	e060285ea6	nvme/tcp: Change nvme_tcp_req_complete() to take tcp_req instead of req Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Ida0ee76015821d7db54b273d14383a245a18047b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3058 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-07-08 07:54:01 +00:00
Ziye Yang	449dee3563	nvme/tcp: Fix the sending conflict between cmd and h2c pdu. We may handle the received R2T data PDU before the cb function of send_cmd is called, due to out-of-order execution in the lower-layer uring socket interface. We need to fix this issue; otherwise the sending_pdu will be filled with the wrong data, which causes the issue shown in https://github.com/spdk/spdk/issues/1473 Fixes #1473 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Idac1ad65761695f3a655b85003861c1d1f4f3875 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3215 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-07-07 07:31:53 +00:00
Ziye Yang	cb98b2ab3e	nvme/tcp: Report that we have free entries if send_ack is set. Without this patch, we see the following error when compiled with (--with-uring --enable-debug) while testing a big I/O size, e.g., 256KB: "nvme_qpair.c: 474:nvme_qpair_resubmit_requests: ERROR: Unable to resubmit as many requests as we completed" The reason is that the nvme_tcp_request structure is not freed yet if send_ack is not set, so there are no free entries when other requests are submitted. This patch mitigates the issue. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I4c7616fbd3c82a883b4e9facd257a1a4f66e876d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3123 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-07-01 07:51:17 +00:00
Ziye Yang	ceb07eb8f4	nvme/tcp: Fix send_cb and recv pdu function contention when there is R2T. When using the uring socket, we see the following assert: nvme_tcp.c:1018: nvme_tcp_capsule_resp_hdr_handle: Assertion `tcp_req->state == NVME_TCP_REQ_ACTIVE' failed. Detailed info is in https://ci.spdk.io/results/autotest-per-patch/builds/19205/archive/nvmf-tcp-vg-autotest/build.log We face this issue because there is also an execution-ordering dependency between the send callback function and the PDU receive function. We did not find it when testing on physical machines, but found it on a Vagrant machine in CI. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I5eb241d564c0fc42ce0601b7c85999a2550f0de3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3046 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-06-29 09:18:13 +00:00
Ziye Yang	2ac8d3ff5e	nvme/tcp: Allocate send_pdu with DMA allocated memory. Purpose: It will be used to leverage the uring acceleration later when we use io_uring_prep_write_fixed. Because for using the Registered buffers feature in I/O uring, we currently can register all the huge memories. And if we allocate send_pdus in DMA memory, we can leverage such feature. Change-Id: Id0ba5f7fe43202027c0378e9cbe74d861aad21e5 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3002 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-06-24 08:22:17 +00:00


147 Commits