ivampiresp/Spdk - Spdk - Leaflow Developers

Author	SHA1	Message	Date
John Levon	48408177b5	lib/nvmf: add a comment on max admin queue size Signed-off-by: John Levon <john.levon@nutanix.com> Change-Id: I247e95843bd15a341a66f7ab07d9639bea403bd4 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12301 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-20 08:21:02 +00:00
Ben Walker	e22c933edb	idxd: Make many internal idxd_user functions take an idxd_user object This reduces a lot of casting. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ibc04f422858642d0e20c9b020bb6c5d1b70256fe Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11534 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2022-04-20 08:20:45 +00:00
Konrad Sztyber	72925e3db8	nvmf/tcp: delay completion for zcopy reqs w/ in-progress writes When a qpair is disconnected, any outstanding zero-copy requests are freed to release their buffers before the qpair gets destroyed. However, if there is a PDU being sent to the host as part of this request (e.g. C2HData/R2T), we need to wait until that write is done before freeing the request to avoid freeing it twice. Fixes #2445 Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I2a6e82f26a4f011dfd18c55c821e9039de7e584a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12255 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-19 11:15:45 +00:00
Konrad Sztyber	75169d0dec	nvmf/tcp: update pdu_in_use flag in write functions This makes the flag indicate whether there's an outstanding PDU write for a given request. Additionally, it reduces the number of places we need to update this flag. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Id7e587f84955b096c46bfbf88d4dd222214d4a6a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12254 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-19 11:15:45 +00:00
Konrad Sztyber	c676c0815d	nvmf/tcp: use different callbacks for sending mgmt/req PDUs This will make it possible to have some common handling in request's PDU write completion. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Icaff38da0e47dd93327e3d8f09edd9fdba8f532e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12253 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-19 11:15:45 +00:00
Konrad Sztyber	37dc93b9ef	nvmf/tcp: adjust assert for zcopy req complete When a request using zcopy is completed, it might have an unreleased zcopy_bdev_io attached in three cases: 1) the request was a read, 2) the request was a failed write, 3) the qpair is being disconnected. The last case was missing from the assertion. Fixes #2425 Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I5cbeaa198a1fd878c98caf148a0bc47060e35bca Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12263 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-19 11:14:56 +00:00
Konrad Sztyber	aa21240574	nvme/pcie: increase min admin queue size to 256 Now that IO qpairs can be created asynchronously, we need to make sure that all the create IO CQ/SQ commands can be executed simultaneously. It is pretty common to create multiple IO qpairs at the same time, e.g. adding an NVMe bdev to an nvmf subsystem will create an IO qpair on each poll group. In that case, if the number of cores exceeds the size of the admin queue (actually it can be even lower due to outstanding AERs), we might run out of nvme_requests on the admin queue. The chosen minimum value for the admin queue size, 256, should be enough to cover most cases. Fixes #2465 Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I55c59aef64f3fdb33f7b4824d3e9beb403602633 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12270 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-19 08:18:34 +00:00
Shuhei Matsumoto	2c13441ba8	nvme_rdma: Destroy qpair after qpair is actually disconnected The RDMA transport can disconnect qpair asynchronously now. Previously, we tried to release the resource of the qpair after it was disconnected. However, it did not work because it was done when deleting the qpair. The admin qpair was not deleted in a ctrlr reset sequence. This patch tries to satisfy the same aim again but in a different way. Previously, we released the resource of the qpair before starting the actual disconnection process. This patch releases the resource of the qpair after the qpair is actually disconnected. The related patches are: `b9518a5540` `eb09178a59` Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Id6a814895a35b1589b781a91744ef872b42aaa69 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11783 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	4b73223542	nvme_rdma: Wait until lingering qpair becomes quiet before completing disconnection The code to handle the lingering qpair when deleting it was really complicated. The RDMA transport can connect or disconnect qpair asynchronously. Then we can include the code to handle the lingering qpair into the code to disconnect qpair now. If the disconnected qpair is still busy, defer completion of the disconnection until qpair becomes idle. If poll group is not used, we can complete disconnection immediately because cq is already destroyed. The related data and unit test cases are not necessary anymore. So delete them in this patch. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ic8f81143fcad0714ac9b7db862313aa8094eeefb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11778 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	20cf90801e	nvme_rdma: Handle stale connection asynchronously Include delayed disconnect/connect retries with finite times into the state machine of asynchronous qpair connection. We do not need to call back to the common transport layer but we need to do the following: clear rqpair->cq before starting disconnection if qpair uses poll group, and clear qpair->transport_failure_reason after disconnected. Additionally locate the new state STALE_CONN before INITIALIZING because cq is not ready to use for admin qpair when the state is STALE_CONN. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ibc779a2b772be9506ffd8226d5f64d6d12102ff2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11690 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	77c4657140	nvme_rdma: Factor out destroying rdma qpair operation Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I18e166a726cca69f13e7c5818eba57f478726286 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11689 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	aa36c18196	nvme_rdma: Pass callback to ctrlr_disconnect_qpair() via a parameter Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I06cbb9739286d1928ad9fc07de3715a449914d75 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11688 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	75d38a301d	nvme: poll_group_process_completions() returns -ENXIO if any qpair failed TCP transport already does it but was not documented clearly. RDMA and PCIe transports follow it and document it clearly. Then we can check each qpair's state if spdk_nvme_poll_group_process_completions() returns -ENXIO before disconnected_qpair_cb() is called. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I2afe920cfd06c374251fccc1c205948fb498dd33 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11328 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Shuhei Matsumoto	9717b0c3df	nvme_rdma: Connect and disconnect qpair asynchronously Add three states, INITIALIZING, EXITING, and EXITED to the rqpair state. Add async parameter to nvme_rdma_ctrlr_create_qpair() and set it to opts->async_mode for I/O qpair and true for admin qpair. Replace all nvme_rdma_process_event() calls by nvme_rdma_process_event_start() calls. nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to INITIALIZING when starting to process CM events. nvme_rdma_ctrlr_connect_qpair_poll() calls nvme_rdma_process_event_poll() with ctrlr->ctrlr_lock if qpair is not admin qpair. nvme_rdma_ctrlr_disconnect_qpair() returns if qpair->async is true or qpair->poll_group is not NULL before polling CM events, or polls CM events until completion otherwise. Add comments to clarify why we do like this. nvme_rdma_poll_group_process_completions() does not process submission for any qpair which is still connecting. Change-Id: Ie04c3408785124f2919eaaba7b2bd68f8da452c9 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11442 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-18 18:35:29 +00:00
Jim Harris	ac0c53ae58	env_dpdk: do not set RTE_MEMPOOL_F_NO_IOVA_CONTIG This was added in patch `07526d85`, back in March 2018. This was before DPDK supported dynamic hugepage allocations. Presumably this flag was added to reduce the amount of memory lost due to mempool buffers that would otherwise span an IOVA boundary (mostly typical with IOMMU off and we are relying on physical addresses). Removing it simplifies any code in SPDK that uses mempool buffers for DMA operations, since it doesn't have to worry about splitting buffers that span an IOVA boundary - DPDK has already done it for us. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I49f6c1407fad02acae7e07c9dd00cb0449bd3554 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12277 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-15 08:25:54 +00:00
Tomasz Zawadzki	0368340581	lib/vhost: consolidate successful and invalid request path Both blk_request_finish() and invalid_blk_request() accomplished the same thing, with variation on handled statuses and debug logs. Consolidating those two into single function will help later on when replacing completion of request processing to single callback. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Iae7b93db01bfd98819b2bb8fad9e11afcdb3a459 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12196 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-15 07:49:32 +00:00
Tomasz Zawadzki	4f95fd7be6	lib/vhost_blk: get bdev io_channels via vhost_blk functions This patch adds vhost_blk_[get/put]_io_channel() to be used by virtio_blk transports. Functions related to vhost_user sessions were modified to use it. dummy_io_channel reference is managed at the vhost_blk layer and as such continues to use the spdk_[get/put]_io_channel() APIs. The description is updated to reflect it's not specific to vhost_user transport. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I6644198da83bfa0210c167e203d3875e96f1e7ea Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11101 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-15 07:49:32 +00:00
Tomasz Zawadzki	223f1f1446	lib/vhost: separate out vhost_user specific json config The vhost_user_config_json() will be replaced with callback to virtio_blk transport. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I6ea0ea38f505f0d354cd34ee5ab9cd3a858bd82e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9538 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-15 07:49:32 +00:00
Tomasz Zawadzki	6f89388ed3	lib/vhost: move vhost_user related fields from spdk_vhost_dev spdk_vhost_dev structure should only contain generic fields that are to be used by either vhost, vhost_blk or vhost_scsi layer. The vhost_user backend can hold its properties in spdk_vhost_user_dev, which is maintained within rte_vhost. Both structures contain references back to each other. The reference in spdk_vhost_dev is a void pointer to allow future transports to keep the reference to their own structures. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I68640c524426d885c20242146365ba242fa9df8e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11813 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-15 07:49:32 +00:00
Or Gerlitz	bfcfdb7903	nvmf/rdma: Use spdk allocation scheme for RDMA requests and receives In a similar manner for what we do for other per IO data-structures of cmds, cpls and bufs, use the conventional huge-pages based spdk allocation scheme for RDMA requests and receives. Change-Id: I4c2e86e928106e78c053f24915e2a9ce1a200c78 Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12273 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-15 07:48:23 +00:00
Or Gerlitz	5edb8edca7	nvmf/rdma: use LIFO practice for incoming queue To maximize cache locality, use lifo and not fifo when managing objects which are used per IO such as the RDMA receive elements queue. Change-Id: Id8917558acc1bec29943fcbae6afe6b072bde6ac Reported-by: Jim Harris <james.r.harris@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12272 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-15 07:48:23 +00:00
zhangduan	87cfed8442	sock: Add ack_timeout to spdk_sock_opts Due to the same reason as transport_ack_timeout for RDMA transport, TCP transport also needs ack timeout. This timeout, in msec, makes the TCP socket wait for an ack until it closes the connection. Signed-off-by: zhangduan <zhangd28@chinatelecom.cn> Change-Id: I81c0089ac0d4afe4afdd2f2c7e5bff1790f59199 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12214 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-14 08:34:29 +00:00
paul luse	37b68d7287	accel: cleanup by getting rid of capabilities enum In support of upcoming patches and to greatly simplify things, the capabilities enum which held bit positions for each opcode has been removed. Only the opcodes enum remains and thus only opcodes are used throughout. For the capabilities bitmap a helper function is added to convert from opcode to bit position. Right now it is used in the IO path but in upcoming patches that goes away and the conversion is only done at init time. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: Ic4ad15b9f24ad3675a7bba4831f4e81de9b7bc70 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11949 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-14 08:32:50 +00:00
Shuhei Matsumoto	0d29a988be	bdev: Use spdk_for_each_bdev() for bdev_get_io_stat and get_bdevs RPCs spdk_for_each_bdev() can be applied simply to rpc_bdev_get_bdevs() because the callback function rpc_dump_bdev_info() is synchronous. spdk_for_each_bdev() cannot be applied simply to rpc_bdev_get_iostat(). The factored-out callback function _bdev_get_device_stat() is asynchronous. Add desc pointer to struct spdk_bdev_io_stat and open before and close after executing spdk_bdev_get_device_stat(). Replace spdk_bdev_get_by_name() by spdk_bdev_open_ext() when a bdev name is specified. spdk_bdev_register() checks if the name of each bdev is set and each bdev is opened while collecting its stats now. Hence it is not possible that spdk_bdev_get_name() returns NULL. Simplify rpc_bdev_get_iostat_cb() based on this fact. Furthermore, we want to fail the RPC for all failures. The callback function to spdk_bdev_get_device_stat() is executed after stack unwinding if successful. Defer starting RPC response until it is ensured that all spdk_bdev_get_device_stat() calls will succeed or there is no bdev. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I7b036d6d707c49d19c8922a159b12b5b5ce7ca41 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12089 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-14 08:32:12 +00:00
Ziv Hirsch	e749fa9c27	nvmf: fix buffer overflow on admin commands When req->iovcnt is bigger than 1, `memset(req->data, 0, req->length)` is wrong. Signed-off-by: Ziv Hirsch <zivhirsch13@gmail.com> Change-Id: Ie53eba686b4c5889bbde3b3644d51acbef303b42 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12216 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-14 08:31:35 +00:00
Jim Harris	92f0be87a0	iscsi: use EVP APIs for md5 calculations OpenSSL 3.0 deprecated the MD5_xxx APIs, so switch the md5 code in the iscsi library to use the EVP APIs recommended by OpenSSL instead. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ic5e3cd6e30ebc8b027f0715434cc3be045f1b770 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12240 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-13 08:33:51 +00:00
Jim Harris	7778bc3a33	vhost: copy virtio_blk_outhdr to local struct Some SeaBIOS versions are not aligning virtio_blk_outhdr on 8-byte boundary, causing ubsan failures. To be safe, let's just make sure on our end that we only access a properly aligned structure by copying this small (16-byte) data structure to a local structure variable. Fixes issue #2452. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iacad72c3a1759fb8dc5ba411272a34d93ef2a6fc Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12238 Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-13 08:33:51 +00:00
Changpeng Liu	1bac8afd02	nvmf/vfio-user: unlink created files Only FDs are used for passing them to another process, we can unlink them after creation. Here we only unlink the files created in vfio-user, and there is still one file created via libvfio-user, it will be fixed via https://github.com/nutanix/libvfio-user/issues/660. Partly fix issue #2449. Change-Id: Ie27640e0cb85f44596e9d0ad5a2b67adf0419f5c Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12195 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: John Levon <levon@movementarian.org> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-12 13:59:57 +00:00
Changpeng Liu	c47b7b0276	nvme/vfio-user: use API to setup BAR0 doorbells We can use lib/vfio-user API to setup BAR0 doorbells, existing implementation is redundant. Change-Id: Ib880d167c84c6b8482bf1a35559a34c939f6a02d Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12211 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: John Levon <levon@movementarian.org> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-12 07:24:22 +00:00
Konrad Sztyber	91aee82d74	vmd: use config_bus_number when resetting root ports The config_bus_number is an offset within the config space reserved for the devices behind the VMD, while bus_number refers to the actual bus number assigned by VMD, which depends on the VMCAP and VMCONFIG registers. So, to access the mapped config space we have to use config_bus_number. We didn't do that when resetting root ports, which could lead to segfaults if these values were different, as we'd access unmapped memory. Fixes #2451 Change-Id: I4e7bbb81400462284014565099bec98f6171c8c9 Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12208 Reviewed-by: Tom Nabarro <tom.nabarro@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-11 07:44:23 +00:00
Tomasz Zawadzki	f9fccbae63	lib/vhost: separate out vhost_user backend callbacks Previously spdk_vhost_dev_backend held callbacks for vhost_blk and vhost_scsi functionality, along with ones that are called by the vhost_user backend. This patch separates out those callbacks into two structures: - spdk_vhost_dev_backend - to be implemented by vhost_blk and vhost_scsi - spdk_vhost_user_dev_backend - is only implemented by vhost_user backend, callbacks for session management specific to that transport Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I348090df5dddeb2b1945b082b85aec53d03c781b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11812 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-11 07:44:09 +00:00
Tomasz Zawadzki	c010a71f27	lib/vhost: make packed_ring_recovery per controller Previously g_packed_ring_recovery was set globally. Setting that during controller creation, would affect all previously created controllers. This is now set on per-controller basis and only enabled if packed ring feature is used. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Idcc7231471446c805154648ab835a6af78f6543c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12040 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-11 07:44:09 +00:00
sberbz	fd53562a87	vhost: parse JSON vhost_blk devices specific params Separate parsing of generic rpc vhost params from device specific ones; this solution allows creating various devices which share common parameters. Change-Id: I50b1a89a8260fb1394880a750591e95539995288 Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12026 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-11 07:44:09 +00:00
Jim Harris	5d4b553cd9	vmd: change DEBUGLOGs to INFOLOG None of the current DEBUGLOGs are in any kind of performance critical code path. Making them INFOLOG means that we can enable them even in release builds to get additional information if needed. (Note: planning to do this with other libraries and modules as well.) Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I61765fab843a06c36ac1979151589e8f57fea76e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12209 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-11 07:43:30 +00:00
John Levon	32e54c6b16	nvmf/vfio-user: refactor suppressed IRQ handling No functional change; this just makes the poll code a little easier to read. Signed-off-by: John Levon <john.levon@nutanix.com> Change-Id: If6d1dcd940ed5b461856b535b1bf01c4efa8612a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12076 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-05 11:48:45 +00:00
Shuhei Matsumoto	f41248ffde	bdev: Use spdk_bdev_open_ext() for some simple RPCs Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ie13f3153ea711f64f2099b2e4d37855b79977f82 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12148 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-05 11:48:31 +00:00
Jim Harris	183c348557	nvmf/rdma: issue fused cmds consecutively to tgt layer RDMA READs can cause cmds to be submitted to the target layer in a different order than they were received from the host. Normally this is fine, but not for fused commands. So track fused commands as they reach nvmf_rdma_request_process(). If we find a pair of sequential commands that don't have valid FUSED settings (i.e. NONE/SECOND, FIRST/NONE, FIRST/FIRST), we mark the requests as "fused_failed" and will later fail them just before they would be normally sent to the target layer. When we do find a pair of valid fused commands (FIRST followed by SECOND), we will wait until both are READY_TO_EXECUTE, and then submit them to the target layer consecutively. This fixes issue #2428 for RDMA transport. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I01ebf90e17761499fb6601456811f442dc2a2950 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12018 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2022-04-05 08:32:06 +00:00
Shuhei Matsumoto	428b17a0a8	bdev: Add spdk_for_each_bdev/bdev_leaf for clean up and further improvements To execute a callback function for each registered bdev or unclaimed bdev, add new public APIs, spdk_for_each_bdev() and spdk_for_each_bdev_leaf(). These functions are safe for race conditions by opening before and closing after executing the provided callback function. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I59b702ffec7b4fc5e9779de5a3a75d44922b829b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12088 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-05 07:30:47 +00:00
Shuhei Matsumoto	941ca7e09e	bdev: Factor out bdev close operation from spdk_bdev_close() Bdev open/close will be done for each bdev when traversing the bdev list. This patch is a preparation. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I2486bd823953fe020ed6106844877e1cf49d8a0d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12126 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-05 07:30:47 +00:00
Shuhei Matsumoto	b4bcf7721d	bdev: bdev_close() unlock g_bdev_mgr.mutex after spdk_io_device_unregister() spdk_io_device_unregister() sends a message to call its callback. So to make the following patches easier, consolidate g_bdev_mgr.mutex unlocks to the end of spdk_bdev_close(). Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ib3b5c72be06e764918da30d7aa9fbc2ccd33956e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12125 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-05 07:30:47 +00:00
Shuhei Matsumoto	ced08048ee	bdev: Factor out descriptor allocation from spdk_bdev_open_ext() Bdev open/close will be done for each bdev when traversing the bdev list. This patch is a preparation. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I4e4fe6f1248176631a74c09585c931b21eb49d2b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12124 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-05 07:30:47 +00:00
Alexey Marchuk	3185d3c92f	bdev: Report memory domains in bdev_get_bdevs RPC This change will simplify development/debugging. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ibde374089057a0684391f6519fa4e878d007408d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11049 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-04 09:57:56 +00:00
Alexey Marchuk	d7ac3d92e4	bdev/part: Use ext bdev API in IO path That will allow to pass ext opts to bdev layer. Since in ext API metadata is passed as part of ext IO opts structure and ext opts can be NULL (e.g. upper layer used regular API), in that case we use *blocks_with_md API Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I1bfb3fcb11bf42e100ecc7e4058087f12086db3a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11048 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-04 09:57:56 +00:00
Alexey Marchuk	1299439f3d	bdev: pull/push data if bdev doesn't support memory domains If bdev doesn't support any memory domain then allocate internal bounce buffer, pull data for write operation before IO submission, push data to memory domain once IO completes for read operation. Update test tool, add simple pull/push functions implementation. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ie9b94463e6a818bcd606fbb898fb0d6e0b5d5027 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10069 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>	2022-04-04 09:57:56 +00:00
Shuhei Matsumoto	96c007d301	bdev: Add spdk_bdev_unregister_by_name() to handle race conditions To unregister a bdev more correctly, we had to call spdk_bdev_open_ext(), spdk_bdev_desc_get_bdev(), spdk_bdev_unregister(), and then spdk_bdev_close(). This was correct but complicated. Hence add a new public API spdk_bdev_unregister_by_name() which does the whole correct sequence of bdev unregistration. Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I9068d4ac49dca944436e0ba587308fd356dfef75 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12065 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-04 09:57:43 +00:00
Tomasz Zawadzki	6301f8915d	lib/sock: provide a hint to picking optimal poll group The process of matching qpair to poll group is split into two distinct parts that occur on different threads. See spdk_nvmf_tgt_new_qpair(). This results in a race condition for TCP between spdk_sock_map_lookup() and spdk_sock_map_insert(), which are called in spdk_nvmf_get_optimal_poll_group() and spdk_nvmf_poll_group_add() respectively. Fixes #2113 This patch picks a hint from nvmf_tcp for next poll group, which is then passed down to spdk_sock_map_lookup(). When matching placement_id exists, but does not have a poll group assigned - the hint will be used. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I4abde2bc9c39225c9f5dd7c3654fa2639bb0a27f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10271 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2022-04-01 12:41:26 +00:00
Chunsong Feng	0db0c443df	nvmf/rdma: Improve read performance in DIF strip mode The rdma buffer for stripping DIF metadata is added. CPU strips the DIF metadata and copies it to the rdma buffer, improving the rdma write bandwidth. The network bandwidth during 4KB random read test is increased from 79 Gbps to 99 Gbps, the IOPS is increased from 2075K to 2637K. Fixes issue #2418 Signed-off-by: Chunsong Feng <fengchunsong@huawei.com> Change-Id: If1c31256f0390f31d396812fa33cd650bf52b336 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11861 Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2022-04-01 11:19:18 +00:00
paul luse	75209b1d53	lib/idxd: fail init if IOMMU is not enabled Currently the idxd driver requires VFIO so avoid unexpected errors if someone tries without it (with UIO). Temp workaround for issue #2316 Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: I430cd2193bc8dbd6939af7d0ca799832e7a73213 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11816 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2022-04-01 08:30:46 +00:00
Kozlowski Mateusz	adc36d5be3	lib/vhost: Fix ENOMEM resubmission for vhost_blk In the current behavior the iovcnt is lowered before sending it to the next BDEV in the stack - however if the returned value is ENOMEM (due to eg. not enough bdev requests in the pool), the request needs to be returned to its original state, as it would be resubmitted with skipped iov entries. Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com> Change-Id: I7240510a2ec04594b248f7347e86ac11ecfd26a0 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11976 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2022-04-01 08:30:28 +00:00
Chunsong Feng	05dd3c0bb2	dif: enhance copy API to support block-aligned bounce_iov When iovs are copied from bounce or to bounce, the bounce is usually alloced from data_buf_pool for better performance, and is multi iovs instead of a single buffer. Therefore, block-aligned bounce are supported. Signed-off-by: Chunsong Feng <fengchunsong@huawei.com> Change-Id: If56b21d9e46c73d4c956c227bec33ddd0ab9745b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11860 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2022-04-01 08:29:12 +00:00

9258 Commits