Commit Graph

1727 Commits

Author SHA1 Message Date
Ankit Kumar
82c61e0678 lib/nvme: 0 based numd for reservation report
Fix the number of dwords (NUMD), which is 0's based as per the spec.
Use bitwise operators instead of division and modulus.
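
In sketch form (variable names hypothetical), the 0's based dword count
for a payload of len bytes, computed with a shift instead of division:

    /* numd is 0's based: (len / 4) - 1, using a shift instead of division. */
    uint32_t num_dwords = (len >> 2) - 1;

    cmd.cdw10 = num_dwords;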

Change-Id: Ib315bf9394ef599317f41429742e7b8054069549
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16814
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-02-24 12:46:40 +00:00
Ankit Kumar
9a1457ff1e lib/nvme: Add support for IO management commands
TP4146 introduced support for two new IO commands:
IO management receive and IO management send.
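
A hedged usage sketch, assuming the new helpers follow the usual
spdk_nvme_ns_cmd_*() shape (the exact signature and mo/mos values should
be checked against the patch and TP4146):

    /* Sketch: issue an IO management receive for this namespace. */
    rc = spdk_nvme_ns_cmd_io_mgmt_recv(ns, qpair, buf, buf_len,
                                       mo /* management operation */,
                                       mos /* mgmt operation specific */,
                                       io_mgmt_done_cb, NULL);
    if (rc != 0) {
        SPDK_ERRLOG("io_mgmt_recv submission failed: %d\n", rc);
    }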

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Change-Id: Iaf37310b84e278df043dcf71a0c2ef912c2fca8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16520
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-02-15 10:37:56 +00:00
Ankit Kumar
cc7736c968 include/nvme_spec.h: add changes for fdp log pages
TP4146 added support for 4 new log pages.
These are FDP configurations, reclaim unit handle usage,
FDP statistics and FDP events.

Updated the identify example file accordingly.
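
For orientation, a sketch of the four identifiers (enum name
hypothetical; the 0x20-0x23 values follow TP4146 and should match
nvme_spec.h):

    enum fdp_log_page { /* hypothetical name */
        FDP_LOG_CONFIGURATIONS            = 0x20,
        FDP_LOG_RECLAIM_UNIT_HANDLE_USAGE = 0x21,
        FDP_LOG_STATISTICS                = 0x22,
        FDP_LOG_EVENTS                    = 0x23,
    };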

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Change-Id: I5a20b728605257774d72bc184b50bc5008e142ea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16518
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-02-15 10:37:56 +00:00
Ankit Kumar
7bbeb80a31 nvme: support 64 LBA formats for NVM and ZNS command set
Format LBA size (FLBAS) is updated to have:
Bits 3:0 as the least significant 4 bits of the format index
Bits 6:5 as the most significant 2 bits of the format index

NVMe format command fields are updated accordingly.

Add a new helper function to fetch the correct format index.
Update examples and unit test files accordingly.
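
A minimal sketch of the helper's logic under the bit layout above
(function name and parameters hypothetical):

    /* Bits 3:0 hold the low nibble of the index; bits 6:5 add two more
     * bits, which only apply when more than 16 formats are supported. */
    static inline uint32_t
    format_index_from_flbas(uint8_t flbas, uint32_t num_formats)
    {
        uint32_t idx = flbas & 0xf;

        if (num_formats > 16) {
            idx |= ((flbas >> 5) & 0x3) << 4;
        }
        return idx;
    }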

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Change-Id: I2d6d9045b9d65ae91cb18843ca75b59cc27ed2f2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16515
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-02-15 10:37:56 +00:00
MengjinWu
2db14b40ec nvme-tcp: nvme-tcp does not depend on lib/thread
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I44d5107292e5f335148ffccf1980741eb356d628
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16680
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2023-02-14 09:04:29 +00:00
Konrad Sztyber
ac94b60b54 nvme/tcp: fail qpair when spdk_sock_flush() fails
If spdk_sock_flush() returns an error, there's no reason not to
disconnect the qpair, as it usually means that the socket's connection
has been terminated.
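
A hedged sketch of the resulting flow (the disconnect call is an
internal helper; details assumed):

    rc = spdk_sock_flush(tqpair->sock);
    if (rc < 0 && errno != EAGAIN) {
        rc = -errno;
        /* The connection is almost certainly dead; fail the qpair. */
        nvme_transport_ctrlr_disconnect_qpair(qpair->ctrlr, qpair);
        return rc;
    }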

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I54e9bebc38e2a24a3baf69eb18ec3c654b210318
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16644
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-02-13 13:50:15 +00:00
Konrad Sztyber
739c6d7c5a nvme/tcp: check for EAGAIN when flushing socket
The behavior of spdk_sock_flush() was changed in 5433004ec to return the
number of flushed bytes, or -1 with errno set to EAGAIN in case nothing
has been flushed (instead of returning 0).  Therefore, we shouldn't
treat EAGAIN as an error in nvme_tcp_qpair_process_completions().
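
In sketch form, the distinction the caller now has to make:

    rc = spdk_sock_flush(tqpair->sock);
    if (rc < 0 && errno == EAGAIN) {
        rc = 0; /* nothing flushed yet; not an error, retry on next poll */
    }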

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5473488b5b408cdc739921046f1a0cc2c98f98de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16643
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-02-13 13:50:15 +00:00
Shuhei Matsumoto
bbd3d96b85 nvme_rdma: Ignore response if its QP was already destroyed
This is a workaround but is necessary to fix GitHub issue #2874.
For some unknown reason, in nightly tests with Intel E810 NICs,
when a qpair is created in synchronous mode and connection errors
are detected, the qpair is destroyed even while requests for the qpair
are still inflight. Then, nvme_rdma_process_recv_completion() causes a
NULL pointer access. To fix this, change
nvme_rdma_process_recv_completion() to return immediately if rsp->rqpair
is NULL. Add a TODO comment to find the root cause and really fix the
issue.
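
The workaround in sketch form (surrounding code elided):

    if (spdk_unlikely(rsp->rqpair == NULL)) {
        /* TODO: handle the case where a qpair is destroyed while its
         * requests are still inflight; find the root cause. */
        return 0;
    }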

One of the fixes for issue #2874.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic810922f7ea1b32373b15f4e0cf7c2429659cbab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16431
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2023-01-25 10:01:03 +00:00
Shuhei Matsumoto
9aabfb59d9 nvme_rdma: Fix null pointer access and memory leaks for rqpair->reqs and rsps
Supporting SRQ caused two kinds of memory leaks. Fix both in this patch.

1. rqpair->rsps was leaked and null pointer access occurred

An error was detected during the nightly nvmf_delete_subsystem test.
The NVMe perf tool crashed with SIGABRT.

The reason for the crash was:

nvme_rdma.c:2504:2: runtime error: member access within null pointer of type 'struct nvme_rdma_rsps'

This was caused by clearing rqpair->rsps before freeing it.
rqpair->rsps should have been held until it was freed. However,
when SRQ support was added, rqpair->rsps was cleared by mistake when
releasing rqpair->poller. rqpair->rsps should be cleared only if SRQ is
enabled, because only in that case does the rqpair use the rsps of
rqpair->poller.
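
In sketch form, the corrected condition (field name assumed):

    /* Only an SRQ-enabled qpair borrows rsps from the poller, so only
     * then may the pointer be dropped when the poller is released. */
    if (rqpair->srq) {
        rqpair->rsps = NULL;
    }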

2. rqpair->reqs and rsps are leaked for admin qpair at controller reset

To avoid unnecessary alloc and free for rqpair->rsps when enabling SRQ,
nvme_rdma_create_reqs() and nvme_rdma_create_rsps() were moved to
nvme_rdma_connect_established().

On the other hand, nvme_rdma_free_reqs() and nvme_rdma_free_rsps() were
called by nvme_rdma_ctrlr_delete_io_qpair().

However, at controller reset, admin qpair was just disconnected and
reconnected. In this case, nvme_rdma_create_reqs() and
nvme_rdma_create_rsps() were called again without calling
nvme_rdma_free_reqs() and nvme_rdma_free_rsps().

Hence, memory leak occurred.

To fix the memory leak, move nvme_rdma_free_reqs() and nvme_rdma_free_rsps()
from nvme_rdma_ctrlr_delete_io_qpair() to nvme_rdma_qpair_destroy().

One of the fixes for issue #2874.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I167ba908cff73d7a0be2248affce4c54f233da51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16384
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-25 10:01:03 +00:00
Tomasz Zawadzki
3359bf34d6 so_ver: increase all major versions
To allow SO_MINOR updates on LTS for the whole year it is supported,
the major version for all components needs to be increased.
This is to prevent a scenario where two versions exist with matching
version numbers but conflicting ABIs.
Ex. The next SPDK release adds an API call, increasing the minor version,
and then LTS needs just a subset of those additions.

Increasing the major SO version after LTS allows future releases
to update versions as needed, while still allowing LTS to increase the
minor version separately.

Disabled the test for increasing the SO version without an ABI change,
as that is the goal of this patch. This check shall be removed with the
SPDK 23.05 release. It looks like this was left over from the prior LTS;
to avoid that, make sure it is only skipped when running against
v23.01.x as the latest release.

This patch:
- increases SO_VER by 1 for all components
- resets SO_MINOR to 0 for all components
- removes suppressions for ABI tests


Short reference to how the versions were changed:
MAX=$(git grep "SO_VER := " | cut -d" " -f 3 | sort -ubnr | head -1)
for((i=$MAX;i>0;i-=1)); do find . -name "Makefile" -exec \
	sed -i -e "s/SO_VER := $i\$/SO_VER := $(($i+1))/g" {} +;  done
find . -name "Makefile" -exec \
	sed -i -e "s/SO_MINOR := .*/SO_MINOR := 0/g" {} +

Change-Id: I3e5681802c0a5ac6d7d652a18896997cd07cc8bf
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16419
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-24 08:37:21 +00:00
Shuhei Matsumoto
bcd987ea2d nvme_rdma: Support SRQ for I/O qpairs
Support SRQ in RDMA transport of NVMe-oF initiator.

Add a new spdk_nvme_transport_opts structure and add rdma_srq_size
to the spdk_nvme_transport_opts structure.

For the user of the NVMe driver, provide two public APIs,
spdk_nvme_transport_get_opts() and spdk_nvme_transport_set_opts().
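
A hedged usage sketch (the opts_size argument is assumed to follow
SPDK's usual ABI-compatibility pattern):

    struct spdk_nvme_transport_opts opts;

    spdk_nvme_transport_get_opts(&opts, sizeof(opts));
    opts.rdma_srq_size = 1024; /* 0, the default, leaves SRQ disabled */
    spdk_nvme_transport_set_opts(&opts, sizeof(opts));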

In the NVMe driver, the instance of spdk_nvme_transport_opts,
g_spdk_nvme_transport_opts, is accessible throughout.

Because async event handling caused conflicts between initiator and
target, the NVMe-oF RDMA initiator does not handle the
LAST_WQE_REACHED event. Hence, it may get a WC for an already
destroyed QP. To clarify this, add a comment in the source code.

The following is the result of a small performance evaluation using the
SPDK NVMe perf tool. Even for queue_depth=1, the overhead was less than
1%. Eventually, we may be able to enable SRQ by default for the NVMe-oF
initiator.

1.1 randwrite, qd=1, srq=enabled
./build/examples/perf -q 1 -s 1024 -w randwrite -t 30 -c 0XF -o 4096 -r
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  162411.97     634.42       6.14       5.42     284.07
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  163095.87     637.09       6.12       5.41     423.95
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  164725.30     643.46       6.06       5.32     165.60
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  162548.57     634.96       6.14       5.39     227.24
========================================================
Total                                                                     :  652781.70    2549.93       6.12

1.2 randwrite, qd=1, srq=disabled
./build/examples/perf -q 1 -s 1024 -w randwrite -t 30 -c 0XF -o 4096 -r
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  163398.03     638.27       6.11       5.33     240.76
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  164632.47     643.10       6.06       5.29     125.22
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  164694.40     643.34       6.06       5.31     408.43
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  164007.13     640.65       6.08       5.33     170.10
========================================================
Total                                                                     :  656732.03    2565.36       6.08       5.29     408.43

2.1 randread, qd=1, srq=enabled
./build/examples/perf -q 1 -s 1024 -w randread -t 30 -c 0xF -o 4096 -r '
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  153514.40     599.67       6.50       5.97     277.22
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  153567.57     599.87       6.50       5.95     408.06
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  153590.33     599.96       6.50       5.88     134.74
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  153357.40     599.05       6.51       5.97     229.03
========================================================
Total                                                                     :  614029.70    2398.55       6.50       5.88     408.06

2.2 randread, qd=1, srq=disabled
./build/examples/perf -q 1 -s 1024 -w randread -t 30 -c 0XF -o 4096 -r '
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  154452.40     603.33       6.46       5.94     233.15
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  154711.67     604.34       6.45       5.91      25.55
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  154717.70     604.37       6.45       5.88     130.92
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  154713.77     604.35       6.45       5.91     128.19
========================================================
Total                                                                     :  618595.53    2416.39       6.45       5.88     233.15

3.1 randwrite, qd=32, srq=enabled
./build/examples/perf -q 32 -s 1024 -w randwrite -t 30 -c 0XF -o 4096 -r 'trtype:RDMA adrfam:IPv4 traddr:1.1.18.1 trsvcid:4420'
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  672608.17    2627.38      47.56      11.33     326.96
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  672386.20    2626.51      47.58      11.03     221.88
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  673343.70    2630.25      47.51       9.11     387.54
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  672799.10    2628.12      47.55      10.48     552.80
========================================================
Total                                                                     : 2691137.17   10512.25      47.55       9.11     552.80

3.2 randwrite, qd=32, srq=disabled
./build/examples/perf -q 32 -s 1024 -w randwrite -t 30 -c 0XF -o 4096 -r 'trtype:RDMA adrfam:IPv4 traddr:1.1.18.1 trsvcid:4420'
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  672647.53    2627.53      47.56      11.13     389.95
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  672756.50    2627.96      47.55       9.53     394.83
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  672464.63    2626.81      47.57       9.48     528.07
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  673250.73    2629.89      47.52       9.43     389.83
========================================================
Total                                                                     : 2691119.40   10512.19      47.55       9.43     528.07

4.1 randread, qd=32, srq=enabled
./build/examples/perf -q 32 -s 1024 -w randread -t 30 -c 0xF -o 4096 -r
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  677286.30    2645.65      47.23      12.29     335.90
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  677554.97    2646.70      47.22      20.39     196.21
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  677086.07    2644.87      47.25      19.17     386.26
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  677654.93    2647.09      47.21      18.92     181.05
========================================================
Total                                                                     : 2709582.27   10584.31      47.23      12.29     386.26

4.2 randread, qd=32, srq=disabled
./build/examples/perf -q 32 -s 1024 -w randread -t 30 -c 0XF -o 4096 -r
========================================================
                                                                                                              Latency(us)
Device Information                                                        :       IOPS      MiB/s    Average        min        max
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  0:  677432.60    2646.22      47.22      13.05     435.91
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  1:  677450.43    2646.29      47.22      16.26     178.60
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  2:  677647.10    2647.06      47.21      17.82     177.83
RDMA (addr:1.1.18.1 subnqn:nqn.2016-06.io.spdk:cnode1) NSID 1 from core  3:  677047.33    2644.72      47.25      15.62     308.21
========================================================
Total                                                                     : 2709577.47   10584.29      47.23      13.05     435.91

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I843a5eda14e872bf6e2010e9f63b8e46d5bba691
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14174
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2023-01-17 23:53:01 +00:00
Shuhei Matsumoto
4999a9850c nvme_rdma: Move responses from rdma_qpair into a separate object
Move parallel arrays of response buffers and response SGLs from
qpair to a new responses object.

Use options to create the responses object.

Use spdk_zmalloc() to allocate the responses object because qpair
is also allocated by spdk_zmalloc().

The purpose is to share the code and the data structure between the
SRQ-enabled and SRQ-disabled cases.
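
A sketch of the shape of the new object (field names assumed for
illustration):

    struct nvme_rdma_rsps {
        struct ibv_sge            *rsp_sgls; /* parallel to rsps */
        struct spdk_nvme_rdma_rsp *rsps;
        uint16_t                  num_entries;
    };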

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ia23fe7328ae1f2f551fed5863fd1414f8567d602
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14172
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-17 23:53:01 +00:00
Richael Zhuang
070d61f2d6 nvme: add API to get outstanding reqs number
Added spdk_nvme_qpair_get_num_outstanding_reqs to get the number
of outstanding reqs for a specific qpair.
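
A hedged usage sketch (return type assumed to be an unsigned count):

    /* Pick the less busy of two paths to the same namespace. */
    uint32_t a = spdk_nvme_qpair_get_num_outstanding_reqs(qpair_a);
    uint32_t b = spdk_nvme_qpair_get_num_outstanding_reqs(qpair_b);
    struct spdk_nvme_qpair *target = (a <= b) ? qpair_a : qpair_b;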

Change-Id: I55d75a7363ac63bd26db76594e70e8b17b3e5830
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15916
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2023-01-09 14:49:11 +00:00
Richael Zhuang
41bf6280e9 nvme: add num_outstanding_reqs in spdk_nvme_qpair
Added num_outstanding_reqs in struct spdk_nvme_qpair to record the
number of outstanding requests in each qpair. This can be used by
multipath to select an I/O path.

Increment num_outstanding_reqs when a req is removed from the free_req
queue and decrement it when the req is put back in the free_req queue.
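
In sketch form (queue macros per the driver's STAILQ usage):

    /* Taking a request from the free list makes it outstanding... */
    req = STAILQ_FIRST(&qpair->free_req);
    STAILQ_REMOVE_HEAD(&qpair->free_req, stailq);
    qpair->num_outstanding_reqs++;

    /* ...and returning it on completion reverses that. */
    STAILQ_INSERT_HEAD(&qpair->free_req, req, stailq);
    qpair->num_outstanding_reqs--;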

Change-Id: I31148fc7d0a9a85bec4c56d1f6e3047b021c2f48
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15875
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-09 14:49:11 +00:00
Xue Liu
e9a94122b8 nvme/pcie: add memory barrier for LOONGARCH
Add memory barrier for LOONGARCH in nvme_pcie_qpair_process_completions.
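
A hedged sketch of such a barrier (the exact macro and instruction in
the patch may differ; dbar 0 is LoongArch's full memory barrier):

    #if defined(__loongarch__)
    #define spdk_mb() __asm volatile("dbar 0" ::: "memory")
    #endif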

Signed-off-by: Xue Liu <liuxue@loongson.cn>
Change-Id: Icc992ef612a00dd18ff33f70ab8f54e8c5d5c5b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16083
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2023-01-06 15:54:46 +00:00
Fengnan Chang
958d4e0e05 nvme: fix memleak when submit request failed
Some memory alloc in nvme_allocate_request_user_copy, and submit
through nvme_qpair_submit_request, if nvme ctrlr is failed or
qpair state not meet the requirements, submit will return -ENXIO,
and call nvme_free_request(), but it will not free
req->payload.contig_or_cb_arg, those memory only gets freed when the
request is actually completed, through nvme_user_copy_cmd_complete().
Let's fix this by add check when submit failed.
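
In sketch form (the exact cleanup site in the patch may differ):

    void *copy_buf = req->payload.contig_or_cb_arg;

    rc = nvme_qpair_submit_request(qpair, req);
    if (rc < 0) {
        /* req was freed by the submit error path, but the user-copy
         * buffer was not; free it here to close the leak. */
        spdk_free(copy_buf);
    }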

Fixes issue #2832
Change-Id: I54f0fc60dbb53ced9f52da7d89017be13db2eee1
Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15985
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-05 23:26:42 +00:00
Fengnan Chang
02ecb2dcba nvme: make submit request error handle in one place
Set rc to -ENXIO and goto error, putting all error handling in one
place so it is easy to add more checks in a later patch.

Change-Id: I13edeef75bbf6c52e18d6b94b78c2e560012bfee
Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16004
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2023-01-05 23:26:42 +00:00
Michael Haeuptle
7706450f2a nvme_rdma: Support TOS for RDMA initiator
The spdk_nvme_ctrlr_opts now supports a transport_tos option
that allows setting of the 'type of service' value in the IPv4 header.

This is needed to support lossless RoCE setups.

Note: Only RDMA is supported at this point.
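
A hedged usage sketch (the DSCP/TOS value is an example, not from the
patch):

    struct spdk_nvme_ctrlr_opts opts;
    struct spdk_nvme_ctrlr *ctrlr;

    spdk_nvme_ctrlr_get_default_opts(&opts, sizeof(opts));
    opts.transport_tos = 0xB8; /* DSCP EF in the IPv4 TOS byte */
    ctrlr = spdk_nvme_connect(&trid, &opts, sizeof(opts));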

Change-Id: I21825fc197c60f539a7d2d651a970ea380d8b56d
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-05 19:54:53 +00:00
Shuhei Matsumoto
ce92d919d7 nvme: Add a helper function to return status type string
Add spdk_nvme_cpl_get_status_type_string() to return ASCII
string for the type of an error.

Append a dummy entry to return "RESERVED" for unknown types.
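
A hedged usage sketch in a completion callback (the new function's
signature assumed analogous to spdk_nvme_cpl_get_status_string()):

    static void
    io_done(void *cb_arg, const struct spdk_nvme_cpl *cpl)
    {
        if (spdk_nvme_cpl_is_error(cpl)) {
            SPDK_ERRLOG("I/O error: type=%s, status=%s\n",
                        spdk_nvme_cpl_get_status_type_string(&cpl->status),
                        spdk_nvme_cpl_get_status_string(&cpl->status));
        }
    }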

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibc07132ee067f146ac149884c6344f313bfcbfff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15835
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-04 08:22:31 +00:00
Shuhei Matsumoto
8f990f5e47 nvme: Update status-string array to add newly or missing status codes
spdk_nvme_cpl_get_status_string() will be used to count and display
NVMe specific errors via JSON-RPC. This patch is a preparation.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ia96890172d752d2906549e3033c0b26eef9c20bd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15834
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2023-01-04 08:22:31 +00:00
GangCao
46d02f3e95 lib/nvme: add the NULL check after getting ns
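
A hedged sketch of the added pattern (error handling assumed):

    ns = spdk_nvme_ctrlr_get_ns(ctrlr, nsid);
    if (ns == NULL) {
        /* Inactive or out-of-range NSID. */
        return -ENODEV;
    }
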
Change-Id: Ib6188269dfce1a9229850b06dc61d8bfc0ede74a
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16072
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2023-01-03 07:59:59 +00:00
Michal Berger
3f912cf0e9 misc: Fix spelling mistakes
Found with misspell-fixer.

Signed-off-by: Michal Berger <michal.berger@intel.com>
Change-Id: If062df0189d92e4fb2da3f055fb981909780dc04
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15207
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-12-09 08:16:18 +00:00
Mike Gerdts
9d06166f5b nvme: annotate and log existing deprecation
Use the deprecation API to annotate and log the deprecation of
spdk_nvme_ctrlr_prepare_for_reset() using the tag
"nvme_ctrlr_prepare_for_reset".

Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I98fd840aa9acc028a49bb47daf4ab7e88f1eb818
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15756
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-12-08 12:59:32 +00:00
Shuhei Matsumoto
1c57fa1a95 nvme_rdma: Rename poll_group_set_cq() by qpair_set_poller()
In the following patches, nvme_rdma_poll_group_set_cq() will
touch not only the CQ but also SRQ and receive WR objects.

All these resources belong to a poller.

Hence, for clarity, rename nvme_rdma_poll_group_set_cq()
to nvme_rdma_qpair_set_poller().

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic59ba5a45833e39b1b2647c000c8b953f1031d6b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14910
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
e22dcc075a nvme_rdma: Factor out reset failed sends/recvs operation
Factor out the reset-failed-recvs operation into a helper function,
nvme_rdma_reset_failed_recvs(). This will make the following
patches simpler.

For the send operation, this change is not required yet, but in the
future we may support something like a shared SQ. Hence, we make this
change for the send operation too.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ib44acebe63e97e5a60ea6fa701b49278c7f44b45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14171
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
4cef00cbbf nvme_rdma: Merge alloc_ and register_reqs/rsps into create_reqs/rsps functions
In the following patches, the poll group will have rsps objects, and
an option for creation will be used to share the code between the poll
group and the qpair.

As a preparation, merge nvme_rdma_alloc_rsps() and
nvme_rdma_register_rsps() into nvme_rdma_create_rsps(). For consistency,
merge nvme_rdma_alloc_reqs() and nvme_rdma_register_reqs() into
nvme_rdma_create_reqs().

Update unit tests accordingly.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I92ec9e642043da601b38b890089eaa96c3ad870a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14170
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
8e48517f96 nvme_rdma: Defer send/recv objects allocation until connection is established
When SRQ is supported, recv objects will be allocated by the poll group
and qpairs will be associated with them and use them. In this case, we
do not want the qpair to allocate and free recv objects. Whether SRQ is
used is decided when the connection is established. Hence, defer recv
objects allocation until the connection is established.

Send objects are not affected directly by SRQ, but
nvme_rdma_register_reqs() no longer does any registration, and deferring
send objects allocation makes the code more consistent. Hence, defer
send objects allocation until the connection is established too.

Even after this patch, we rely on nvme_rdma_ctrlr_delete_io_qpair()
to free resources completely.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ic151fad01009d92a7fc809a730e6e9dff1a365f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14169
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
6602291766 nvme_rdma: Move submit_recvs() from register_rsps() to connect_established()
Response objects will be in the poll group when SRQ is enabled. But we
want to share the code that allocates and registers response objects
between the SRQ-enabled and SRQ-disabled cases. To do it cleanly, move
nvme_rdma_qpair_submit_recvs() from nvme_rdma_register_rsps() to
nvme_rdma_connect_established(). A few cleanups of error handling are
done in this patch. Unregistration will be done when the qpair is
disconnected.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I38dc5a6cb84a6bf56c01d5fb7f2cf3d3b63918e0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14168
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
cd640f6275 nvme_rdma: Inline qpair_queue_send/recv_wr()
This will make the following patches simpler.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id3d7c025525b35c1c2b96027430789a8d8f2697b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14422
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
6275f8445f nvme_rdma: Inline post_recv()
Inline nvme_rdma_post_recv() into the callers.

We do not have any similar helper function for posting a send WR.

This will make the following patches simpler and is a reasonable change.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ia95a4b350942d20bdb65e84f7575c2dcf67c149b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14421
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
ecd9234d4d nvme_rdma: Extract conditional submit_sends/recvs from queue_send/recv_wr
Extract and inline the conditional nvme_rdma_qpair_submit_sends()
and nvme_rdma_qpair_submit_recvs() calls.

This will clarify the logic and make the following patches simpler.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibe217c6f4fb2880af1add8c0429f92b4de107da8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14420
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
851a8dfe42 nvme_rdma: rdma_req caches rdma_rsp and rdma_rsp caches recv_wr
When SRQ is supported, the rsp array will be in either the qpair or the
poller. To make this difference transparent, rdma_req caches rdma_rsp
and rdma_rsp caches recv_wr directly instead of caching indices.

Additionally, do a very small cleanup together:
spdk_rdma_get_translation() gets a translation for a single entry
of the rsps array, so it is more intuitive to use the rsp.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I61c9d6981227dc69d3e306cf51e08ea1318fac4b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13602
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Shuhei Matsumoto
cce990607b nvme_rdma: Factor out send/recv completion from cq_process_completions()
Factor out processing recv completion and send completion into helper
functions to make the following patches simpler.

Additionally, invert if condition to check if both send and recv are
completed to make the following patches simpler.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Idcd951adc7b42594e33e195e82122f6fe55bc4aa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14419
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-08 12:54:40 +00:00
Ben Walker
73b02ffdc3 nvme: In nvme_tcp_qpair_process_completions, do not call nvme_tcp_read_pdu in a loop

nvme_tcp_read_pdu itself has a loop in it that runs until no more data
is available, so the extra loop does nothing.

Change-Id: I1471018e396c43187d1f06bd18ce8a6846a71c94
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15139
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-12-05 22:52:20 +00:00
Konrad Sztyber
35156582a7 nvme/tcp: add an errlog when sock_flush fails
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic14a1ff1120272a3afc86971b9670c10ef66523f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15643
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-12-01 12:49:04 +00:00
Jim Harris
2be196c609 nvme/pcie: validate that mptr is iova contiguous
Also add unit tests that explicitly test this
condition.  They fail without the nvme driver changes
in this patch.
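
A hedged sketch of such a validation (spdk_vtophys() reports how many
bytes of the mapping are contiguous):

    uint64_t size = md_len;
    uint64_t phys = spdk_vtophys(md_buf, &size);

    if (phys == SPDK_VTOPHYS_ERROR || size < md_len) {
        /* The metadata buffer is not IOVA-contiguous; reject it. */
        return -EFAULT;
    }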

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaa369be341eb4eba394f248990e56dce001d3940
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15579
Reviewed-by: Mariusz Barczak <mariusz.barczak@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-11-23 08:23:15 +00:00
Konrad Sztyber
72a6cd5381 nvme: execute hotplug monitor even if hotplug_fd < 0
NVMe controllers can be marked as removed even if we cannot receive
uevents (e.g. by the VMD driver), so we should process them regardless
of hotplug_fd.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iaaf13a136929200e824f7a6dd3b5584998801630
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15547
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2022-11-21 16:15:44 +00:00
Konrad Sztyber
86ba16c39c build: compile API functions with missing deps
We should always build all functions that are part of the API, even if
some of the libraries they depend on are missing.  In that case, they
can return an error instead.
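
The pattern, sketched (the config macro and function names are
placeholders):

    int
    spdk_nvme_example_api(void) /* hypothetical API function */
    {
    #ifndef SPDK_CONFIG_RDMA /* dependency missing at build time */
        return -ENOTSUP;
    #else
        return example_impl(); /* real implementation */
    #endif
    }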

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I72b450b3a1d62e222bd843e45be547d926414775
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15414
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2022-11-18 08:40:05 +00:00
paul luse
a6dbe3721e update Intel copyright notices
per Intel policy to include the file commit date using the git cmd
below.  The policy does not apply to non-Intel (C) notices.

git log --follow -C90% --format=%ad --date default <file> | tail -1

and then pull just the 4-digit year from the result.

Intel copyrights were not added to files where Intel either had
no contribution or the contribution lacked substance (i.e. license
header updates, formatting changes, etc).  The contribution date used
"--follow -C95%" to get the most accurate date.

Note that several files in this patch didn't end the license/(c)
block with a blank comment line, so these were added, as the vast
majority of files do have this last blank line.  Simply there for
consistency.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id5b7ce4f658fe87132f14139ead58d6e285c04d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
2022-11-10 08:28:53 +00:00
Konrad Sztyber
cff39ee7d5 nvme: add missing \n in ctrlr init fail log
Additionally, print the string representation of the ctrlr state, as it
makes debugging init failures much easier.

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I572ef3d6f7d5bbd52039a8872733578c92be4c4a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-11-08 08:20:26 +00:00
Shuhei Matsumoto
ab839831f1 nvme_rdma: Remove workaround for Soft RoCE's bug from cq_process_completions()
We do not support Soft RoCE anymore. Remove a workaround for Soft RoCE's
bug where we may receive a completion without error status after a qpair
is disconnected/destroyed. Then add an assert to check that
rdma_req->req is not NULL. This will simplify the code and the following
patches.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I80c349053adc0f79679eaf8a5d7265d555d3c2b0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14909
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-10-28 06:27:19 +00:00
Shuhei Matsumoto
1439f9c773 nvme_rdma: Pass poller instead of poll_group to cq_process_completions()
The following patches will support SRQ, and SRQ will be per poller.
We will need the SRQ in nvme_rdma_cq_process_completions().

It is not possible to identify the poller if the poll_group is passed to
nvme_rdma_cq_process_completions().

Based on these thoughts, add poll_group pointer to poller and pass
poller to nvme_rdma_cq_process_completions() instead of poll_group.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I322a7a0cc08bdcc8e87e720ad65dd8f0b6ae9112
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14282
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-10-28 06:27:19 +00:00
Shuhei Matsumoto
194047249b nvme_rdma: Get qpair from poll group using WC
NVMe-RDMA target has a helper function get_rdma_qpair_from_wc() and
uses it to identify a qpair from a WC.

NVMe-RDMA initiator has a similar function
nvme_rdma_poll_group_get_qpair_by_id().

NVMe-RDMA initiator will support SRQ in the following patches, and
it will want to identify a qpair from a WC.

get_rdma_qpair_from_wc() of NVMe-RDMA target uses wc->qp_num internally
anyway.

However, the upcoming custom transport for RDMA will have to use other
variables of WC.

Hence, it will be convenient to pass WC instead of qp_num if we consider
future enhancements.

Based on these thoughts, for the NVMe-RDMA initiator, rename
nvme_rdma_poll_group_get_qpair_by_id() to get_rdma_qpair_from_wc(),
remove the unnecessary declaration, and pass the WC instead of qp_num.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I01ead4730207e2c6ac53b83f151bd5f977a11465
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14279
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-10-28 06:27:19 +00:00
Shuhei Matsumoto
6ea9de5fc8 nvme_rdma: Factor out poller destroy operation
Poller will have more shared resources when SRQ is supported.
This is a preparation.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ic3d1cb93dde3f53653a9536a103e5518cebd58e1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14173
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-10-28 06:27:19 +00:00
Shuhei Matsumoto
6a59daad2b nvme_rdma: Poll disconnect until completion if async mode is disabled
nvme_rdma_ctrlr_disconnect_qpair() does not poll the qpair until it is
actually disconnected if it is in a poll group even if its async mode
is disabled. Hence, spdk_nvme_ctrlr_free_io_qpair() removes the qpair
from a poll group when it is being disconnected.

On the other hand, I/O qpair is destroyed after it is actually
disconnected.

When SRQ is enabled and used, an SRQ is destroyed if the corresponding
poller does not have any I/O qpair after an I/O qpair is removed from
the poller.

In particular, if we use spdk_nvme_ctrlr_free_io_qpair(), an SRQ is
destroyed before the corresponding I/O qpairs are destroyed.
Destroying the SRQ failed because it was still referenced by I/O qpairs.

This bug was found when running the SPDK NVMe perf tool with SRQ.

The reason was that nvme_rdma_poll_group_process_completions() called
disconnected_qpair_cb only after the qpair was actually disconnected.
However, nvme_rdma_poll_group_process_completions() is guaranteed to
call disconnected_qpair_cb for any disconnected qpair.

Hence, remove the check that qpair->poll_group is not NULL from
nvme_rdma_ctrlr_disconnect_qpair() and update the comment.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0fde0d827eec3280e1cc5a0fce34d163a6069bc4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-10-28 06:27:19 +00:00
Vasuki Manikarnike
3fcee8ddcc lib/nvme: Do not submit queued aborts if adminq is in failed state.
With RDMA, the admin poller can experience a remote disconnect when
processing completions. The admin qpair will be disconnected to handle
this. The disconnect code path will manually complete queued aborts.
However, the completion callback for the abort will attempt to resubmit
other queued aborts from the queue, which results in very deep recursion
and can eventually cause a segfault.
The fix is to not resubmit queued aborts if the admin qpair is in any
kind of failed state.

Change-Id: I4a6f959232c8a1bd30c87ca50459014e556cbaa0
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15114
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
2022-10-28 06:26:20 +00:00
Szulik, Maciej
51ae6d4002 nvme/tcp: add max_completion exit condition to loop inside read_pdu
A loop inside 'nvme_tcp_qpair_process_completions' makes
'max_completions' actually behave like a minimum:
	do {
		rc = nvme_tcp_read_pdu(tqpair, &reaped);
		[...]
	} while (reaped < max_completions);

Before this change, the 'max_completions' constraint, in its true
sense, was not respected, and the loop inside 'nvme_tcp_read_pdu'
could execute indefinitely as long as the recv state kept changing.

To prevent this behavior, max_completions must be passed to
'nvme_tcp_read_pdu' and used as an additional exit condition.
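
Sketched, the shape of the change (parameter placement assumed):

    /* Inside nvme_tcp_read_pdu(): the state-machine loop now also
     * exits once the caller's completion budget is consumed. */
    while (*reaped < max_completions) {
        /* advance the recv state machine, bumping *reaped as
         * completions are reaped; break when the state stops changing */
    }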

Signed-off-by: Szulik, Maciej <maciej.szulik@intel.com>
Change-Id: I28da962f4a62f08ddb51915b5d0dae9611a82dee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15136
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
2022-10-26 07:35:21 +00:00
GangCao
f20b99bbb3 lib/nvme/vfio: destruct ctrlr in failed cases
Change-Id: Ie7d7ab25055c26ea1c2ae4997bf7197a170de989
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15005
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2022-10-17 12:52:55 +00:00
Changpeng Liu
e50ade3153 vfio_user: remove CONFIG_VFIO_USER flag for client library
The client vfio_user library doesn't require this flag as
it is entirely owned by SPDK, so remove it.

Change-Id: I8f7b1df18017ceac24dbb8a0417871f25f6bee0d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13895
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-09-29 19:42:56 +00:00
MengjinWu
03843f73cb lib/nvme: disable multi c2hs crc32 offload at host
An example:
There are 3 C2H data PDUs for one read request. Data digest is
enabled and the accel_poller is enabled. The first PDU will be offloaded
to the accel_poller, while the others use the CPU to calculate the
crc32c. If the last PDU finishes its calculation while the first PDU has
not, SPDK will immediately complete the read request and free some
objects. When the accel_poller finishes its calculation, it will find
the request already freed, and abort SPDK.

Disable async processing of multiple C2H PDUs to prevent this situation.
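
In sketch form, the guard this implies (the flag and helper names are
hypothetical):

    /* Offload only when the whole read arrives in a single C2H PDU;
     * otherwise compute the crc32c synchronously on the CPU so that no
     * async digest completion can outlive the request. */
    if (request_spans_multiple_c2h_pdus) { /* hypothetical flag */
        crc32c = nvme_tcp_pdu_calc_data_digest(pdu);
    } else {
        offload_data_digest_to_accel(pdu); /* hypothetical helper */
    }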

Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I03c9e5b30622bbe84523c0836aa93cfed672896
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14079
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-09-21 17:01:46 +00:00