Zone append commands cannot be split.
_nvme_ns_cmd_rw() should never cause a NVME_PAYLOAD_TYPE_CONTIG
zone append request to be split.
This is currently true, but add an assert to make sure that
any refactoring to _nvme_ns_cmd_rw() does not break this promise.
Also add error handling, since release builds are built with
asserts disabled.
Follow-up patches will refactor _nvme_ns_cmd_rw().
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I5fd2440c4c9d6bd8d56f30354b208a9047b64729
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6246
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
This function allows applications to specify whether
they wish to allow probing a newly attached NVMe
PCIe SSD.
The env layer will only even probe devices that have
been allowed. By default, this is all devices, but
if the user has specified some list of
allowed PCI addresses (via spdk_env_opts pci_allowed)
then newly attached PCIe devices are implicitly not
allowed. This API allows applications to add
device addresses to the allowed list after the
application has started.
This API will be useful for use cases where multiple
SPDK processes are running on one server, and assignment
of PCIe SSDs to those processes are based on some function
of the SSD's PCIe address.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I919bc267f2ad9130ab5c875ff760a301028b047e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6184
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Fixes#1777.
When a qpair cannot be allocated because the transport connection fails,
the qpair was freed without unlinking it from the other structures.
This was leading to a segfault when attempting to create and free other
qpairs.
Also added a unit test to cover this case.
Change-Id: I74b78d1847f90117248b07203b43a11ff5cfa5d6
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6272
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If rdma_qp_disconnect is not correctly sent out, we will not wait
for the event.
Change-Id: I99701e421dc93909d481ccf35e9bfd8004e60da8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6163
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Add the real support in nvme tcp transport.
Change-Id: I2aa9b0284d6fe009925e67f602a055e787f77987
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5734
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This patch is used to add spdk_nvme_poll_group_get_optimal
public API.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iee34c89e0e1ff1f81167b18e198c144ca28f71de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3311
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The data path for PCIe and vfio-user transports are almost
same too, so move the code from nvme_pcie.c to nvme_pcie_common.c,
so that these APIs can be reused by vfio_user.
No logic change for this patch.
Change-Id: I82f480bba3bae0ce35e2a98f29839081095f7d50
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6040
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Intel P55XX SSDs can support SGL feature but can't use Dataset Management
command with SGL format, so add a quirk here for now, if the limitation was
fixed in future, we can remvoe this. Also SPDK doesn't privoide scatter buffer
API for DSM, so using PRP with DSM is totally fine.
Change-Id: Ibe92f4deb5b8bc2077115f5b7244bc17be4f3b23
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5858
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
No need these returns at the end of void functions.
So remove them.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I8889745f3ef82af513d03259a77a33c1f4f536cb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6015
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The Zone Append command is an optional command in the Zoned Namespace
Command Set.
Zone Append differs from a regular write, in that the command is not
given an exact LBA of where to write the data.
Instead the user has to set the zslba field to the start of a zone,
and the data will be appended to that zone.
The actual LBA where the data was stored is returned in the
spdk_nvme_cpl, where Dword0 contains 31:00 of the ALBA field,
and Dword1 contains bits 63:32 of the ALBA field.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Iabae1b3456bfbb62c07b63d79afe9a14e460fe83
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6013
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Create a _nvme_get_host_buffer_sector_size helper function,
to avoid the same code being duplicated in several functions.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I8c14683c683a44e03c97eefa186833831f754bcc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6035
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It does not make sense to send in check_sgl == true,
when we are calling _nvme_ns_cmd_rw() with a payload
of type NVME_PAYLOAD_CONTIG.
_nvme_ns_cmd_rw() simply cannot "check SGL" if the payload
is not a SGL. Doing so regardless just makes the code harder
to read.
We still send in check_sgl == true, when we are calling
_nvme_ns_cmd_rw() with a payload of type NVME_PAYLOAD_SGL.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I56d49a2abf7819d20cf5974c9e0df8f04f1ccd10
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6009
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In the next patch this member will be used to track
both read and write offsets
Change-Id: I852125ff35257f9821ddf4a641d96afb29ebf0a0
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5924
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When NVMf target linked with vfio-user library, we can use
vfio-user client library to connect to the target.
Here is the three examples that can work with target:
identify -r 'trtype:VFIOUSER traddr:/var/run/muser/domain/muser0/8' -g
perf -r 'trtype:VFIOUSER traddr:/var/run/muser/domain/muser0/8' -g -q 1 -o 4096 -w read -t 10
reconnect -r 'trtype:VFIOUSER traddr:/var/run/muser/domain/muser0/8' -g -q 32 -o 4096 -w randrw \
-M 50 -t 10 -c 0xE
You can run the following test script test/nvmf/target/nvmf_vfio_user.sh to have a quick test,
currently enabled with NVMe Identify,Perf,Reconnect tools.
Change-Id: Ieb9842b2f372184fffbf7f23e4aad26feb47c350
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3839
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Libvfio-user assumes the memory translation is IOVA=VA mode,
since SPDK CI is running inside a VM, the memory mode is
IOVA=PA mode, so when testing NVMe vfio-user transport inside
a VM spdk_vtophys doesn't work with libvfio-user, so here
we add a function to return memory address based on TRTYPE.
Change-Id: I11d1c87197f7bbfc243b6bf368795c9a74bd1303
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5958
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There are some common data structures and APIs in pcie transport
which can be used both for pcie and vfio-user transport, so move
the common code into a new header and source file.
No actual logic change just the code movement except remove the
static function declarations.
Change-Id: Ie9021e703a5780fdd6840f0e3cfea76a0017a811
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5923
Community-CI: Broadcom CI
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The new custom transport can enable NVMe driver running with
NVMe over vfio-user target.
Change-Id: I5f90e8516eaca08fc3eab658b29b760a03326ff7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5996
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add two async API for Directive Send and Directive Receive.
spdk_nvme_ctrlr_cmd_directive_send;
spdk_nvme_ctrlr_cmd_directive_receive;
Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: Icb6974f74902df1512a5ffa9835188132634291b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5803
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
According to kernel, use an inline function spdk_nvme_bytes_to_numd
to transfer paload_size form bytes to numer of dwords.
Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: I8b9ded122bbf4a3c8e46988993ea52404783c0b0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5926
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Allocate memory with zero number or size, maybe return a unique
pointer rather than NULL. Add a check before common allocation APIs.
Change-Id: I83e07cab5145035e705bc32364652be90f238633
Signed-off-by: Mao Jiang <maox.jiang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5809
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Current version provides unclear output
Change-Id: Ib044b00b5f91b1e363911f1b79c51c73c8a6920c
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5743
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Here "return rc == -EINPROGRESS ? 0 : rc;"
They are the same meaning in these two functions.
Keep the comments here. This makes more clear to readers.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I8590de3f0fe27337163ee8b02ea63e166f1bbe7c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5689
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
As log shows
00:06:32.300 [2020-12-18 21:13:35.511014] nvme_ctrlr.c:1414:spdk_nvme_ctrlr_reset: *ERROR*: Controller reinitialization failed.
00:06:32.300 [2020-12-18 21:13:35.511104] nvme_ctrlr.c: 925:nvme_ctrlr_fail: *ERROR*: ctrlr 192.168.100.8 in failed state.
00:06:32.300 [2020-12-18 21:13:35.511132] bdev_nvme.c: 392:_bdev_nvme_reset_complete: *ERROR*: Resetting controller failed.
00:06:32.300 [2020-12-18 21:13:35.511240] nvme_ctrlr.c: 925:nvme_ctrlr_fail: *ERROR*: ctrlr 192.168.100.8 in failed state.
00:06:32.300 [2020-12-18 21:13:35.511511] bdev_nvme.c: 556:bdev_nvme_failover: *NOTICE*: Unable to perform reset, already in progress.
if spdk_nvme_ctrlr_reset() failed, nvme_ctrlr_fail() is called, and
then if spdk_nvme_ctrlr_process_admin_completions() failed,
nvme_ctrlr_fail() is called.
We don't know which one comes first but nvme_ctrlr_fail() should do
nothing if the ctrlr is already failed.
Hence we should avoid setting ctrlr->is_failed and calling
nvme_transport_ctrlr_disconnect_qpair() twice.
However we should set ctrlr->is_removed if the parameter hot_remove is true.
We do these changes in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iac37c892e054fb59d78e69346ca7f0575d596235
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5694
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In some extreme use cases, an SPDK process could get
swapped out for a long period of time just after
we checked the state but before we called spdk_get_ticks().
So now we will only timeout if the timer expired before
we checked the state *and* the state did not advance.
It's possible we could just move the timeout check
to before the ctrlr->state switch, but I was
hesitant to change the flow for this case.
Fixes issue #1720.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I95b1db3365b5d2d8a65e528f53c302a724d44460
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5596
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
When a device is removed, we should use the remove_cb
that was specified when the device was originally probed
and attached, if one was set.
Also add a new spdk_nvme_ctrlr_set_remove_cb API. This
can be used for cases where a different remove_ctx is
desired than was specified for the probe call. This
also enables setting a remove_cb when using connect APIs
which do not have a way currently to provide a remove_cb.
This also requires fixing the bdev nvme module, which
was depending on the previously errant behavior.
Fixes issue #1715.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id848b39040099ff7a21fe57ea6b194a8c25ae015
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5510
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
spdk_vtophys() already checks that, so we don't need
to check it in the NVMe driver again.
Change-Id: I74288ae8cab80e1be34583475fa02a3ae13e090c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5166
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
POSIX defines PRId64/PRIu64/PRIx64 for printing 64-bit values in a
portable way. Replace a few references to %ld to remove the assumption
about the size of a long. Similarly, use %z with size_t arguments.
Where the value being printed is an unsigned 64-bit value, use PRIu64
instead of %ld.
Explicitly test for not __linux__ where that is the intent, rather
than testing for __FreeBSD__.
Cast pointer to uintptr_t before aligning it, rather than using
a specific integer size which may not be large enough to store a
pointer.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: Icfe219e1bbb2d06b3ef05710fac5b7091d340251
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5142
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The zone append command, which is part of the Zoned Namespace Command Set,
has a maximum data transfer size that can be less than or equal to mdts.
Since zone append commands will not be allowed to be split, the user has
to be able to get the maximum zone append data transfer size. Add a
function that returns this limit.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I9da2672ea8a307ff62251c069a42f7540765e08b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5140
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Zone append is an optional command in the Zoned Namespace Command Set.
Add a convenience function to check if the controller supports the zone
append command.
The ratified NVMe TP 4056 added a CSI field (in cdw14) to the Get Log Page
command. However, since there already exist two public functions to get a
log page (spdk_nvme_ctrlr_cmd_get_log_page() and
spdk_nvme_ctrlr_cmd_get_log_page_ext()), avoid creating a third one for
now, since nvme_ctrlr_get_zns_cmd_and_effects_log() itself can leverage
one of the existing public functions.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I99516dbac8db6714488b4d6cabe64c27f46d6153
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5078
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Remove superfluous parentheses around ctrlr->cdata.mdts.
They provide no value while making the code harder to read.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I4342d87f0e33fd92fe76357eb0379fb1e9c8f98f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5138
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
While I assume that the initial thought of having the
NVME_CTRLR_STATE_WAIT_FOR_* state directly after the state which it is
waiting for, was to make it clear for the reader in which order the
states will be executed.
However, it feels silly to have the same code copy pasted everywhere.
Someone who needs to add a new state will still need to edit
nvme_ctrlr_state_string() and enum nvme_ctrlr_state, which still defines
the NVME_CTRLR_STATE_WAIT_FOR_* state directly after the state which it
is waiting for.
In one way, moving the NVME_CTRLR_STATE_WAIT_FOR_* states to the end of
nvme_ctrlr_process_init(), when reading nvme_ctrlr_process_init(), it is
actually easier to see the ordering of the states which actually do
something of significance.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia95ea5ac3c44a53179edbdc65cba45bec94e469f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5115
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Move the lock files from '/tmp' to '/var/tmp' cause user maybe delete files in /tmp
or remount /tmp by mistake, And the JSON-RPC domain socket located in '/var/tmp' also.
Signed-off-by: Weifeng Su <suweifeng1@huawei.com>
Signed-off-by: Shihao Sun <sunshihao@huawei.com>
Change-Id: I18d52f42462e8477fb35aeea9e38efc51610d17c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5096
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add handler for 'strstr, strrchr' function in 'NULL' return
that maybe cause memory access issue.
Signed-off-by: Weifeng Su <suweifeng1@huawei.com>
Signed-off-by: Shihao Sun <sunshihao@huawei.com>
Change-Id: I2525fbcd9f8ce0a78383305c735b2d27575f4bfe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5071
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Section 7.9 of the NVMe spec says that all nqns must
start with "nqn.".
Fixes issue #1669.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7fd0e6a0a397e831c4fa2377126b6b1e1b127d88
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5017
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is a problem with TCP zcopy enabled:
1. TCP initiator sends icreq and start polling a qpair. Polling of qpair
actively calls nvme_tcp_read_pdu function
2. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_CH state,
it reads 8 bytes of common PDU header. It determines the type of the PDU
and finds the size of PDU_PSH header.
3. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH state.
It should read 120 bytes of icresp PDU. The number of bytes which needs to be
read is pdu->psh_len - pdu->psh_valid_bytes. qpair receives 120 bytes
(the full PDU) and calls nvme_tcp_pdu_psh_handle -> nvme_tcp_icresp_handle.
Here we check that we haven't yet received buffer reclaim notification and
simply return from this function. At the same time we continue to poll the qpair.
4. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH state
and tries to read data from a socket again. The number of bytes is
pdu->psh_len - pdu->psh_valid_bytes. But now pdu->psh_len == pdu->psh_valid_bytes,
so we call nvme_tcp_read_data with zero length.
readv with zero length is commonly used to check errors on the socket,
but in our case there is no errors and readv returns 0.
5. nvme_tcp_read_data treats zero as error and return NVME_TCP_CONNECTION_FATAL.
Fix is to handle icresp, but leave qpair in INITIALIZING state until
we receive acknowledgement for icreqsend_ack. We also move qpair to
NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY recv_state so recv_pdu
will be zerofied and qpair will try to read a common PDU header.
But since it is not initialized yet, it won't receive anything
from the target.
Fixes issue #1633
Change-Id: I22cedefe530a8ac3b51495988ed6265d8fad15bb
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4969
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This fixes#1423 where the completion loop never
breaks when the NVMe ctrlr is no longer present.
This condition can happen during a hot remove.
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Change-Id: Ia238c8aeae720832068de28ce4d34a9d233344fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4831
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It is possible that a single probe_ctx could be used
to probe multiple newly attached nvme controllers. If
one of those controllers is removed during this process,
the rest of the controllers do not get probed and can
even get stuck in a zombie state.
It is better to just continue with probing the rest of
the controllers.
Fixes issue #1611.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4156ee8b50e8d52cfeee7224f210a58bb773e939
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4945
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
According to the SPEC we should support up to 8192 bytes
of ICD for admin and fabric commands. Transport configuration
parameter in_capsule_data_size is applied to all qpair types -
admin and IO. Also we allocate resources when we get a connection
request, so we don't know qpair type at this moment.
Create a list of buffer in TCP poll group to support ICD up
to 8192 bytes when configuration ICD is less than this value.
The number of elements in this pool is hardcoded, it is planned
to add a new configuration parameter later.
Fixes issue #1569
Change-Id: I8589e3e2ea95d515f5503c6de7c1ee40aaf7b6da
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4754
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In TCP NVME initiator with zero copy enabled requests might be
completed asynchronously - out of qpair_process_completions
context. At the same time we calculate requests completed
asynchronously so that generic NVME layer can resubmit
queued requests after calling qpair_process_requests (or
poll_group_process_requests).
But there is a time gap between async request complete and
qpair_process_completions and the user can submit new IO
thereby decrease the number of free TCP requests. That means
that there might be less free requests than we excpected when
we try to resubmit queued requests.
The solution is change ERRLOG to DEBUG log since it is not a
fatal case.
Change-Id: If045ecd331cc6693e8ef450d8e15432dfa5d8812
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4859
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The current approach checks "rc == 0". It worked before adding
polling of poll group since a single qpair should return 1
completion for its own icreq while poll group can return
several completions for all qpairs attached to this poll
group (but .e.g not for those qpair who is waiting for the
completion).
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I60d05d8d6640e4e2bbaf3cd533d2f5a3637adea1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4768
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add support for the ZNS zone management receive command.
An internal nvme_zns_zone_mgmt_recv() function is created
that matches the parameters of the zone management receive
function in the ZNS specification.
Convenience functions are provided for the following
Zone Receive Action: Report Zones.
Zone Receive Actions not implemented: Extended Report
Zones.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I23589a602336da5dffccec7230d07026a868e81b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4793
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>