For the following nvme controller statemachine states:
NVME_CTRLR_STATE_IDENTIFY_NS
NVME_CTRLR_STATE_IDENTIFY_ID_DESCS
NVME_CTRLR_STATE_IDENTIFY_NS_IOCS_SPECIFIC
The statemachine can either:
- Jump to succeeding state
- If active ns list is empty, jump directly to NVME_CTRLR_STATE_CONFIGURE_AER
- In the unlikely case if NVMe completion error, jump to NVME_CTRLR_STATE_ERROR
Simply this such that we either:
- Jump to succeeding state
- In the unlikely case if NVMe completion error, jump to NVME_CTRLR_STATE_ERROR
This will help to reduce the complexity of the nvme controller statemachine,
especially considering that there are new additional states
(NVME_CTRLR_STATE_IDENTIFY_NS_DIRECTIVE and
NVME_CTRLR_STATE_CONFIGURE_NS_STREAMS) currently on review that would continue
with the bad habit of having three possible jump states instead of just two.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I3242052b1108afcd8adbe6d0378b1358fef58ec8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6521
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Identify application prints the PMR details if it is supported
Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
Change-Id: Iaba4c15e18e1402035b11a34b2defe8078855751
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6209
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Fixes#1777.
When a qpair cannot be allocated because the transport connection fails,
the qpair was freed without unlinking it from the other structures.
This was leading to a segfault when attempting to create and free other
qpairs.
Also added a unit test to cover this case.
Change-Id: I74b78d1847f90117248b07203b43a11ff5cfa5d6
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6272
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add two async API for Directive Send and Directive Receive.
spdk_nvme_ctrlr_cmd_directive_send;
spdk_nvme_ctrlr_cmd_directive_receive;
Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: Icb6974f74902df1512a5ffa9835188132634291b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5803
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
As log shows
00:06:32.300 [2020-12-18 21:13:35.511014] nvme_ctrlr.c:1414:spdk_nvme_ctrlr_reset: *ERROR*: Controller reinitialization failed.
00:06:32.300 [2020-12-18 21:13:35.511104] nvme_ctrlr.c: 925:nvme_ctrlr_fail: *ERROR*: ctrlr 192.168.100.8 in failed state.
00:06:32.300 [2020-12-18 21:13:35.511132] bdev_nvme.c: 392:_bdev_nvme_reset_complete: *ERROR*: Resetting controller failed.
00:06:32.300 [2020-12-18 21:13:35.511240] nvme_ctrlr.c: 925:nvme_ctrlr_fail: *ERROR*: ctrlr 192.168.100.8 in failed state.
00:06:32.300 [2020-12-18 21:13:35.511511] bdev_nvme.c: 556:bdev_nvme_failover: *NOTICE*: Unable to perform reset, already in progress.
if spdk_nvme_ctrlr_reset() failed, nvme_ctrlr_fail() is called, and
then if spdk_nvme_ctrlr_process_admin_completions() failed,
nvme_ctrlr_fail() is called.
We don't know which one comes first but nvme_ctrlr_fail() should do
nothing if the ctrlr is already failed.
Hence we should avoid setting ctrlr->is_failed and calling
nvme_transport_ctrlr_disconnect_qpair() twice.
However we should set ctrlr->is_removed if the parameter hot_remove is true.
We do these changes in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iac37c892e054fb59d78e69346ca7f0575d596235
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5694
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In some extreme use cases, an SPDK process could get
swapped out for a long period of time just after
we checked the state but before we called spdk_get_ticks().
So now we will only timeout if the timer expired before
we checked the state *and* the state did not advance.
It's possible we could just move the timeout check
to before the ctrlr->state switch, but I was
hesitant to change the flow for this case.
Fixes issue #1720.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I95b1db3365b5d2d8a65e528f53c302a724d44460
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5596
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
When a device is removed, we should use the remove_cb
that was specified when the device was originally probed
and attached, if one was set.
Also add a new spdk_nvme_ctrlr_set_remove_cb API. This
can be used for cases where a different remove_ctx is
desired than was specified for the probe call. This
also enables setting a remove_cb when using connect APIs
which do not have a way currently to provide a remove_cb.
This also requires fixing the bdev nvme module, which
was depending on the previously errant behavior.
Fixes issue #1715.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id848b39040099ff7a21fe57ea6b194a8c25ae015
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5510
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
POSIX defines PRId64/PRIu64/PRIx64 for printing 64-bit values in a
portable way. Replace a few references to %ld to remove the assumption
about the size of a long. Similarly, use %z with size_t arguments.
Where the value being printed is an unsigned 64-bit value, use PRIu64
instead of %ld.
Explicitly test for not __linux__ where that is the intent, rather
than testing for __FreeBSD__.
Cast pointer to uintptr_t before aligning it, rather than using
a specific integer size which may not be large enough to store a
pointer.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: Icfe219e1bbb2d06b3ef05710fac5b7091d340251
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5142
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The zone append command, which is part of the Zoned Namespace Command Set,
has a maximum data transfer size that can be less than or equal to mdts.
Since zone append commands will not be allowed to be split, the user has
to be able to get the maximum zone append data transfer size. Add a
function that returns this limit.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I9da2672ea8a307ff62251c069a42f7540765e08b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5140
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Zone append is an optional command in the Zoned Namespace Command Set.
Add a convenience function to check if the controller supports the zone
append command.
The ratified NVMe TP 4056 added a CSI field (in cdw14) to the Get Log Page
command. However, since there already exist two public functions to get a
log page (spdk_nvme_ctrlr_cmd_get_log_page() and
spdk_nvme_ctrlr_cmd_get_log_page_ext()), avoid creating a third one for
now, since nvme_ctrlr_get_zns_cmd_and_effects_log() itself can leverage
one of the existing public functions.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I99516dbac8db6714488b4d6cabe64c27f46d6153
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5078
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Remove superfluous parentheses around ctrlr->cdata.mdts.
They provide no value while making the code harder to read.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I4342d87f0e33fd92fe76357eb0379fb1e9c8f98f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5138
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
While I assume that the initial thought of having the
NVME_CTRLR_STATE_WAIT_FOR_* state directly after the state which it is
waiting for, was to make it clear for the reader in which order the
states will be executed.
However, it feels silly to have the same code copy pasted everywhere.
Someone who needs to add a new state will still need to edit
nvme_ctrlr_state_string() and enum nvme_ctrlr_state, which still defines
the NVME_CTRLR_STATE_WAIT_FOR_* state directly after the state which it
is waiting for.
In one way, moving the NVME_CTRLR_STATE_WAIT_FOR_* states to the end of
nvme_ctrlr_process_init(), when reading nvme_ctrlr_process_init(), it is
actually easier to see the ordering of the states which actually do
something of significance.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia95ea5ac3c44a53179edbdc65cba45bec94e469f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5115
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Section 7.9 of the NVMe spec says that all nqns must
start with "nqn.".
Fixes issue #1669.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7fd0e6a0a397e831c4fa2377126b6b1e1b127d88
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5017
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This fixes#1423 where the completion loop never
breaks when the NVMe ctrlr is no longer present.
This condition can happen during a hot remove.
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Change-Id: Ia238c8aeae720832068de28ce4d34a9d233344fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4831
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add two new helper functions, nvme_ctrlr_detach_async() and
nvme_ctrlr_detach_poll_async() to make the internal of
spdk_nvme_detach() asynchronous.
Use callback function to remove controller from the attached list after
completing shutdown and before freeing to avoid conflict between
attach and detach.
Update MOCKs in the corresponding unit test cases.
The next patch will add two public APIs spdk_nvme_detach_async()
and spdk_nvme_detach_poll_async() based on this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifbdfec2a1facde9354007c6248f280e245a36eed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4416
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Following the last patch, separate nvme_ctrlr_destruct()
into nvme_ctrlr_destruct_async() and nvme_ctrlr_destruct_poll_async(),
but keep nvme_ctrlr_destruct() by replacing the internal by
nvme_ctrlr_destruct_async() and nvme_ctrlr_destruct_poll_async().
Add shutdown_complete to nvme_ctrlr_detach_ctx. If shutdown_complete is true,
we can destruct the controller. The case that nvme_ctrlr_shutdown_async()
failed sets shutdown_complete to true. The case that nvme_ctrlr_disable()
is called sets shutdown_complete to true unconditionally.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I3994e259f9d3ccf8fede3ac03aadef911eefb9dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4415
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is the first of the patch series to make spdk_nvme_detach()
asynchronous.
We have lengthy shutdown notification, i.e., we have to wait a long time
until shutdown processing is completed, in some SSDs. If the running system
has many such SSDs, we see large intolerable delay.
SPDK provides a controller option, no_shn_notification as a workaround.
We can use the workaround if the use case of the detach is to switch to
the next application without system reboot. However, we cannot use the
workaround if we want to do system reboot after detach.
To mitigate such lengthy shutdown notification, we need to parallelize
detachment among SSDs.
Hence the patch series will introduce an asynchronous detach API and
will use the API to parallelize detachment.
This patch adds the following changes.
Introduce a context structure and separate nvme_ctrlr_shutdown()
itno nvme_ctrlr_shutdown_async() and nvme_ctrlr_shutdown_poll_async()
using the context structure.
Name the context structure as nvme_ctrlr_detach_ctx because it will be
used only in internal APIs. The upcoming public APIs will support
multiple detachment and will have the contest structure named as
spdk_nvme_detach_ctx.
Use TSC instead of counter because polling interval will be controlled
by the caller.
Use the convenient macro, SPDK_CEIL_DIV(), to round off the time
value in milliseconds.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9e2355fd24b6d6a4d6c1813577d53822304d4f33
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4414
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since namespace types were introduced in NVMe, the CC.CSS register
has a new value (SPDK_NVME_CC_CSS_IOCS) which indicates that all
supported command sets should be selected/enabled. This possibly
includes command sets other than NVM and ADMIN only.
Therefore, if a SPDK application wants enable all the command sets
that the controller supports, it has to explicitly set
opts->command_set to SPDK_NVME_CC_CSS_IOCS.
To avoid possibly a lot of SPDK applications having to set this
parameter, check if the user requested a command set explicitly,
if not, make SPDK automatically use the most reasonable default,
based on the supported bits set by the controller.
The most common case is that you want to enable (all) the command
sets that the controller supports.
A user will still be able to restrict the controller to only use
the NVM command set (or ADMIN only), by setting opts->command_set
to a specific value.
Since the current default command set value specified by
spdk_nvme_ctrlr_get_default_ctrlr_opts() is SPDK_NVME_CC_CSS_NVM,
which is defined as 0, we cannot know if the user specified a
command set explicitly or not.
To solve this, change the default command set value specified by
spdk_nvme_ctrlr_get_default_ctrlr_opts() to CHAR_BIT (0x8), which
is larger than the largest value that can be set in CS.CSS (which
is only 3 bits wide, thus 0x7).
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I45ec148d3667ab87c41fbfb6d6612a1e0e5c9d9c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4701
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There is an error when do following sequences:
1. Allocate an I/O queue pair
2. Do controller reset via spdk_nvme_ctrlr_reset
3. Allocate an I/O queue pair
becaues the free_io_qids was reset and didn't
restore.
Fix issue #1621.
Change-Id: Icd533f171079c12fe03be07e659e8eed9b082384
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4698
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch removes the string from register component.
Removed are all instances in libs or hardcoded in apps.
Starting with this patch literal passed to register,
serves as name for the flag.
All instances of SPDK_LOG_* were replaced with just *
in lowercase.
No actual name change for flags occur in this patch.
Affected are SPDK_LOG_REGISTER_COMPONENT() and
SPDK_*LOG() macros.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I002b232fde57ecf9c6777726b181fc0341f1bb17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4495
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Mellanox Build Bot
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
Add a new state in the SPDK NVMe state machine in order to fetch
I/O Command Set Specific Namespace data structures.
Right now there is only support for the Zoned Namespace Command Set
Specific Identify Namespace data structure.
The NVM Command Set Specific Identify Namespace data structure is
all zeroes right now, reserved for future use.
The Key Value Command Set Identify Namespace data structure is not
all zeroes, however, adding support for Key Value is outside the
scope of this patch.
The new NVME_CTRLR_STATE_IDENTIFY_NS_IOCS_SPECIFIC state is added
after the NVME_CTRLR_STATE_IDENTIFY_ID_DESCS state. This is because
we need to have fetched the identifiers in the desc list in order
to know which command set a namespace belongs to.
A slightly nicer design might have been to refactor the NVMe state
machine to first fetch the id desc list, then the identify namespace
struct, and finally the identify IOCS specific namespace struct.
However, since this would have required a lot of changes, it didn't
really seem justified.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I62cbc533c2c3eec1ccf0ba9b1c414d5a70919cff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4368
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add a new state in the SPDK NVMe state machine in order to fetch
I/O Command Set Specific Controller data structures.
Right now there is only support for the Zoned Namespace Command Set
Specific Identify Controller data structure.
The NVM Command Set Specific Identify Controller data structure is
all zeroes right now, reserved for future use.
The Key Value Command Set Identify Controller data structure is also
all zeroes right now, reserved for future use.
The new NVME_CTRLR_STATE_IDENTIFY_IOCS_SPECIFIC state is added
after the NVME_CTRLR_STATE_IDENTIFY state. That way, if support for
the Zoned Namespace Command Set is enabled during probing, we will
fetch the Zoned Namespace Command Set Specific Identify Controller data
structure, regardless if any Zoned Namespaces are attached or not, and
no additional steps will be needed once a Zoned Namespace is attached.
Since we only have one command set to fetch, avoid creating
NVME_CTRLR_STATE_IDENTIFY_IOCS_SPECIFIC substates, although that will
probably be needed when support for another command set is added.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I95535b09b03b7ef2ee9a11eebdbd28aad66d65ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4367
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Meanwhile, to verify an issue about git push unittest failure.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: Idac60e5832390eb8bdce68aee639be2e9ac6cff6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4373
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add ana_state and ana_group_id to struct spdk_nvme_ns and keep
them up-to-date by updating when spdk_nvme_ctrlr is created or
ANA change notice is received asynchronously. For both cases,
struct spdk_nvme_ctrlr holds the latest ANA state.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I311fe1c8015c8b8ac9659c38661244706c04b3e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4287
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add an internal API nvme_ctrlr_parse_ana_log_page() to parse an ANA
log page and execute the specified callback function for each
ANA group descriptor in the ANA log page.
We will be able to copy the ANA group descriptor to the caller instead.
To do that, we will need to inform the size of the descriptor first,
but the size will not be constant.
Passing parser to the API will be more convenient.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifd8fda30a83965948017fb8ad992c0d889197cde
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4279
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When creating a controller, allocate a buffer to the controller
and read ANA log page into the buffer.
When receiving ANA change notice, read ANA log page into the buffer
to keep the contents up to date.
The next patch will provide a public API to get the contents of
ANA log page the controller holds.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If5c653f4e80d157e5120bb754e6660250b2b8fa1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4233
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
With the introduction of namespace types, the identify command has
gained an additional parameter: Command Set Identifier (CSI).
This parameter is similar to the existing parameters NSID and CNTID,
and is not used by all CNS values.
Most notably, the CSI parameter is not used for the existing CNS
values 00h (ID NS) and 01h (ID CTRL).
There are new CNS values, e.g. 05h (ID IOCS specific NS), and
06h (ID IOCS specific CTRL), which do take the new CSI parameter.
The new CNS values instead return Command Set Specific data structures,
which is basically an additional data structure. Therefore, the CNS
values 00h and 01h are very much still in use.
(Even the NVM Command Set has a Command Set Specific data structure,
even though all fields in that data structure are currently reserved.)
Since the CSI parameter is unused by all the existing calls to
nvme_ctrlr_cmd_identify() (since none of the calls send in a CNS value
that uses CSI), simply send in 0 for all existing calls.
No functional change intended.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia2b2324393a0707152b2f8511f0a22ad4a12bd46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4309
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Since the command set identifier might be accessed at several
different states in the nvme state machine, cache it so that
we don't need to loop through the ns id desc list every time.
This is similar to how other identify fields are cached using
nvme_ns_set_identify_data().
None of the identifiers in the desc list (including the new CSI)
can change over the life time of a namespace, so caching them
should be safe.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ie06180a4b3750dfa1a42f47afe0f7f9e3ec04ba9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4266
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If the nvme completion was an error, the function will return,
so there is no reason for an else statement.
In fact, the else statement in nvme_ctrlr_identify_ns_async_done()
differs from the coding style used in other nvme_ctrlr_identify_*
functions, and arguably makes the code harder to read.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: If76b823b7ca04ab98abb2912927c344ee9f12314
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4265
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In cases where the SPDK nvme driver is being used as a validation/test
vehicle, users may need to allocate a currently unused qid that can be
used for creating queues using the raw interfaces. One example would be
testing N:1 SQ:CQ mappings which are supported by PCIe controllers but
not through the standard SPDK nvme driver APIs.
These new functions fulfill this purpose, and ensure that the allocated
qid will not be used by the SPDK driver for any future queues allocated
through the spdk_nvme_ctrlr_alloc_io_qpair API.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I21c33596ec415c2816728a600972b242da9d971b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3896
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This may happen when resetting a controller, if the ADMIN queue failed
to reconnect, the controller is set to failed state, so for this case
we don't need to loop until timeout, just exit.
Change-Id: I2b37af5453086cd64f3609c41eb8f6475da55fd4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4143
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: JinYu <jin.yu@intel.com>
If ctrlr->cdata.cmic.ana_reporting is 1, set the corresponding
field to true.
Then use its API in the identify application.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4e74bc4c114883e4aecdbee7a6f1a02027db23a5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4156
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add support for getting the Command Set Identifier for a given namespace.
The SPDK_NVME_CAP_CSS_IOCS feature can be implemented on top of an old NVMe
specification. If the feature is set, retrieve the NS ID Descriptor List
regardless of the NVMe specification version. The quirk is still respected.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I7b257115ecb0d813ba75201c0f48960c7070dcc9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4085
Community-CI: Broadcom CI
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There are two bugs:
1, When the target response 0, it means target does't
support keep alive.
2, Change the interval time to us so when the keep alive
timeout is 1ms then the interval is 500us.
Fix github issue: #1565
Change-Id: I75707ab0e4e639209a9c50ef326492fae213044d
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4077
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is an oversight that can cause issues with looping
through the list if we end up allocating the same qpair
twice.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I513ea35398f4b724366c21be144531fbfbdb4347
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3835
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If the transport is broken, we should set errno code in
spdk_nvme_ctrlr_process_admin_completions instead of keeping silence.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ie73763e1329e12a8c82a0223d360991f86c39be3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3773
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This allows for much more granular control over the timeout.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ib23de21e60eec4207c55320579699edf284f4e16
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3794
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When a process cleans up IO qpairs from another crashed
process in a multi-process environment, we must not try to
abort reqs for that IO qpair. Any reqs will contain callbacks
for the crashed process which we must not try to execute in
a different process.
Fixes issue #1509.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5e58cce7bdb86e3feb4084733815c086901f867e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3536
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This function hasn't kept up properly with the states that
we use for tracking the qpair lifecycle.
Add checks for NVME_QPAIR_DISCONNECTING and NVME_QPAIR_DESTROYING.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I51607d4f00e94937b08fca28e766163580d46461
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3359
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When the user requests a non-default command set configuration, via the
probe_cb() supplied to spdk_nvme_probe(), which is not supported, then
the controller initialization will not proceed.
This patch changes that behavior into falling back to the NVM command
set and continue with the controller initialization. It is done by
assigning the NVM command set to opts.command_set such that the user
knows in attach_cb() with which command_set the controller is
configured/enabled.
The fallback is needed since the user does not have access to the
controller capabilities register. The strategy left for the user is thus
to try. However, this is an issue, as the user only has one attempt, as
subsequent calls to spdk_nvme_probe() will not trigger probe_cb() for
the controllers whose initialization did not proceed.
Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Change-Id: Ia414628fcd7d56956649647775462e62d98c0a90
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2931
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These SSDs set the oacs.security bit but do not actually
support OPAL. So do not set the controller flag indicating
SECURITY_SEND_RECV support in this case.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7fcfeafcc8d9439a1c53c60a1aea1801923a2ce5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3156
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
ctrlr->outstanding_aborts is counted only for submitted abort requests.
However ctrlr->outstanding_aborts had been decremented for queued
abort requests by mistake.
Subsequent patches will use parent-children for abort requests but
nvme_free_request() is not aware of such relationship.
Queued abort requests had not been canceled or aborted when controller
was destructed. Retry submitting queued abort requests had been
repeated recursively and had caused stack overflow.
This patch fixes all.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I8ce0ae51ddd5ed3e1e8ac86329c8bdb7a9236b2f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2555
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is used to make spdk_nvme_connect can support
the old library for compatibility.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I49d92fb473c3cbabd8e1240785b920480202eee9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1998
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Purpose: Make the initilaization in order.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I66962073a02b6a4c2fc79ac343cdf5310075dd63
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2766
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In the case, there are several ctrlr used, it's better to
have the traddr to indicate which ctrlr has the issue to
shutdown.
Change-Id: Ie564bb70566ba5822938efc99125d063f7b4ae4a
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2588
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
remove only the spdk_ prefix from static functions in
the above libraries.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I59ce032c3312fa73f30c133fd62e603c1eee2859
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2365
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
NVMe spec defines "Keep Alive Timer" feature ID as optional and there
are targets that do not support this. SPDK fails to connect to such
targets.
This patch allows Get Feature "Keep Alive" target to fail with
INVALID_FIELD status. In this case we just continue with keep alive
timer value stored in controller opts structure. This value is already
communicated to target in CONNECT command.
Fixes#1328
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I52e7ea3cb66073ce6cc168a169989bd179041618
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1625
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This needs to be done for all qpairs in the poll group.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic3a84713a3f9941f90613152328d06ac8c1f586b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1954
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is really useful when the intent of failing the qpair is to
do something like fail over to a different controller structure
and we want back completions for everything outstanding from the
admin queue.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Icbfdf855ddb1a380da7b9036ab5da6faab862e00
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1815
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
By aborting all requests from every qpair when it is disconnected,
we can completely avoid having to abort requests when we enable the
qpair since nothing will be left enabled.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Iba3bd866405dd182b72285def0843c9809f6500e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1788
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When we destroy a qpair, we need to flush all of the I/O.
But some applications will try to resubmit that I/O. We need
to not re-queue those I/O while in the context of the destroy
call so as to avoid an infinite loop.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I3e4863a563d461092f6e6b4a893f965f41bf34e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1856
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The state should be changed and checked by the transport
layer. All transports should follow the same list of steps
when disconnecting/reconnecting.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: If2647624345f2c70f78a20bba4e2206d2762f120
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1853
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This variable really indicates when a qpair is
no longer connected. So NVME_QPAIR_DISCONNECTED is
actually much more accurate.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ia480d94f795bb0d8f5b4eff9f2857d6fe8ea1b34
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1850
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Separate these two operations into different functions. It is
possible that a CMB may not be visible from the CPU, but still
be present and have data transferred to it by some other DMA
engine. Generalize the API to handle that case.
Change-Id: Ifcd282af0db734fe4a6ef2283ae8e8933d017809
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/787
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Instead of creating an allocator where the driver manages the space,
now, since using the CMB for queues and data has already been
disallowed, just create functions to map and unmap the entire CMB.
The user can manage the space.
Change-Id: I023994deda3b517e14d2ba464c7375bf22b58456
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/785
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
We should be giving completions for all requests when we destroy a qpair.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I802f5120f2e8289aa825872f8085ac21b5fce0f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1756
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
NVMe ctrlr init state machine shall be async whenever possible so it
is not blocking other code from processing. It can result in deadlock
when cmd producer and consumer are sharing the same thread.
This patch is making identify active ns async by introducing new
state to wait for completions.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I346d35bab4733d3941e023602854fdd5b1ef23b5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1463
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI
It is a prework for changes related to ctrlr init state machine.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: If289580f65ae27468b659a7ea07a4e4298876e77
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1489
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom CI
Currently nvme_completion_poll_status object is allocated using
malloc, so it may cotnain some garbage. In some scenarious
nvme_completion_poll_cb can be triggered before we enter
spdk_nvme_wait_for_completion_*. In that case status object
will be freed by nvme_completion_poll_cb if it contains a
garbage in `timed_out` field. Later spdk_nvme_wait_for_completion
will work with already freed memory.
Fix - allocate nvme_completion_poll_status object using
calloc and explicitly zerofy it before usage
Fixes#1292
Change-Id: Iac39653a6cd102471de16e65814f0760bbeda7d9
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1373
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Also modify some api documentation to indicate how the
new API should be used.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Icdbfb09aceda28635fdd191c520b36c692c2c100
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1340
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
connect_io_qpair essentially allows us to split the qpair allocation process
in half which will make it possible for us to do more sophisticated things
with RDMA qpairs in poll groups. as a companion to this new API, a connect_only
option has been added to the io_qpair_opts struct which instructs alloc_io_qpair
to only allocate the qpair and not connect it.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I9ba9502dd39436006a9ac71436dd1871d648ed1c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1123
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Dword alignment and granularity are required for the data blocks when
the controller reports this capability.
Change-Id: I6b6300515a528acb34a032050ceedf673a4b326c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1315
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The shutdown flag is only used when resubmitting the AER, and it will not
be updated when hot remove happened, so rename it to is_destructed.
Change-Id: Iafc27bd6cb23a851ed6c96470a2a45546a399c88
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1254
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
NVMe specification in ch.7.6 "Controller Initialization" suggests to
use only Set Features "Number of queues" command and says nothing
about Get Features. All required information is available after Set
Num Queues step.
Fixes#1270
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: Ide38ba9c7f063f1d6b13bfce4232c588cc906784
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1271
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Align rdma and tcp to respect opts. Reduce default number of entries
for admin queue so it becomes memory optimization.
Linux driver by default creates admin queue with 32 depth, there is no
good reason to enlarge that queue by default within SPDK NVMe driver.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I97ceea8f350c52313021a63190fb0980f604c48e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1110
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This will be re-used in the muser transport of nvmf.
Change-Id: If00e6ea79ffdc0c3bda0402f39c5f9f4f411788b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/425
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This gets rid of some duplicate lines of code.
Change-Id: I24d4864921f6030672f3640b33f88f37a9e8175a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1136
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add transport_ack_timeout parameter to nvme controller opts.
This parameter allows to configure RDMA ACK timeout according
to the formula 4.096 * 2^(transport_ack_timeout) usec.
The parameter should be in range 0..31 where 0 means use
driver-specific default value.
Change-Id: I0c8a5a636aa9d816bda5c1ba58f56a00a585b060
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/502
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In the autotest, when calling kill_stub() function, there is error log
like this: "Device 0000:83:00.0 is still attached at shutdown!", so it's
better to detach the controller when exit the stub process.
But after call spdk_nvme_detach() in the stub process, there is another issue:
1. NVMe stub running as the primary process, and it will send 4 AERs.
2. Using NVMe reset tool as the secondary process.
When doing NVMe reset from the secondary process, it will abort all the
outstanding requests, so for the 4 AERs from the primary process, the 4
requests will be added to the active_proc->active_reqs list.
When calling spdk_nvme_detach() to detach a controller, there is a
assertion in the nvme_ctrlr_free_processes() at last to check the
active requests list of this active process data structure.
We can add a check before destructing the controller to poll the
completion queue, so that the active requests list can be flushed.
Change-Id: I0c473e935333a28d16f4c9fb443341fc47c5c24f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/977
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There are synchronous security send/receive APIs defined in nvme.h,
however, we still need the asynchronous APIs so that we can make the
OPAL library can be used in asynchronous way. As the asynchronous APIs
are already defined in nvme_ctrlr_cmd.c, so just export them to public
APIs.
Change-Id: I5646f342a4bf70faad37daa956476f05a1327bcc
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/675
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
while the size of namespace is changed,
the resize event will be notified.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Allen Zhu <allenz@mellanox.com>
Change-Id: I5d85f17df898dc21c0ae1eb9f529dcb624a457ac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/849
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This allows to avoid calculation of ioccsz bytes on each request
and removes access to "cold" ctrlr structures in data path.
Add UT to check validness of calculation
Change-Id: I55ceff99eb924156155e69a20f587a4f92b83f0b
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/519
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For nvme_ctrlr_cmd_format command status should be used as
nvme_completion_poll_cb callback argument instead of pointer to
local variable.
Change-Id: Id65cb395d137c4e907c1ef019b131e8822ddfe34
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483513
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
A pointer to a stack variable is passed as an argument to
nvme_completion_poll_cb function, later this variable is used
to track completion in the spdk_nvme_wait_for_completion() function.
If normal scenario a request submitted to the admin queue will be completed
within the function which submitted the request.
spdk_nvme_wait_for_completion() calls nvme_transport_qpair_process_completions
which may return an error to the caller, the caller may exit from the
function which submitted the request and the pointer to the stack variable
will no longer be valid. Thereby the request may not be completed at that time
and completed later (e.g. when the controller/qpair are destroyed)
and that will lead to call to nvme_completion_poll_cb with the pointer
to invalid stack variable.
Fix - Dynamically allocate status structure to track the completion;
Add a new field to nvme_completion_poll_status structure to track status
objects that need to be freed in a completion callback
Fixes#1125
Change-Id: Ie0cd8316e1284d42a67439b056c48ab89f23e0d0
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481530
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
'delay_pcie_doorbel' parameter in 'spdk_nvme_io_qpair_opts' structure
was renamed to 'delay_cmd_submit' to make it suitable for every
transport. Old name is also kept for backward compatibility.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I09ef8028133c4a3d4a5bbc5329ced1f065bcaa46
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
nvme_qpair_get_state fits more closely with the semantics in other
modules.
Change-Id: I6ea8e02abe27253d9b4d779a43ac1963be56356a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476920
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
The qpair state transport_qpair_is_failed is actually equivalent to
NVME_QPAIR_IS_CONNECTED in the qpair state machine.
There are a couple of places where we check against
transport_qp_is_failed and then immediately check to see if we are in
the connected state. If we are failed, or we are not in the connected
state we return the same value to the calling function.
Since the checks for transport_qpair_is_failed are not necessary, they
can be removed. As a result, there is no need to keep track of it and it
can be removed from the qpair structure.
Change-Id: I4aef5d20eb267bfd6118e5d1d088df05574d9ffd
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475802
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Users already have to poll the admin queue, so embed the io_msg
queue polling there to simplify the API.
Change-Id: I4d4d3be100be0798bee4096e0bbda96e20d2405e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472833
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Disconnecting qpairs from the admin thread during a reset led to an
inevitable race with the data thread. QP related memory is freed during
the disconnect and cannot be touched from the other threads.
The only way to fix this is to force the qpair disconnect onto the
data thread.
This requires a small change in the way that resets are handled for
pcie. Please see the code in reset.c for that change.
fixes: bb01a089
Change-Id: I8a39e444c7cbbe85fafca42ffd040e929721ce95
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472749
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If we disconnect qpairs without taking the lock, we run the risk of
trying to double free qpair resources before they have been marked as
NULL.
For example, polling on one thread and calling
nvme_rdma_qpair_disconnect from one thread while doing an
nvme_ctrlr_reset on another thread. nvme_ctrlr_reset will call down to
nvme_rdma_qpair_disconnect on the same qpair and without any locking it
can result in trying to destroy the qpair resources multiple times.
Change-Id: I9eef6f2f92961ef8e3f8ece0e4a3d54f3434cff8
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472413
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This can be useful when trying to perform multipath failover at the
application level. However, the controller must be in the failed state
before calling this function.
Change-Id: I5403c0036fed5dd3600ee20592925297494ba8aa
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470699
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These will be useful helper functions for the trid modification code
that gets introduced later.
Change-Id: Ief73e3045710bf35c511794c19b4dfefb93018f1
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471780
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
While it is unlikely that a single qpair will be failed, it is important
to make it possible to reconnect a single qpair.
This function is also handy at the application layer when going through
a reconnect workflow. If we get -ENXIO from a qpair when we poll, we
will turn around and call this function. If we get -ENXIO from this
function, then we know the whole controller is failed and we need to do
a reset.
Change-Id: I6a8ea0ce27fce2f5fc0a5b3db05834acd68e6a39
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471417
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Failed is not a final state for either fabric or pcie controllers. We
have historically not allowed resets in the failed state, but we should.
Instead of checking for the failed state, we should check for the
removed state. If the controller is removed, then we cannot even attempt
a reset.
Change-Id: I2c1a3d85db84f84cd1895cbfaf16575c8b496155
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471415
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We no longer need the private function with a public wrapper.
Change-Id: I0d24dfb282461174729d3eb649c78ac27e42fc8d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471552
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
These will form the base of a little state machine for managing the nvme
qpair structure.
Change-Id: If6f6df38cc17221ac8fcb7d8c0d7e2e808897a99
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470534
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The driver has historically waited until we have to do a listen
before enabling the admin qpair. That is a very PCIe-centric mindset.
For fabric controllers, a lot of the early initialization operations such
as get_cc and set_cc are handled through the admin qpair so it should be
enabled before we begin the initialization process.
As a side effect of this cahnge, the internal API
nvme_ctrlr_enable_admin_qpair has been removed. It would have turned
into a one-liner.
Change-Id: Icd162657d01a85c227a3f20c295d0208e07ce44d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471743
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
This reverts commit bc4e31d6b2.
This change was accidentally merged after it was decided to go with a
different architecture.
Change-Id: Ifc9d8b08bd1fcbc4ace8dd6fb4bd0014330916ed
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471144
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The remove callback is a built in way of alerting the user application
that we have removed a controller. Once we fail a controller, we never
move it back out of that state so it is in essence removed.
Change-Id: Iaad6bef0994e9ddd5a424f6b83502f9191b2de49
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469637
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This paves the way for doing multiple reconnect attempts before failing
the controller.
Change-Id: I1ff4ee6d41a5ffb47dd186d76793d670287c4783
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469934
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
By moving the contents of spdk_nvme_ctrlr_reset to a new internal
function, I am paving the way for providing two reset paths. One, which
can be used by the user as an external API function and which provides
the same legacy behavior. Specifically, that it will always fail the
ctrlr after an attempted reset, and a second, internal path, which will
be used by the qpair reconnect code which will defer failing the qpair
to the qpair code.
Change-Id: I9ec9df55c1fecc2f00476c175bcf988207c31257
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469933
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If we are to have multiple reconnect attempts, we have to control
whetehr the controller is placed in the failed state from outside the
reset function itself. This will allow us to fail the controller only
after all of our retries are exhausted.
Change-Id: Ia82e10325272f25b2b8527336dc3bc507c93b401
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This was being ignored, and can cause some problems when trying to reset
a defunt controller over a fabric.
Change-Id: I32c11a0e2df0e140e20f870fe0fb5b9045a567b3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469638
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In most places, we are passing NVME_TIMEOUT_INFINITE as the
timeout_in_ms argument to nvme_ctrlr_set_state, presumably in an attempt
to specify an infinite timeout. However, nvme_ctrlr_set_state only
checked against 0 when setting the actual timeout, and we didn't have
any logic to check for overflow so we just ended up setting random
timeout_tsc values which changes the behavior of the
nvme_ctrlr_process_init function in several places.
So, change NVME_TIMEOUT_INFINITE to 0, and add some integer overflow
checking to nvme_ctrlr_set_state.
Change-Id: Ic9d0cc57ed153df30c3b20313c3742072a5f992d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469485
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
SPDK_ERRLOG lists the function name, so remove old references that
assume it doesn't and reprint the function name.
Change-Id: I69da6ca0a25bf0eda07d8dad52bcfadf964ac715
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469487
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Weighted Round Robin can be enabled for users, and users
can allocate different priority IO queues for different
purpose. For now we will enable this feature in the
NVMe driver first, following patches will enable this
feature in bdev layer.
Change-Id: I0f799236ca04eb85ef3c9f972ed63ff2718563ba
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466852
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Currently we *always* wait 2 seconds before starting
controller initialization during attach. This
works around an issue where some older Intel NVMe SSDs
could not handle MMIO writes too soon after a PCIe
FLR (which would be triggered when VFIO was enabled).
After further discussion with Intel experts, we know
the SSD models that exhibit this issue. So we can
quirk this so that only the older SSDs incur the extra
delay.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ieb408c24f6afd5bd5147d1c87239aa20f2d13511
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466064
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
spdk_nvme_detach() will do the normal shutdown notification for
most cases, and it will take some time e.g. 2 seconds to finish
the process for PCIe based controllers. If users' environment
has several drives, each drive will call spdk_nvme_detach() one
by one, and the shutdown process may take very long time.
Since users know exactly what they would like to do for the next
step, so here we provide an option to users, users can enable it
to skip the shutdown notification process so that they can have
very quick shutdown process, and when starting next time, the
controller can be enabled again.
Change-Id: Ie7f87115d57776729fab4cdac489cae6dc13511b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463949
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We have defined NVMe controller initialization 'transport_retry_count' option, so
global 'spdk_nvme_retry_count' can be removed, we will remove the variable with
PCIe transport first, and make the retry count can be configured via RPC.
Change-Id: I4d54f78c8da2180d536635587e7291f44a57c4fb
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464472
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously comparing the transport supported value and the target value
was done in RDMA transport layer. However this comparison should be
done in the generic layer like the maximum IO transfer size. Hence
change the comparison to do in the generic layer in this patch.
Besides, for MSDBD, the value 0 indicates no limit but we had handled
this as maximum number of SGS entries was 0 by mistake. This patch fixes
the bug together.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I54365cf114169b10180ec2c659f9c7302672674c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
spdk_dma_*malloc() is about to be deprecated.
Change-Id: I6c308ee546c28c479ceb903bc1749bf5209dc6fe
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448172
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <uma.willpower@gmail.com>
Adds fields to structure spdk_nvme_io_qpair_opts.
These fields allow specifying the locations of memory buffers used
for the submission and/or completion queues.
By default, vaddr is set to NULL meaning SPDK will allocate the memory to be used.
If vaddr is NULL then paddr must be set to 0.
If vaddr is non-NULL, and paddr is zero, SPDK derives the physical
address for the NVMe device, in this case the memory must be registered.
If a paddr value is non-zero, SPDK uses the vaddr and paddr as passed.
SPDK assumes that the memory passed is both virtually and physically
contiguous.
If these fields are used, SPDK will NOT impose any restriction
on the number of elements in the queues.
The buffer sizes are in number of bytes, and are used to confirm
that the buffers are large enough to contain the appropriate queue.
These fields are only used by PCIe attached NVMe devices. They
are presently ignored for other transports.
Signed-off-by: James Bergsten <jamesx.bergsten@intel.com>
Change-Id: Ibfab3939eefe48109335f43a1167082dd4865e7c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454074
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Prior merge contained all of the code EXCEPT for the user-callable function.
Signed-off-by: James Bergsten <jamesx.bergsten@intel.com>
Change-Id: I1cb7105ab85ffae8ed4f600261fed86c9c778893
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456282
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie32a1bb144c239b923b5cbb9e608a7dfc9c05208
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456076
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Currently the nvme driver will always log any
request completed with error status. Some
applications may not want this behavior. So provide
an option to disable it at the controller level.
When this option is enabled, any failed requests
from queues associated with that controller
(including the admin queue) will not log the
failed request.
Of course the application will still receive
the failed status code and can decide to do its
own logging there.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia093fcd23cf321a820fd53183ee7e2dac4f9d378
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will (finally) enable resets for fabrics
controllers.
Move some of the work previously done in enable_admin_queue
up to this new disconnect/connect logic.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6239f0c0f36192db921d33f2322b1874b9382a01
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453939
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We cannot complete error reqs from spdk_nvme_ctrlr_reset -
this could result in completions on threads not expected
by the user for I/O queues.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2e266a2618f1791ef1a1b713d1940357f23f7bff
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The RDMA transport was the only one implementing this
function, and it only does a connect - not a disconnect
followed by a connect.
A later patch will add a matching disconnect function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib68eb0ff2f8e59f437d6d8831bb37dfddf83e9a4
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453929
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
These were accidentally removed in a previous
patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idab274427c064ff8aff1cdca2dd80d7d24e8cce4
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453747
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This transport function is a complete nop now, so
remove it.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5cc6ac75795a3cf5311f24e2ac293fb53d4b9f8c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453487
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The nvme_qpair_disable functions will be going away in
an upcoming patch, so move this one bit of functionality
into a helper function in advance.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I61c2de535c2230b988d56dea13b00f39cb59dcfa
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453483
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We submit AERs to all controllers - both pcie and
fabrics. But currently we only manually abort the
aers when disabling the qpair for pcie. Make this
common instead by creating a new transport function
for aborting aers.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1e926b61b8035488cdc6e8cb4336b373732f985e
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453482
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This better explains what the function is doing,
and makes the name more general so we can use it
for the adminq as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6b55761cb141a9a79cdef876be47995d8813b312
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453480
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Fixes#722. The state was set in nvme_ctrlr_identify_id_desc_async
Signed-off-by: cranechu <cranechu@gmail.com>
Change-Id: I232f0035e8c45d49eca2de7174c91860a299d804
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449527
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
120 seconds is too long for controllers which can't be
setup during initialization, because this value is only
used for Admin commands so also rename as it is.
Change-Id: I0a3d3192252c0f6fc0bef4d8b868eaef2ae40fe3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448601
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The phys_addr param in spdk_*malloc() is about to be
deprecated, so use a separate spdk_vtophys() call to
retrieve physical addresses.
This patch also adds error checks against SPDK_VTOPHYS_ERROR.
The error handling paths are already there to account for
spdk_*malloc() failures themselves, so reuse them in case
of vtophys failures.
Change-Id: I377636e66b8c570d013c1bb2021f04bce4e6c0ce
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/416998
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Avoid ringing the submission queue doorbell until the
call to spdk_nvme_qpair_process_completions().
Change-Id: I7b3cd952e5ec79109eaa1c3a50f6537d7aaea51a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447239
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
New API added for upper level to get controllers'
supported flags.
Change-Id: I51e9d0e57c355fa37f092602a94f4c08deb8898c
Signed-off-by: Chunyang Hui <chunyang.hui@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446091
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The nsdata assignment is strangely aligned with some
variable declarations - fix it to make it more clear.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I43b1a6d5a69ca035a21f3996e8f859a45bd10b9c
Reviewed-on: https://review.gerrithub.io/c/446447
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
VMWare Workstation NVMe emulation does not seem to write the
SHST_COMPLETE bit within 10 seconds, resulting in an ERRLOG
during detach/shutdown. So add a quirk to cover these VMWare
SSDs. But rather than squashing the ERRLOG completely for
these SSDs, just add a message instead indicating this is
somewhat expected on these VMWare emulated SSDs.
Fixes issue #676.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3dfcb631feda639926fd712f1f41abb66cbf2096
Reviewed-on: https://review.gerrithub.io/c/445942
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Althrough SPDK already provides a API to users which
can process runtime timeout NVMe commands, but it's
nice to have another API here, SPDK NVMe driver can
use it to break the endless wait. Also use the API
first in the initialization process, because we don't
want to add another initialization state with Intel
only supported log pages.
Change-Id: Ibe7cadbc59033a299a1fcf02a66e98fc4eca8100
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444353
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The purpose this patch is to fix the following issue:
https://github.com/spdk/spdk/issues/568.
The root cause of issue is in nvme_rdma_fail_qpair
since we want to recycle all outstanding rdma_reqs.
There is an aer req, the callback of which is:
nvme_ctrlr_async_event_cb. In this function, we
will call nvme_ctrlr_construct_and_submit_aer again,
however the nvme controller is already in shutdown state.
(The ctrlr->vcprop.cc.bits.en is set to 0).
Change-Id: I422f0fe5faf472e9a1cb6bbd174e806e6405b95c
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440014
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently for all the Intel drives nvme driver tries
to add Intel VS log pages support. When this log pages
are not supported whole init process fails.
This patch changes this behaviour by allowing to init
Intel drives which rejects VS log pages. This is valid
scenario for drives which are in states other than
healthy. Such a drives are still accesible via admin
queue, but does not expose some of the features, such
as this particular VS log pages.
Change-Id: I3764f2d67fd7153b6b1889273a9fedeb9c4213d3
Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437162
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The nvme/identify cmd issued some cmds to a ctrlr irrespective
of its type, and when the target was a Discovery ctrlr which only
accepts a very limited cmd set, that would result in errors observable
both on the initiator side (from nvme/identify) and in the output on
the target (nvmf_tgt). Introduce new API, spdk_nvme_ctrlr_is_discovery(),
and alter identify to make use of that in determining which commands
to send to the target.
Change-Id: I974a569843f1d2b9e1ece7bd3bf9ceee1bfae872
Signed-off-by: Lance Hartmann <lance.hartmann@oracle.com>
Reviewed-on: https://review.gerrithub.io/436225
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Change-Id: Ia65e235a85207c128ba274e1bab38d6c35344239
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/435563
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When the applications call spdk_nvme_ctrlr_alloc_io_qpair,
there will be cmd to the admin qpairs in nvme_ctrlr_get_cc,
so there is contention. We should use the lock to protect
nvme_ctrl_get_cc. Otherwise, the multiple threads will have
contention on the admin qpair, thus there will be coredump issue.
We get the bug when testing NVMe-oF TCP transport, and this
patch can address this issue.
Change-Id: I7247f98cdf890c2eafaf8fb94580ecd714010bd5
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/435577
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This case isn't particularly supported, but still
caused a memory leak and rendered the pci device
inaccessible for the rest of the primary process
lifetime.
This happens when a controller is removed from the
primary process while a secondary process still
uses it. The controller will likely misbehave without
its primary process managing it, but at least there
won't be a leak.
Change-Id: I67581cffa33ce14ff516b5743d13c9ef7b351625
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/434408
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently there are no timeout mechanism for Admin commands
when initialization, the NVMe driver may enter infinite loop.
While here, add a new parameter to the controller initialization
options, NVMe controller will report an error when timeout
happens during initialization.
Change-Id: Id0c6b6fa15abe5227b486bee95c8e02914b0d358
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/424622
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>