This keeps track if an nvme_ctrlr was created
implicitly by the discovery service.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I493b7cacfe563737f45a1fffca98855a1929a751
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11817
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
These parameters will be used for any controller created
by the discovery service.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I221b791f38b9c5797ba084c647a98b82c102a121
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11942
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
We used to wait until the discovery service could
connect to the discovery subsystem before calling
the callback function provided by the caller (mainly
the start_discovery RPC).
Moving forward, we will be handling the case where
the discovery subsystem is unavailable temporarily.
For now, let's not fail the bdev_nvme_start_discovery
call if we cannot connect to the discovery subsystem.
This will keep the initial service start path the same
as the path where the discovery subsystem is temporarily
unavailable. In the future, we can consider adding
functionality to the start_discovery RPC that waits
up to X number of seconds to see if we were able to
connect and fail otherwise.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icb05523b9d59f508bfbc0233595c8bf58c10488f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11768
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is another preparation to disconnect qpair asynchronously.
Add nvme_qpair object and move the qpair and poll_group pointers and
the io_path_list list from nvme_ctrlr_channel to nvme_qpair. nvme_qpair
is allocated dynamically when creating nvme_ctrlr_channel, and
nvme_ctrlr_channel points to nvme_qpair.
We want to keep the times of references at I/O path. Change nvme_io_path
to point nvme_qpair instead of nvme_ctrlr_channel, and add
nvme_ctrlr_channel pointer to nvme_qpair.
nvme_ctrlr_channel may be freed earlier than nvme_qpair. nvme_poll_group
lists nvme_qpair instead of nvme_ctrlr_channel and nvme_qpair has a
pointer to nvme_ctrlr.
By using the nvme_ctrlr pointer of the nvme_qpair, a helper function
nvme_ctrlr_channel_get_ctrlr() is not necessary any more. Remove it.
Change-Id: Ib3f579d3441f31b9db7d3844ec56c49e2bb53a5d
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11832
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add three options for I/O error resiliency to spdk_nvme_bdev_opts.
Then the RPC bdev_nvme_set_options can configure these.
These can be overridden if these are given by the RPC bdev_nvme_attach_controller.
Change-Id: If3ee23aeef8b7585fe0fb5ec4695df5866fc1e74
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11830
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently shadow doorbell updates are not counted; add statistics for
those, and rename the other statistic for clarity.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I211a77902e38265c99b15862034c6d022dc582a0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11844
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The following patches will enable us to specify I/O error resiliency
options per nvme_ctrlr as global options. To do it easier, move
per controller options about I/O error resiliency into struct nvme_ctrlr_opts.
prchk_flags is not exactly for resiliency but move it into struct
nvme_ctrlr_opts too.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I85fd1738bb6e293cd804b086ade82274485f213d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11829
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The following patches will add options per struct nvme_ctrlr in the
NVMe bdev module. bdev_opts will be used for it.
Additionally, fabrics_connect_timeout_us is set directly to
spdk_nvme_ctrlr_opts. So remove it from the RPC request structure.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I981cda5e69375edc43a8581cd3b43497c38a3d56
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11827
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
apply_firmware_complete bug fix: after each
firmware image download command finished,
apply_firmware_complete is called and issue the next
firmware image download command, and get another bdev_io.
After last command, apply_firmware_complete_reset
only release the last bdev_io, and all the ios in previous
commands are not release.
So after rpc_bdev_nvme_apply_firmware cycling,
the io pool will be used up and cause assert.
Signed-off-by: Gu, Zhimin <kookoo.gu@intel.com>
Change-Id: Icb1c722d85b1985521e5f25031ae70557b7ba84a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11586
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Provides RPCs for the qpair error injection APIs to bdev_nvme.
These RPCs are useful in testing NVMeoF/NVMe behavior for various
error scenarios in production.
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Change-Id: I0db7995d7a712d4f8a60e643d564faa6908c3a55
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10992
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It may take a long time to detect network transport error
when e.g. port is removed on remote target. This timeout
depends on 2 parameters - retry_count and ack_timeout.
bdev_nvme_set_options supports configuration of retry_count
but transport_ack_timeout is missed. Note: this parameter
is used by RDMA transport only.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I7c3090dc8e4078f64d444e2392a9e0a6ecdc31c0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11175
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: <tanl12@chinatelecom.cn>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The state of a nvme_ctrlr can be more fine grained than a boolean
and such state gives more information to end users for debug or
root cause analysis.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3e2459f449e2dac73f04b155e38b696495f1a335
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10183
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
If ctrlr_loss_timeout_sec is set to -1, reconnect is tried repeatedly
indefinitely, and I/Os continue to be queued.
This patch adds another option fast_io_fail_timeout_sec, a flag
fast_io_fail_timedout to nvme_ctrlr.
If the time fast_io_fail_timeout_sec passed after starting reset,
set fast_io_fail_timedout to true not to use the path for I/O submission.
fast_io_fail_timeout_sec is initialized to zero as same as
ctrlr_loss_timeout_sec and reconnect_delay_sec.
The name of the parameter follows the famous DM-multipath, its fast_io_fail_tmo.
Change-Id: Ib870cf8e2fd29300c47f1df69617776f4e67bd8c
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10301
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously reconnect retry was not controlled and was repeated indefinitely.
This patch adds two options, ctrlr_loss_timeout_sec and reconnect_delay_sec,
to nvme_ctrlr and add reset_start_tsc, reconnect_is_delayed, and
reconnect_delay_timer to nvme_ctrlr to control reconnect retry.
Both of ctrlr_loss_timeout_sec and reconnect_delay_sec are initialized to
zero. This means reconnect is not throttled as we did before this patch.
A few more changes are added.
Change nvme_io_path_is_failed() to return false if reset is throttled
even if nvme_ctrlr is reseting or is to be reconnected.
spdk_nvme_ctrlr_reconnect_poll_async() may continue returning -EAGAIN
infinitely. To check out such exceptional case, use ctrlr_loss_timeout_sec.
Not only ctrlr reset but also non-multipath ctrlr failover is controlled.
So we need to include path failover into ctrlr reconnect.
When the active path is removed and switched to one of the alternative paths,
if ctrlr reconnect is scheduled, connecting to the alternative path is left
to the scheduled reconnect.
If reset or reconnect ctrlr is failed and the retry is scheduled,
switch the active path to one of alternative paths.
Restore unit test cases removed in the previous patches.
Change-Id: Idec636c4eced39eb47ff4ef6fde72d6fd9fe4f85
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10128
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
This RPC will stop the specified discovery service,
including detaching from any controllers that were
attached as part of that discovery service.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9222876457fc45e1acde680a7bd1925917c22308
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10832
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This patch adds the framework for a discovery
service in the bdev/nvme module.
Users can specify an IP/port of a discovery service.
The bdev/nvme module will connect to a discovery
controller, get the discovery log page, and then
register for AERs. It will connect to each
subsystem specified in the initial log page.
AER completions will trigger fetching the log
page again, at which point new subsystems will
be connected to, or removed subsystems will be
detached.
This patch does the following:
* Adds the new start_discovery RPC
* Connects to the discovery controller
* Gets the discovery log page
* Registers for AERs
* Detach from discovery controllers at shutdown
Subsequent patches in this series will:
* Connect to subsystems listed in discovery log page
* Detach from subsystems that were listed in earlier
discovery log pages but subsequently removed
* Add a stop_discovery RPC
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I54bfa896a48c5619676f156b5ea9f2d1f886c72f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10694
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
In the following patches, bdev_nvme_reset() will execute the reset ctrlr
operation on the nvme_ctrlr->thread until completion as bdev_nvme_admin_passthru()
does. Hence change the callback rpc_bdev_nvme_reset_controller_cb
to redirect to the orig_thread by using a dynamically allocated context.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I8ee61857ac034024d00190875740a675ef1db8b0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10073
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
After multipath feature is supported, one bdev will have more than one
nvme ctrlr. Fore ease of view, display each ctrlr's trid info.
Moreover, rename nvme_bdev_ctrlr_get as nvme_bdev_ctrlr_get_by_name here
to keep consistent with nvme_ctrlr_get_by_name.
Signed-off-by: Kai Li <lik271@chinatelecom.cn>
Change-Id: I417506699bbea6ed13dac0fee942749757d2ae47
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10129
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add bdev_retry_count to spdk_bdev_nvme_opts and retry_count to
nvme_bdev_io, respectively.
Set type of both to int because we want use -1 for infinite retry.
Set the default value of bdev_retry_count to zero for the backward
compatibility.
bdev_retry_count is configurable by the RPC bdev_nvme_set_options.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9bc746fcea54aa8722c76f79c70c2ae2b375aa53
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9864
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
retry_count of struct spdk_bdev_nvme_opts controls the number of retries
in the transport layer, and is set to transport_retry_count of struct
spdk_nvme_ctrlr_opts.
The next patch will add bdev_retry_count to struct spdk_bdev_nvme_opts
to control the number of retries in the bdev layer.
For clarification, rename retry_count to transport_retry_count of
struct spdk_bdev_nvme_opts. Then deprecate the retry_count parameter
and add and use an new parameter transport_retry_count instead for
the RPC bdev_nvme_set_options.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0689c54aa1c96ee99d24236e8ff1a594ad7208e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9924
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
You can now detach specific paths based on the host parameters. This is
useful for two paths to the same target that use different local NICs.
Change-Id: I4858bfda7d940052ca77ffb0bbe764a688fb315d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9827
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
The all setters of the reset_cb_fn use boolean as the result eventually.
Using boolean as the result earlier makes the code simpler and the
following patches easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9ec4f670ed7a000a824ab7e261bab07a3dc21f43
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9814
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Pointer 'ctx->req.multipath' returned from call to
function 'strdup' may be NULL. Reported by Klocwork.
Change-Id: Id4a188ec5340f02c9bd0643db0acb03409dd5829
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9843
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Specifying only a transport id is not enough. We need to be able to
describe the host parameters too.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Iadbea553aee4b38e7cacab0b486e7e5746d0d1ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9825
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is the currently active path identifier in a failover scenario. The
path is defined by more than just the transport identifier, so fix the
name.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I682c6f4c54f75307e2615bf80e70358180d99fe2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9576
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The new path must differ in some way.
Change-Id: I98fd2b2cb3220b482efe0a19bfce94ec4e72bec2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9418
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This parameter may have the values "disable" or "failover". The default
is failover to match existing behavior. In the future we expect to
change the default behavior to disable.
Further, we expect to add an "enable" option soon to do full multipath.
Change-Id: Iebbdc9b95f23101f18d64e085933463498e627be
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9343
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It can match by any provided parameter to remove paths.
Change-Id: I5e7a87342bbb90943dc97fb52f142814fcf0acfa
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9453
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Instead of storing an spdk_nvme_transport_id, store the object that
contains it. This will make a few later patches easier.
Change-Id: I36b74889fe39af3b7ab2b900fb3ea4b3f39e1f83
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9484
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Enable dump of transport stats in functional test.
Update unit tests to support the new statistics
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I815aeea7d07bd33a915f19537d60611ba7101361
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8885
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch enables us to aggrete multiple ctrlrs in the same NVM
subsystem into a single bdev ctrlr to create multipath.
This patch has a critical limitation that ctrlrs which are aggregated
need to have no namespace. Hence any nvme bdev is not created.
However it will be removed in the next patch.
The design is as follows.
A nvme_bdev_ctrlr is created to aggregate multiple nvme_ctrlrs in
the same NVM subsystem. The name of the nvme_ctrlr is changed to be
the name of the nvme_bdev_ctrlr.
NVMe bdev module has both the failover feature and the multipath
feature now. To choose which of failover or multipath to use, add an new
parameter multipath to the RPC bdev_nvme_attach_controller.
When we attach a new trid to the existing nvme_bdev_ctrlr, we use the failover
feature if multipath is false, we use the multipath feature if multipath is
false.
nvme_bdev_ctrlr has a list for nvme_ctrlr and it is guarded by the
global mutex. Callers can query nvme_ctrlrs from a nvme_bdev_ctrlr via
trid as a key. nvme_bdev_ctrlr is not registered as io_device.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I20571bf89a65d53a00fb77236ad1b193e88b8153
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8119
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Following the last patch, include hostid into ctrlr_opts rather than
passing it as a parameter for bdev_nvme_create().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0d04db1c5767ec76a9a7cd255c3a8d56b0b8f583
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9344
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This only existed to share code between OCSSD and regular NVM
namespaces. Now OCSSD is gone, so just merge the files into bdev_nvme.
Change-Id: Idb73cc05d67144de5dd20af8db24c8f6974d10a7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9337
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Do simple validity checks first, then check for duplicate controllers.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ic21d64574b37a3d9148e5cd6d5a7599449ea8fe1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9341
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
bdev_nvme_create() is called only by a single caller and hostnqn is
just copied to ctrlr_opts even if it is passed separately.
Hence include hostnqn into ctrlr_opts rather than passing it as a
parameter for bdev_nvme_create().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I75b640bcecefa94950b0c19936fab0571c428125
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9332
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Implement bdev_nvme_reset_controller rpc, which allows the NVMe
controller to be reset over RPC. Implement bdev_nvme_reset_rpc()
which starts the reset of the controller and returns the result of
the controller reset via the callback function after it completes.
Signed-off-by: Jonathan Teh <jonathan.teh@mayadata.io>
Change-Id: Id98d5e56feb315b7e44e9bb5e5f495e9b1cd1de0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8456
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Replace two helper functions, nvme_bdev_first_ctrlr() and
nvme_bdev_next_ctrlr() by an new helper function nvme_ctrlr_for_each().
This will make us easier to guard data structure correctly in the
following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibd81286e454fd6127fd150a7d48d8381bd1b89d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8721
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This object aggregates multiple I/O qpairs for their completion
operations and may be a higher layer object. However, the
aggregation is only to poll completions efficiently. Hence if we
follow the new naming rule, nvme_poll_group is better than
nvme_ctrlr_poll_group and nvme_bdev_poll_group, and rename
nvme_bdev_poll_group by nvme_poll_group.
Besides, many functions in NVMe bdev module have a naming rule,
bdev_nvme + verb + objective
Follow this rule for a few functions related with nvme_poll_group.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5e5cb6001d4a862c2121b7265cbbffe0c2109785
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8720
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>