After multipath feature is supported, one bdev will have more than one
nvme ctrlr. Fore ease of view, display each ctrlr's trid info.
Moreover, rename nvme_bdev_ctrlr_get as nvme_bdev_ctrlr_get_by_name here
to keep consistent with nvme_ctrlr_get_by_name.
Signed-off-by: Kai Li <lik271@chinatelecom.cn>
Change-Id: I417506699bbea6ed13dac0fee942749757d2ae47
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10129
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Current code only print the last namespace of nvme bdev, fix the print
way to show all the namespace.
And this patch will be prepared for the next patch to show io path status for multipath, like: which one is the primary or the backup, and the old status and current status,etc.
Signed-off-by: tanlong <948985618@qq.com>
Change-Id: I4fca154df52c929b8d046198934db0e58586c378
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10140
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
If I/O got ANA error, ANA state may be out of date. So in this case
read ANA log page and update ANA states. Mark nvme_ns to be updating
to avoid using while updating ANA state.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia43d38b3a589c84d6d0479dedcced033e76fb194
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9458
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
If an I/O failed by ANA error, the corresponding ANA state might be
out of date. In the following patches, for this case, read the latest
ANA log page and update the ANA state. Such reading ANA log page may be
done on multiple threads concurrently including AER ANA change.
Hence protect ANA log page by adding an new flag ana_log_page_updating
to struct nvme_ctrlr and using it.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I8bb84091d50a5fdc0d9893b585be972dfd31c0f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9526
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add bdev_retry_count to spdk_bdev_nvme_opts and retry_count to
nvme_bdev_io, respectively.
Set type of both to int because we want use -1 for infinite retry.
Set the default value of bdev_retry_count to zero for the backward
compatibility.
bdev_retry_count is configurable by the RPC bdev_nvme_set_options.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9bc746fcea54aa8722c76f79c70c2ae2b375aa53
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9864
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
retry_count of struct spdk_bdev_nvme_opts controls the number of retries
in the transport layer, and is set to transport_retry_count of struct
spdk_nvme_ctrlr_opts.
The next patch will add bdev_retry_count to struct spdk_bdev_nvme_opts
to control the number of retries in the bdev layer.
For clarification, rename retry_count to transport_retry_count of
struct spdk_bdev_nvme_opts. Then deprecate the retry_count parameter
and add and use an new parameter transport_retry_count instead for
the RPC bdev_nvme_set_options.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0689c54aa1c96ee99d24236e8ff1a594ad7208e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9924
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
You can now detach specific paths based on the host parameters. This is
useful for two paths to the same target that use different local NICs.
Change-Id: I4858bfda7d940052ca77ffb0bbe764a688fb315d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9827
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
The previous patch supported I/O retry when no available io_path
was found at submission.
This patch supports I/O retry when we get I/O path error at completion.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I93a1664944b15ab0a826a321e2ea7a2574263afe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9850
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
If ANA state is inaccessible or qpair is disconnected, I/O cannot
be submitted.
But if qpair is connected, ANA state may become accessible, or if
qpair is disconnected, it may become connected via resetting.
Hence even if find_io_path() returned NULL, queue I/O and retry it
one second later if qpair is connected or ctrlr is resetting.
Sort retried I/Os by expiration values in ticks, and activate a timed
poller per nvme_bdev_channel only if there is any retried I/O. So
the poller function bdev_nvme_retry_ios() always returns BUSY because
if the poller runs earlier than the closest retried I/O or runs when
there is no retried I/O, it is more like a bug of the framework.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id28110a0d63ebc1c5772814e2ff8a47934df1644
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9830
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
If NN is very large this saves a lot of memory. This lookup is
not generally used in the I/O path anyway.
Change-Id: I98e190006843ad5d0bac8483bf9feb800d4a665a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9884
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
bdev_nvme_find_io_path() selects an io_path whose qpair is connected
and ANA state is optimized or non-optimized.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I79c978795562b606ee27aa43020684d8bcbf50c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9405
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reset all controllers of a bdev controller sequentially. When resetting
a controller is completed, check if there is next controller, and
start resetting the controller.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I169a84b931c6b03b36bb971d73d5a05caabf8e65
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7274
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously the NVMe bdev module had completed the outstanding reset and
then canceled pending resets. This was complex.
On the other hand, the generic bdev layer cancels pending resets
and then completes the outstanding reset.
Following the generic bdev layer simplifies the code and makes us easier
to control retry reset, delay retry reset by a few seconds, or stop retry
after repeated failures and then delete ctrlr.
Update unit tests accordingly.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9a68422918ebcb052b3a281316ffba9b3450ecd4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9816
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Restore the previous nice idea and unify canceling pending resets
into bdev_nvme_complete_pending_resets().
cb_arg of spdk_for_each_channel() was reserved for a different
purpose but it was gone.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1c01051ae9d8eccc7fc3ec485dd344ff9f55087b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9815
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The all setters of the reset_cb_fn use boolean as the result eventually.
Using boolean as the result earlier makes the code simpler and the
following patches easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9ec4f670ed7a000a824ab7e261bab07a3dc21f43
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9814
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
bdev_nvme_admin_passthru() chooses the first ctrlr which is not failed.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If41a1d1e1bde4bddfa92e5a385509daa3f0ce4de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9525
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
bdev_nvme_admin_passthrough(), bdev_nvme_reset_io(), and
bdev_nvme_abort() do not use io_path. So simply fall through even
if the optimal io_path is not found for these, and clear the
cached io_path for these.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib26fcbf99c95bbfb6e825c1b7c6455241c198d92
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9675
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Even if the NVMe bdev module supports I/O retry, it will not retry reset or abort.
For clarification, bdev_nvme_reset_io() and bdev_nvme_abort() call
bdev_nvme_io_complete() themselves for error cases.
For bdev_nvme_abort(), we do not need to differentiate error processing
among return values. Simply complete the request with failure.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id5d51cbba47c6360a6177efd7d5f2e978c48ee9b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9674
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Even when I/O retry is supported, reset will not be retried. However,
bdev_nvme_io_complete() will process I/O retry. Hence inline
bdev_nvme_io_complete() into bdev_nvme_reset_io_complete() to exclude
reset from I/O retry. The result of reset is success or failure, so omit
the -ENOMEM case.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I667e74cbbac4a13cefb6896f898476ba48bcd0fa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9687
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Save the io_path to which the current I/O is submitted into struct
nvme_bdev_io to know which io_path caused I/O error, or to reuse it
to avoid repeated traversal.
Besides, add a simple helper function nvme_io_path_is_available() to
check if the io_path can be reused.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idadd3a3188d9b333b335ea79e20a244aa0ce0bdd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9296
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We have io_path structure now and returning io_path rather than
ns and qpair match the function name. The following patches will
cache the returned io_path into nvme_bdev_io.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5d773da18591fc324667f6b5c489a38f497bf3d8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9295
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch removes the critical limitation that ctrlrs which are
aggregated need to have no namespace. After this patch, we can
add multiple namespaces into a single nvme_bdev.
The conditions that such namespaces satisfy are,
- they are in the same NVM subsystem,
- they are in different ctrlrs,
- they are identical.
Additionally, if we add one or more namespaces to an existing
nvme_bdev and there are active nvme_bdev_channels, the corresponding
I/O paths are added to these nvme_bdev_channels.
Even after this patch, ANA state is not utilized in I/O paths yet.
It will be done in the following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I15db35451e640d4beb99b138a4762243bee0d0f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8131
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It can be much simpler when inlined.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ifad89d7b9557a45eee601fc2004066a0fd9289c0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9826
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Specifying only a transport id is not enough. We need to be able to
describe the host parameters too.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Iadbea553aee4b38e7cacab0b486e7e5746d0d1ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9825
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is the currently active path identifier in a failover scenario. The
path is defined by more than just the transport identifier, so fix the
name.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I682c6f4c54f75307e2615bf80e70358180d99fe2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9576
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This defines a unique path between a host and a target.
Change-Id: Ia3d24c1b34199a8b596aaf17900ca9694a9da77d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9505
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It's not providing a lot of value, while being pretty problematic, as
the read is blocking and cannot be easily changed to be non-blocking, as
dump_info_json is a synchronous interface.
Now that dump_info_json isn't using any synchronous interfaces from the
NVMe driver, we can send a bdev_get_bdevs call in the async_init.sh
test to verify that.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ida31c8d1000a52b0782f698afe46b071ed4e41df
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9488
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch replaces the synchronous `spdk_nvme_detach()` calls with its
asynchronous counterparts in the controller unregister path.
An additional poller is introduced to periodically poll the NVMe driver
for detach completion. Once the detach is completed, the poller is
unregistered and the nvme_ctrlr is destroyed. The poller uses the same
period (1ms) as the async probe poller.
Since reset and detach cannot happen at the same time, reset_poller was
renamed to reset_detach_poller and it can now store the pointer either
to the reset or detach poller, depending on the circumstances.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5eb2dd6383d98d25d1f9748af08c1a13d18acb0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8729
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is done in preparation for using the non-blocking versions of the
spdk_nvme_detach API, which will delay controller's delation until the
detach is completed asynchronously.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia785408c9a94427e60bf239e6036a5e89d589f61
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8727
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
It can match by any provided parameter to remove paths.
Change-Id: I5e7a87342bbb90943dc97fb52f142814fcf0acfa
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9453
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Instead of storing an spdk_nvme_transport_id, store the object that
contains it. This will make a few later patches easier.
Change-Id: I36b74889fe39af3b7ab2b900fb3ea4b3f39e1f83
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9484
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Allow to return more than one memory domain.
This change aligns bdev and nvme API and provides
more flexibility for custom transports.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ica9b12ad8463c361be6cb62ee2c0513eec0b486d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9546
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch enables us to aggrete multiple ctrlrs in the same NVM
subsystem into a single bdev ctrlr to create multipath.
This patch has a critical limitation that ctrlrs which are aggregated
need to have no namespace. Hence any nvme bdev is not created.
However it will be removed in the next patch.
The design is as follows.
A nvme_bdev_ctrlr is created to aggregate multiple nvme_ctrlrs in
the same NVM subsystem. The name of the nvme_ctrlr is changed to be
the name of the nvme_bdev_ctrlr.
NVMe bdev module has both the failover feature and the multipath
feature now. To choose which of failover or multipath to use, add an new
parameter multipath to the RPC bdev_nvme_attach_controller.
When we attach a new trid to the existing nvme_bdev_ctrlr, we use the failover
feature if multipath is false, we use the multipath feature if multipath is
false.
nvme_bdev_ctrlr has a list for nvme_ctrlr and it is guarded by the
global mutex. Callers can query nvme_ctrlrs from a nvme_bdev_ctrlr via
trid as a key. nvme_bdev_ctrlr is not registered as io_device.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I20571bf89a65d53a00fb77236ad1b193e88b8153
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8119
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Previously, if an I/O qpair is disconnected, we tried reconnecting
the qpair. However, this reconnect operation was very likely to fail
and will not match the upcoming asynchronous connect/reconnect
operation. We need an extra callback to make this reconnect operation
asynchronous, but we do not want to have it.
Hence if an I/O qpair is disconnected, we free the I/O qpair and then
reset the corresponding nvme_ctrlr immediately. If the admin qpair is
also disconnected, the nvme_ctrlr is reset immediately. However this
event may never happen. So we do not wait for the error of the admin
qpair.
The NVMf host may disconnect connections by itself intentionally.
In this case, resetting the nvme_ctrlr will surely fail. But resetting
the nvme_ctrlr frees all I/O qpairs of the nvme_ctrlr and these I/O
qpairs are not created again until resetting the nvme_ctrlr succeeds.
Resetting the nvme_ctrlr once at most is more efficient than repeating
reconnecting the I/O qpair. So this change is valuable even for such
intentional disconnection. However, it is helpful to know the event that
I/O qpair is disconnected. Hence change DEBUGLOG to NOTICELOG in the
disconnected callback. The disconnected callback is not repeated, and
we do not need to worry about NOTICELOG flooding.
Refine the unit test case to verify this change.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I376b749c2f55d010692bf916370e8bb4249b795f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9515
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Poll group has a list of the associated ctrlr channels to update
them when the corresponding I/O qpair is disconnected to do path
failover.
Another idea is that poll group has a list of the associated bdev
channels. However, two or more bdev channels may share a single
ctrlr channel and I/O qpair is per ctrlr channel. What we want to do
by this addition is to stop I/O submission to any failed I/O qpair
and choose alternative I/O qpair. Hence we take the first idea.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia287bd1b803313e66b8505a19694a40133b675f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8124
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Previously nvme_ctrlr_channel had a pointer to a nvme_ctrlr and used it
throughout. However the use cases are not performance critical and
are better to convert from nvme_ctrlr_channel to nvme_ctrlr by
spdk_io_channel_get_ctx() and spdk_io_channel_get_io_device().
Provide a convenient macro nvme_ctrlr_channel_get_ctrlr() to do it.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie22e8c318af88e09b95824c67b3e874d85425f1a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9195
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
bdev_nvme_find_admin_path() is used only in a place and it's role
is to find a non-failed ctrlr even after multipath is supported.
Inline it into bdev_nvme_admin_passthru() will be better.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If58507c49b43d047e1f3ef25bbdfb571c36a1956
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9194
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Previously bdev_nvme_reset() returned -EBUSY if ctrlr is being
destructed and returned -EAGAIN if ctrlr is being reset.
These did not match what spdk_nvme_ctrlr_reset() returned.
Reset operation will be more important than current when multipath
is supported and reset operation is made asynchronous.
Hence change bdev_nvme_reset() to follow spdk_nvme_ctrlr_reset().
bdev_nvme_reset() returns -ENXIO if ctrlr is being destructed and
returns -EBUSY if ctrlr is being reset.
Additionally change the return value of bdev_nvme_failover()
accordingly. After the change bdev_nvme_failover() returns -ENXIO
if being destructed and returns -EBUSY if ctrlr is being reset.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie2c6f8601050b1043d83de9cf01490751784e4e5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8859
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Following the last patch, include hostid into ctrlr_opts rather than
passing it as a parameter for bdev_nvme_create().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0d04db1c5767ec76a9a7cd255c3a8d56b0b8f583
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9344
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nvme_ctrlr_populate_namespace_done() needs nvme_ns, probe_ctx, and completion
status.
When multipath is supported, nvme_ctrlr_populate_namespace_done() will
be called from the spdk_for_each_channel() callback. The spdk_for_each_channel()
callback can have one context and one completion status.
Hence nvme_ns or probe_ctx need to have another.
If an nvme_ctrlr has multiple namespaces, a probe_ctx is shared by
multiple nvme_ns.
So let nvme_ns has probe_ctx.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iedaaab80616d34d01935f4ebc31e1f3b84e09e32
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9047
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For multipath, when an new nvme_ns is added to an existing nvme_bdev,
we need to add the corresponding qpair to all existing nvme_bdev_channels
dynamically. The RPC bdev_nvme_attach_controller has to return after that.
Hence populating namespace needs to be asynchronous.
On the other hand, when an nvme_ns is removed from an existing nvme_bdev,
we need to remove the corresponding qpair from all existing nvme_bdev_channels
dynamically. Hence depopulating namespace needs to be asynchronous.
By the recent refactoring, two callbacks, nvme_ctrlr_populate_namespace_done()
and nvme_ctrlr_depopulate_namespace_done() were removed accidentally.
We need to restore these.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I37ea2420588cef3a18648dec053f8bd2b884e86b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9392
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
A few functions can be made private in bdev_nvme.c. Hence delete
these function declarations from bdev_nvme.h.
Besides, bdev_nvme_remove_trid() is only a function declaration.
Hence delete it together in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7bf9ae6da332c6426f6eb40926e7c4130d9acd8a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9444
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This only existed to share code between OCSSD and regular NVM
namespaces. Now OCSSD is gone, so just merge the files into bdev_nvme.
Change-Id: Idb73cc05d67144de5dd20af8db24c8f6974d10a7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9337
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_ctrlr_populate_namespaces
Avoid relying on this number. Different targets have interpreted its
meaning in different ways and it cannot be used anymore in practice. It
may also be very, very large.
Change-Id: I94e8eae49d6ccdbd8be302b30a120d89242b6d39
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9316
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Try to use these accessors instead of directly using the namespaces
array. This will make changing the data structure easier later on.
Change-Id: I3367d0e0065894f3aa199ed1698d27976b4cbbb5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9315
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If the number of namespaces is very large, this can cause excessive
memory allocation. This is especially true because when the number of
namespaces is large, it is almost always very sparsely populated.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I27d94956c222ae3c49c6a7422164ae3a8ec8d963
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9302
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It is more flexible now as it is possible to get nvme ns handle to do
additional management or queries, however if nvme ctrlr handle is needed
there is already public nvme API for that with nvme ns as an input.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I5493168ad31cc95687962288d57fb5457f2d7dd6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9357
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>