Spdk/module/bdev/nvme/bdev_nvme.h

/* SPDX-License-Identifier: BSD-3-Clause
* Copyright (C) 2016 Intel Corporation. All rights reserved.
* Copyright (c) 2019 Mellanox Technologies LTD. All rights reserved.
* Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* Copyright (c) 2022 Dell Inc, or its subsidiaries. All rights reserved.
*/
#ifndef SPDK_BDEV_NVME_H
#define SPDK_BDEV_NVME_H
#include "spdk/stdinc.h"
#include "spdk/queue.h"
#include "spdk/nvme.h"
#include "spdk/bdev_module.h"
#include "spdk/jsonrpc.h"
TAILQ_HEAD(nvme_bdev_ctrlrs, nvme_bdev_ctrlr);
extern struct nvme_bdev_ctrlrs g_nvme_bdev_ctrlrs;
extern pthread_mutex_t g_bdev_nvme_mutex;
extern bool g_bdev_nvme_module_finish;
extern struct spdk_thread *g_bdev_nvme_init_thread;
#define NVME_MAX_CONTROLLERS 1024
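
/*
 * Multipath policy for an nvme_bdev reachable through multiple controllers:
 * active_passive sends I/O down a single path until it fails, active_active
 * spreads I/O over all usable paths using the selector below (round-robin,
 * switching after rr_min_io I/Os per path, or the path with the smallest
 * queue depth).
 */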
enum bdev_nvme_multipath_policy {
	BDEV_NVME_MP_POLICY_ACTIVE_PASSIVE,
	BDEV_NVME_MP_POLICY_ACTIVE_ACTIVE,
};

enum bdev_nvme_multipath_selector {
	BDEV_NVME_MP_SELECTOR_ROUND_ROBIN = 1,
	BDEV_NVME_MP_SELECTOR_QUEUE_DEPTH,
};

typedef void (*spdk_bdev_create_nvme_fn)(void *ctx, size_t bdev_count, int rc);
typedef void (*spdk_bdev_nvme_start_discovery_fn)(void *ctx, int status);
typedef void (*spdk_bdev_nvme_stop_discovery_fn)(void *ctx);
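
/*
 * Per-controller options. reconnect_delay_sec delays each reconnect retry
 * after a failed reset, and ctrlr_loss_timeout_sec bounds how long retries
 * continue before the controller is given up; both default to zero, which
 * keeps the original behavior of retrying immediately and indefinitely.
 * fast_io_fail_timeout_sec bounds how long queued I/O waits for a reconnect
 * before it is failed back to the caller.
 */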
struct nvme_ctrlr_opts {
	uint32_t prchk_flags;
	int32_t ctrlr_loss_timeout_sec;
	uint32_t reconnect_delay_sec;
	uint32_t fast_io_fail_timeout_sec;
	bool from_discovery_service;
};

struct nvme_async_probe_ctx {
	struct spdk_nvme_probe_ctx *probe_ctx;
	const char *base_name;
	const char **names;
	uint32_t count;
	struct spdk_poller *poller;
	struct spdk_nvme_transport_id trid;
	struct nvme_ctrlr_opts bdev_opts;
	struct spdk_nvme_ctrlr_opts drv_opts;
	spdk_bdev_create_nvme_fn cb_fn;
	void *cb_ctx;
	uint32_t populates_in_progress;
	bool ctrlr_attached;
	bool probe_done;
	bool namespaces_populated;
};

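/*
 * A namespace as seen through one controller. ana_group_id and ana_state
 * track the ANA information reported by that controller; anatt_timer and
 * ana_transition_timedout are used to give up on an ANA state transition
 * that does not complete within the controller's ANA transition time.
 */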
struct nvme_ns {
	uint32_t id;
	struct spdk_nvme_ns *ns;
	struct nvme_ctrlr *ctrlr;
	struct nvme_bdev *bdev;
	uint32_t ana_group_id;
	enum spdk_nvme_ana_state ana_state;
	bool ana_state_updating;
	bool ana_transition_timedout;
	struct spdk_poller *anatt_timer;
	struct nvme_async_probe_ctx *probe_ctx;
	TAILQ_ENTRY(nvme_ns) tailq;
	RB_ENTRY(nvme_ns) node;
};

struct nvme_bdev_io;
struct nvme_bdev_ctrlr;
struct nvme_bdev;
struct nvme_io_path;
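
/*
 * One path (transport ID plus host ID) to a controller. A non-multipath
 * nvme_ctrlr keeps a list of these for failover: when the active path fails
 * or is removed, the controller reconnects through one of the alternatives.
 */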
struct nvme_path_id {
	struct spdk_nvme_transport_id trid;
	struct spdk_nvme_host_id hostid;
	TAILQ_ENTRY(nvme_path_id) link;
	bool is_failed;
};

typedef void (*bdev_nvme_reset_cb)(void *cb_arg, bool success);
typedef void (*nvme_ctrlr_disconnected_cb)(struct nvme_ctrlr *nvme_ctrlr);
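
/*
 * One connected controller. Controllers that belong to the same NVM
 * subsystem are grouped under a common nvme_bdev_ctrlr (below) so that their
 * shared namespaces can be exposed as multipath bdevs. reset_start_tsc,
 * reconnect_delay_timer and reconnect_is_delayed implement the delayed
 * reconnect retries bounded by the options above.
 */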
struct nvme_ctrlr {
	/**
	 * points to pinned, physically contiguous memory region;
	 * contains 4KB IDENTIFY structure for controller which is
	 * target for CONTROLLER IDENTIFY command during initialization
	 */
	struct spdk_nvme_ctrlr *ctrlr;
	struct nvme_path_id *active_path_id;
	int ref;
	uint32_t resetting : 1;
	uint32_t reconnect_is_delayed : 1;
	uint32_t fast_io_fail_timedout : 1;
	uint32_t destruct : 1;
	uint32_t ana_log_page_updating : 1;
	uint32_t io_path_cache_clearing : 1;
	struct nvme_ctrlr_opts opts;
	RB_HEAD(nvme_ns_tree, nvme_ns) namespaces;
	struct spdk_opal_dev *opal_dev;
	struct spdk_poller *adminq_timer_poller;
	struct spdk_thread *thread;
	bdev_nvme_reset_cb reset_cb_fn;
	void *reset_cb_arg;
	/* Poller used to check for reset/detach completion */
	struct spdk_poller *reset_detach_poller;
	struct spdk_nvme_detach_ctx *detach_ctx;
	uint64_t reset_start_tsc;
	struct spdk_poller *reconnect_delay_timer;
	nvme_ctrlr_disconnected_cb disconnected_cb;
	/** linked list pointer for device list */
	TAILQ_ENTRY(nvme_ctrlr) tailq;
	struct nvme_bdev_ctrlr *nbdev_ctrlr;
	TAILQ_HEAD(nvme_paths, nvme_path_id) trids;
	uint32_t max_ana_log_page_size;
	struct spdk_nvme_ana_page *ana_log_page;
	struct spdk_nvme_ana_group_descriptor *copied_ana_desc;
	struct nvme_async_probe_ctx *probe_ctx;
	pthread_mutex_t mutex;
};

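/*
 * Aggregates every nvme_ctrlr that belongs to the same NVM subsystem,
 * together with the nvme_bdevs built on their shared namespaces. The lists
 * are guarded by g_bdev_nvme_mutex, and the nvme_bdev_ctrlr itself is not
 * registered as an io_device.
 */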
struct nvme_bdev_ctrlr {
	char *name;
	TAILQ_HEAD(, nvme_ctrlr) ctrlrs;
	TAILQ_HEAD(, nvme_bdev) bdevs;
	TAILQ_ENTRY(nvme_bdev_ctrlr) tailq;
};

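/*
 * NVMe completion error counters per status code type and per status code.
 * The arrays cost roughly 4 KiB per bdev, so err_stat below is only
 * allocated when the nvme_error_stat option is enabled.
 */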
struct nvme_error_stat {
	uint32_t status_type[8];
	uint32_t status[4][256];
};

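/*
 * A bdev exposed to the upper layers. With multipath, one nvme_bdev covers
 * the namespaces sharing the same NSID across all controllers of the
 * subsystem (nvme_ns_list); mp_policy, mp_selector and rr_min_io control how
 * I/O is distributed over the resulting paths.
 */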
struct nvme_bdev {
	struct spdk_bdev disk;
	uint32_t nsid;
	struct nvme_bdev_ctrlr *nbdev_ctrlr;
	pthread_mutex_t mutex;
	int ref;
	enum bdev_nvme_multipath_policy mp_policy;
	enum bdev_nvme_multipath_selector mp_selector;
	uint32_t rr_min_io;
	TAILQ_HEAD(, nvme_ns) nvme_ns_list;
	bool opal;
	TAILQ_ENTRY(nvme_bdev) tailq;
	struct nvme_error_stat *err_stat;
};

struct nvme_qpair {
	struct nvme_ctrlr *ctrlr;
	struct spdk_nvme_qpair *qpair;
	struct nvme_poll_group *group;
	struct nvme_ctrlr_channel *ctrlr_ch;
	/* The following is used to update io_path cache of nvme_bdev_channels. */
	TAILQ_HEAD(, nvme_io_path) io_path_list;
	TAILQ_ENTRY(nvme_qpair) tailq;
};

struct nvme_ctrlr_channel {
	struct nvme_qpair *qpair;
	TAILQ_HEAD(, spdk_bdev_io) pending_resets;
	struct spdk_io_channel_iter *reset_iter;
};

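/*
 * One I/O path inside an nvme_bdev_channel: a namespace reached through a
 * specific qpair. The optional per-path stat allows an active_active
 * configuration to observe how I/O is spread across its paths.
 */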
struct nvme_io_path {
	struct nvme_ns *nvme_ns;
	struct nvme_qpair *qpair;
	STAILQ_ENTRY(nvme_io_path) stailq;
	/* The following are used to update io_path cache of the nvme_bdev_channel. */
	struct nvme_bdev_channel *nbdev_ch;
	TAILQ_ENTRY(nvme_io_path) tailq;
	/* allocation of stat is decided by option io_path_stat of RPC bdev_nvme_set_options */
	struct spdk_bdev_io_stat *stat;
};

struct nvme_bdev_channel {
	struct nvme_io_path *current_io_path;
	enum bdev_nvme_multipath_policy mp_policy;
	enum bdev_nvme_multipath_selector mp_selector;
	uint32_t rr_min_io;
	uint32_t rr_counter;
	STAILQ_HEAD(, nvme_io_path) io_path_list;
	TAILQ_HEAD(retry_io_head, spdk_bdev_io) retry_io_list;
	struct spdk_poller *retry_io_poller;
};

struct nvme_poll_group {
	struct spdk_nvme_poll_group *group;
	struct spdk_io_channel *accel_channel;
	struct spdk_poller *poller;
	bool collect_spin_stat;
	uint64_t spin_ticks;
	uint64_t start_ticks;
	uint64_t end_ticks;
	TAILQ_HEAD(, nvme_qpair) qpair_list;
};

void nvme_io_path_info_json(struct spdk_json_write_ctx *w, struct nvme_io_path *io_path);
struct nvme_ctrlr *nvme_ctrlr_get_by_name(const char *name);
struct nvme_bdev_ctrlr *nvme_bdev_ctrlr_get_by_name(const char *name);
typedef void (*nvme_bdev_ctrlr_for_each_fn)(struct nvme_bdev_ctrlr *nbdev_ctrlr, void *ctx);
void nvme_bdev_ctrlr_for_each(nvme_bdev_ctrlr_for_each_fn fn, void *ctx);
void nvme_bdev_dump_trid_json(const struct spdk_nvme_transport_id *trid,
			      struct spdk_json_write_ctx *w);
void nvme_ctrlr_info_json(struct spdk_json_write_ctx *w, struct nvme_ctrlr *nvme_ctrlr);
struct nvme_ns *nvme_ctrlr_get_ns(struct nvme_ctrlr *nvme_ctrlr, uint32_t nsid);
struct nvme_ns *nvme_ctrlr_get_first_active_ns(struct nvme_ctrlr *nvme_ctrlr);
struct nvme_ns *nvme_ctrlr_get_next_active_ns(struct nvme_ctrlr *nvme_ctrlr, struct nvme_ns *ns);
enum spdk_bdev_timeout_action {
	SPDK_BDEV_NVME_TIMEOUT_ACTION_NONE = 0,
	SPDK_BDEV_NVME_TIMEOUT_ACTION_RESET,
	SPDK_BDEV_NVME_TIMEOUT_ACTION_ABORT,
};

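/*
 * Module-wide options, read and updated through bdev_nvme_get_opts()/
 * bdev_nvme_set_opts() below and exposed by the bdev_nvme_set_options RPC.
 */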
struct spdk_bdev_nvme_opts {
	enum spdk_bdev_timeout_action action_on_timeout;
	uint64_t timeout_us;
	uint64_t timeout_admin_us;
	uint32_t keep_alive_timeout_ms;
	/* The number of attempts per I/O in the transport layer before an I/O fails. */
	uint32_t transport_retry_count;
	uint32_t arbitration_burst;
	uint32_t low_priority_weight;
	uint32_t medium_priority_weight;
	uint32_t high_priority_weight;
	uint64_t nvme_adminq_poll_period_us;
	uint64_t nvme_ioq_poll_period_us;
	uint32_t io_queue_requests;
	bool delay_cmd_submit;
	/* The number of attempts per I/O in the bdev layer before an I/O fails. */
	int32_t bdev_retry_count;
	uint8_t transport_ack_timeout;
	int32_t ctrlr_loss_timeout_sec;
	uint32_t reconnect_delay_sec;
	uint32_t fast_io_fail_timeout_sec;
	bool disable_auto_failback;
	bool generate_uuids;
	/* Type of Service - RDMA only */
	uint8_t transport_tos;
	bool nvme_error_stat;
	uint32_t rdma_srq_size;
	bool io_path_stat;
};

struct spdk_nvme_qpair *bdev_nvme_get_io_qpair(struct spdk_io_channel *ctrlr_io_ch);
void bdev_nvme_get_opts(struct spdk_bdev_nvme_opts *opts);
int bdev_nvme_set_opts(const struct spdk_bdev_nvme_opts *opts);
int bdev_nvme_set_hotplug(bool enabled, uint64_t period_us, spdk_msg_fn cb, void *cb_ctx);
void bdev_nvme_get_default_ctrlr_opts(struct nvme_ctrlr_opts *opts);
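
/*
 * A minimal usage sketch (not taken from the SPDK sources): attach one
 * controller and create bdevs for its active namespaces. The callback name,
 * the "Nvme0" base name and the 32-entry names array are illustrative
 * placeholders; trid must be filled in by the caller, e.g. with
 * spdk_nvme_transport_id_parse().
 *
 *	static void
 *	create_done_cb(void *cb_ctx, size_t bdev_count, int rc)
 *	{
 *		...handle the bdev_count created bdevs, or the error rc...
 *	}
 *
 *	static const char *names[32];
 *	struct spdk_nvme_transport_id trid = {};
 *	struct spdk_nvme_ctrlr_opts drv_opts;
 *	struct nvme_ctrlr_opts bdev_opts;
 *
 *	spdk_nvme_ctrlr_get_default_ctrlr_opts(&drv_opts, sizeof(drv_opts));
 *	bdev_nvme_get_default_ctrlr_opts(&bdev_opts);
 *	bdev_nvme_create(&trid, "Nvme0", names, 32, create_done_cb, NULL,
 *			 &drv_opts, &bdev_opts, true);
 */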
int bdev_nvme_create(struct spdk_nvme_transport_id *trid,
		     const char *base_name,
		     const char **names,
		     uint32_t count,
		     spdk_bdev_create_nvme_fn cb_fn,
		     void *cb_ctx,
		     struct spdk_nvme_ctrlr_opts *drv_opts,
		     struct nvme_ctrlr_opts *bdev_opts,
		     bool multipath);

int bdev_nvme_start_discovery(struct spdk_nvme_transport_id *trid, const char *base_name,
			      struct spdk_nvme_ctrlr_opts *drv_opts, struct nvme_ctrlr_opts *bdev_opts,
uint64_t timeout, bool from_mdns,
spdk_bdev_nvme_start_discovery_fn cb_fn, void *cb_ctx);
int bdev_nvme_stop_discovery(const char *name, spdk_bdev_nvme_stop_discovery_fn cb_fn,
void *cb_ctx);
void bdev_nvme_get_discovery_info(struct spdk_json_write_ctx *w);
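/*
 * Illustrative sketch (not part of this header): stopping a previously started
 * discovery service from an RPC handler. The discovery name "nvme_auto" is
 * hypothetical, and the callback signature assumed for
 * spdk_bdev_nvme_stop_discovery_fn, void (*)(void *cb_ctx), is a guess; the
 * typedef earlier in this header is authoritative.
 *
 *   static void
 *   stop_discovery_done(void *cb_ctx)
 *   {
 *           struct spdk_jsonrpc_request *request = cb_ctx;
 *
 *           spdk_jsonrpc_send_bool_response(request, true);
 *   }
 *
 *   int rc = bdev_nvme_stop_discovery("nvme_auto", stop_discovery_done, request);
 *   if (rc != 0) {
 *           SPDK_ERRLOG("failed to stop discovery: %s\n", spdk_strerror(-rc));
 *   }
 */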
int bdev_nvme_start_mdns_discovery(const char *base_name,
const char *svcname,
struct spdk_nvme_ctrlr_opts *drv_opts,
struct nvme_ctrlr_opts *bdev_opts);
int bdev_nvme_stop_mdns_discovery(const char *name);
void bdev_nvme_get_mdns_discovery_info(struct spdk_jsonrpc_request *request);
void bdev_nvme_mdns_discovery_config_json(struct spdk_json_write_ctx *w);
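/*
 * Illustrative sketch (not part of this header): starting and later stopping an
 * mDNS-based discovery service. The base name "cdc_auto" is hypothetical, and
 * the option setup is simplified; real callers fill drv_opts via
 * spdk_nvme_ctrlr_get_default_ctrlr_opts() and populate nvme_ctrlr_opts with
 * their configured defaults.
 *
 *   struct spdk_nvme_ctrlr_opts drv_opts;
 *   struct nvme_ctrlr_opts bdev_opts = {};
 *   int rc;
 *
 *   spdk_nvme_ctrlr_get_default_ctrlr_opts(&drv_opts, sizeof(drv_opts));
 *
 *   rc = bdev_nvme_start_mdns_discovery("cdc_auto", "_nvme-disc._tcp",
 *                                       &drv_opts, &bdev_opts);
 *   if (rc != 0) {
 *           SPDK_ERRLOG("failed to start mDNS discovery: %s\n", spdk_strerror(-rc));
 *   }
 *
 *   ...
 *
 *   rc = bdev_nvme_stop_mdns_discovery("cdc_auto");
 */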
/**
 * Get an spdk_nvme_ctrlr associated with the given bdev, or NULL if the bdev
 * is not an NVMe bdev.
 */
struct spdk_nvme_ctrlr *bdev_nvme_get_ctrlr(struct spdk_bdev *bdev);
/**
 * Delete an NVMe controller together with all bdevs on top of it, or delete only the
 * specified path if an alternative path exists. The NVMe controller name must be given.
*
* \param name NVMe controller name
* \param path_id The specified path to remove (optional)
 * \return zero on success, -EINVAL on invalid parameters, or -ENODEV if the controller is not found
*/
int bdev_nvme_delete(const char *name, const struct nvme_path_id *path_id);
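/*
 * Illustrative sketch (not part of this header): detaching a whole controller.
 * Passing a zeroed nvme_path_id is assumed here to mean "remove every path",
 * mirroring how the detach RPC builds its path filter; the controller name
 * "Nvme0" is hypothetical.
 *
 *   struct nvme_path_id path_id = {};
 *   int rc;
 *
 *   rc = bdev_nvme_delete("Nvme0", &path_id);
 *   if (rc == -ENODEV) {
 *           SPDK_ERRLOG("controller Nvme0 was not found\n");
 *   }
 */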
/**
* Reset NVMe controller.
*
* \param nvme_ctrlr The specified NVMe controller to reset
* \param cb_fn Function to be called back after reset completes
* \param cb_arg Argument for callback function
* \return zero on success. Negated errno on the following error conditions:
* -ENXIO: controller is being destroyed.
* -EBUSY: controller is already being reset.
*/
int bdev_nvme_reset_rpc(struct nvme_ctrlr *nvme_ctrlr, bdev_nvme_reset_cb cb_fn, void *cb_arg);
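/*
 * Illustrative sketch (not part of this header): resetting a controller on behalf
 * of an RPC. The bdev_nvme_reset_cb signature used below,
 * void (*)(void *cb_arg, bool success), is an assumption; the typedef earlier in
 * this header is authoritative.
 *
 *   static void
 *   reset_done(void *cb_arg, bool success)
 *   {
 *           struct spdk_jsonrpc_request *request = cb_arg;
 *
 *           spdk_jsonrpc_send_bool_response(request, success);
 *   }
 *
 *   int rc = bdev_nvme_reset_rpc(nvme_ctrlr, reset_done, request);
 *   if (rc == -EBUSY) {
 *           SPDK_NOTICELOG("a reset is already in progress\n");
 *   }
 */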
typedef void (*bdev_nvme_set_preferred_path_cb)(void *cb_arg, int rc);
/**
* Set the preferred I/O path for an NVMe bdev in multipath mode.
*
* NOTE: This function does not support NVMe bdevs in failover mode.
*
* \param name NVMe bdev name
* \param cntlid NVMe-oF controller ID
* \param cb_fn Function to be called back after completion.
* \param cb_arg Argument for callback function.
*/
void bdev_nvme_set_preferred_path(const char *name, uint16_t cntlid,
bdev_nvme_set_preferred_path_cb cb_fn, void *cb_arg);
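/*
 * Illustrative sketch (not part of this header): preferring the path that goes
 * through NVMe-oF controller ID 3 for a multipath bdev. The bdev name and cntlid
 * are hypothetical.
 *
 *   static void
 *   set_preferred_done(void *cb_arg, int rc)
 *   {
 *           if (rc != 0) {
 *                   SPDK_ERRLOG("failed to set preferred path: %s\n", spdk_strerror(-rc));
 *           }
 *   }
 *
 *   bdev_nvme_set_preferred_path("Nvme0n1", 3, set_preferred_done, NULL);
 */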
typedef void (*bdev_nvme_set_multipath_policy_cb)(void *cb_arg, int rc);
/**
* Set multipath policy of the NVMe bdev.
*
* \param name NVMe bdev name
* \param policy Multipath policy (active-passive or active-active)
 * \param selector Multipath selector (round_robin or queue_depth)
 * \param rr_min_io Number of I/Os to route to a path before switching to another one (round_robin selector only)
 * \param cb_fn Function to be called back after completion.
 * \param cb_arg Argument for callback function.
*/
void bdev_nvme_set_multipath_policy(const char *name,
enum bdev_nvme_multipath_policy policy,
enum bdev_nvme_multipath_selector selector,
uint32_t rr_min_io,
bdev_nvme_set_multipath_policy_cb cb_fn,
void *cb_arg);
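/*
 * Illustrative sketch (not part of this header): switching a bdev to the
 * active-active policy with the round-robin selector. The enum value names
 * (BDEV_NVME_MP_POLICY_ACTIVE_ACTIVE, BDEV_NVME_MP_SELECTOR_ROUND_ROBIN) are
 * assumed to match the enums declared earlier in this header; the bdev name
 * and rr_min_io value are hypothetical.
 *
 *   static void
 *   set_policy_done(void *cb_arg, int rc)
 *   {
 *           if (rc != 0) {
 *                   SPDK_ERRLOG("failed to set multipath policy: %s\n", spdk_strerror(-rc));
 *           }
 *   }
 *
 *   bdev_nvme_set_multipath_policy("Nvme0n1", BDEV_NVME_MP_POLICY_ACTIVE_ACTIVE,
 *                                  BDEV_NVME_MP_SELECTOR_ROUND_ROBIN, 8,
 *                                  set_policy_done, NULL);
 */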
#endif /* SPDK_BDEV_NVME_H */