In high read IO pressure, there is a loop call of process_completed_read_subtask_list()
while calling spdk_iscsi_task_response(), this cause 'primary->bytes_completed'
changes, in turn cause multiple calls of 'spdk_iscsi_task_put(primary)', assertion
failes in spdk_scsi_task_put().
Signed-off-by: Sochin Jiang <jiangxiaoqing.sochin@bytedance.com>
Change-Id: I41d02d318f827f3bb3ad9ba3a06e080b5113cd40
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4083
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
a pause
This now also takes a lock instead of requiring a pause of the whole
subsystem.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I7de174f3f56d2b3767e723387c4f2257107d8b19
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4581
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The list of allowed hosts is only checked during handling of CONNECT
commands - not in the main I/O path. Protect that list with a mutex
instead of requiring a full pause of the subsystem to allow
dynamic management of the allowed hosts without impacting any
active I/O.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3f7e87cc1fa6de200c422928c07153fc60fab28c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4555
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The WQ has not been provided a PASID so virtual addressing is not
supported. This worked previously because all test set ups had IOVA=VA.
Change-Id: I6b08714e246a0dc8d5bc0f31efa4b90b601c5b4c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4558
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
I think these two lines code can be simplified one line.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: Ibfc876e8de1c5a39cde94ed2ad57b5095a098f3b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4375
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Pack all of the hot data into the first cache line. The first cache line
covers everything up to and including the ctrlrs TAILQ.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I184520661743aec91b3bb3d81e53fe8610c9383e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4554
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This saves 2 bytes and allows it to pack nicely with the
changing state bool (which must remember separate for atomic
operations).
Change-Id: Ibb92ae3c74306e60385ae23d0aaf877f33a69095
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4553
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Pass not bdev but bdev_name to scsi_lun_construct() to fix the
race condition due to the time gap between spdk_bdev_get_by_name()
and spdk_bdev_open(). A pointer to a bdev is valid only while the
bdev is opened.
spdk_bdev_open() has been replaced recently by spdk_bdev_open_ext(),
but the issue still existed.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic462422dbc2501c24907f56a36570fbb54fef65b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4482
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We can receive buffer reclaim notifications only when a qpair is
attached to a poll group (so qpair's socket is connected to a socket
poll group).
The previous assumption that we enable zcopy only for IO qpairs was
wrong since IO qpair might not use poll groups too (e.g. abort
application).
Fixes issue #1607
Change-Id: I67329d755d81da6606e65eddfeceb20839346d87
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4476
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Here we add some new variables which we will be able
to use in a later patch to generate pkg-config files
for this env_dpdk library and our DPDK library
dependencies.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8a256096ea08f97eba5d4460405f419624e6f0bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4468
Reviewed-by: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When we support spdk_nvme_detach_async(), any controller may be
destructed asynchronously. We will be able to know the case by
ctrlr->is_destructed is true and ctrlr is queued in the attached list.
nvme_ctrlr_probe() should fail if the found ctrlr satisfies these
conditions.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I299c2e5ea3c16cc1239899c163bb9e0eb921ade5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4413
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
With new funtion it is allowed to successfully parse json values even
if doceder for given key is not found.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I036f263e9050bd2b96aaa3ff61a9542c98365892
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4340
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Memory allocated for impl_name is not freed in error cases.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: If7cd62d948a05421b0bd5d1599f1275a0f3b4597
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4330
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When uring is enabled, uring socket implementation is
used to create sockets. We may want to use posix sockets
for some reasons (e.g. performance tests). This patch adds
a new API function to set the socket implementation which
will be used by default, e.g. when no impl_name is passed
to spdk_sock_connect/spdk_sock_listen functions.
Misc changes: include spdk_internal/log.h to register
SOCK log component. The new include header already
includes spdk/sock.h and spdk/queue.h, sow remove
direct inclusion of these headers.
Change-Id: I4abad0a59cd033b15bd43a00e3dbdf313fa6b06c
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Without this change nvmf_ctrlr_create() will fail to lookup
the subsystem listener matching this qpair.
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Change-Id: I855baa16e996737b60dbd745ce84f8c0bc024cf1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4450
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add getters to the ZNS specific data structures, so that an
SPDK application, e.g. examples/nvme/identify/identify.c,
has the ability to get and utilize the information in them.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I26056161093cc811acb6840ff7e2068e5f6058f6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4412
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add a new state in the SPDK NVMe state machine in order to fetch
I/O Command Set Specific Namespace data structures.
Right now there is only support for the Zoned Namespace Command Set
Specific Identify Namespace data structure.
The NVM Command Set Specific Identify Namespace data structure is
all zeroes right now, reserved for future use.
The Key Value Command Set Identify Namespace data structure is not
all zeroes, however, adding support for Key Value is outside the
scope of this patch.
The new NVME_CTRLR_STATE_IDENTIFY_NS_IOCS_SPECIFIC state is added
after the NVME_CTRLR_STATE_IDENTIFY_ID_DESCS state. This is because
we need to have fetched the identifiers in the desc list in order
to know which command set a namespace belongs to.
A slightly nicer design might have been to refactor the NVMe state
machine to first fetch the id desc list, then the identify namespace
struct, and finally the identify IOCS specific namespace struct.
However, since this would have required a lot of changes, it didn't
really seem justified.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I62cbc533c2c3eec1ccf0ba9b1c414d5a70919cff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4368
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add a new state in the SPDK NVMe state machine in order to fetch
I/O Command Set Specific Controller data structures.
Right now there is only support for the Zoned Namespace Command Set
Specific Identify Controller data structure.
The NVM Command Set Specific Identify Controller data structure is
all zeroes right now, reserved for future use.
The Key Value Command Set Identify Controller data structure is also
all zeroes right now, reserved for future use.
The new NVME_CTRLR_STATE_IDENTIFY_IOCS_SPECIFIC state is added
after the NVME_CTRLR_STATE_IDENTIFY state. That way, if support for
the Zoned Namespace Command Set is enabled during probing, we will
fetch the Zoned Namespace Command Set Specific Identify Controller data
structure, regardless if any Zoned Namespaces are attached or not, and
no additional steps will be needed once a Zoned Namespace is attached.
Since we only have one command set to fetch, avoid creating
NVME_CTRLR_STATE_IDENTIFY_IOCS_SPECIFIC substates, although that will
probably be needed when support for another command set is added.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I95535b09b03b7ef2ee9a11eebdbd28aad66d65ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4367
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When adding an additional state to enum nvme_ctrlr_state, abidiff (1.6.0)
will report that almost every public interface in the nvme library has
been impacted, causing test/make/check_so_deps.sh to fail.
While it is possible that by adding another state, the compiler decides
to use a larger data type for representing enum nvme_ctrlr_state, abidiff
shouldn't complain in the first place, since spdk_nvme_ctrlr is only
ever exposed as an opaque handle. It can never be accessed directly.
Jim Harris suggested to workaround this abidiff bug by changing the type
of spdk_nvme_ctrlr::state from enum to int.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I8b85446580043e95cf791249d643907587e2f982
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4427
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This seems to be causing some CI test failures. So
disable zero copy in all cases for now for client
sockets.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iceea09fe65fb90c7df15f500878a473f1ad4152c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4473
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
DPDK patch (4143b122) included in DPDK v18.05 replaced
MEMPOOL_F_NO_PHYS_CONTIG with MEMPOOL_F_NO_IOVA_CONTIG.
Meanwhile latest DPDK from (28e3c8b2) removed the
MEMPOOL_F_NO_PHYS_CONTIG.
This patch simply replaced the define, since it will
work for any DPDK v18.05+.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I43ada50df31be18c724b2f5078d3f29f3d1c0c71
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4418
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This library has existed since DPDK 17.05, so there are
no supported versions of DPDK that do not contain this
library.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I84f2b77046d093989dfa9533f3d1c76e8c243c3f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4417
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
These libraries have existed since DPDK 17.11. We
do not support any DPDK versions older than that, so
there is no need to conditionally handle cases where
those libraries do not exist.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3906db4d07ae04344b4c3bfaac02da58f248bf75
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4392
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This library was removed in DPDK 2.1 which SPDK has
not supported for a very long time now.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I01cf47078e69b9d396a80f5680a4f1c1c3a9be46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4391
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Only libraries that are using shm_open require linking
rt when creating the shared library. Stop including
it on every library and just add it to LOCAL_SYS_LIBS
in the one case where it is used (spdk_trace).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic13128873a76c355b14871a0dea0922488c9cd13
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4370
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
LOCAL_SYS_LIBS is meant to define *direct* system
library dependencies for a given library. libuuid
is directly used by the SPDK util library and then
other SPDK libraries use uuid indirectly through
util.
So only the util library should include uuid in
LOCAL_SYS_LIBS.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia0d2d63f48e6f89891164cf2f9dc4c7a6476d4e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4366
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In NVME TCP initiator zero copy is enabled for IO qpairs
and disabled for admin qpairs
Change-Id: Ibdf521dccde9b95ec5dd15a5eb2baed8fcf8b88e
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4211
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A preparation step for enabling zero copy in NVMEoF TCP initiator.
This option will be used to disable zero copy
for admin qpair. This is needed since the admin
qpair's socket is not connected to socket poll group
and we can't receive buffer reclaim notification.
Change-Id: Ibfbb8a156aafcd7ba8975a50f790da7fbd37d96f
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4210
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
A preparation step for enabling zero copy in NVMEoF TCP initiator.
With zero copy enabled, some requests might be completed out
of "process_completions" call and we should take them into
account to return the correct number of completions.
Change-Id: Iba7973f6da815645bbfad0334619d46b66379226
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4209
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A preparation step for enabling zero copy in NVMEoF TCP initiator.
We should wait for both events to occur before continue qpair
initialization.
Add a new bit to nvme_tcp_qpair::flags to track receiving of icreq ack
since icreq is sent without tcp_req and there is no way to apply
existing synchronization mechanisms.
Move tcp qpair to initializing state if we receive icresp before icreq ack,
this state will be checked during handling of icreq ack to continue
qpair initialization
Change-Id: I7f1ec710d49fb1322eb0a7f133190220b9f585ab
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4207
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A preparation step for enabling zero copy in NVMEoF TCP initiator.
Since nvme_tcp_qpair_process_completions doesn't process poll
group, we can't get asycn notification from kernel.
1. Add a qpair to poll group before we send icreq in order to be able
to process buffer reclaim notification.
2. Check if qpair is connected to a poll group and call
nvme_tcp_poll_group_process_completions instead of
nvme_tcp_qpair_process_completions when waiting for icresp
3. Add processing of poll group to nvme_wait_for_completion_timeout
and nvme_wait_for_completion_robust_lock since they are used to
process FABRIC_CONNECT command
Change-Id: I38d2d9496bca8d0cd72e44883df2df802e31a87d
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4208
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Currently host/data digest are bool members of nvme_tcp_qpair
structure. Change the type of this members to bitfield, reserved
bits will be used in the next patches to support zero copy.
Change-Id: If0659bf2445901e45fe0816af5f4fca5f494b154
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4206
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A preparation step for enabling zero copy in NVMEoF TCP initiator.
Make sure that we complete a request (call user's callback)
when all acknowledgements are received. For write operation -
when we received send cmd ack, h2c ack and response from target.
For read operation - when we received send cmd ack and c2h
completed
Since we can receive send ack after resp command, store
nvme completion received in resp command in a new field
added to tcp_req structure
Change-Id: Id10d506a346738c7a641a979e1c8f86bc07465a4
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4204
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
A preparation step for enabling zero copy in NVMEoF TCP initiator.
Some NVMEoF TCP targets can send several R2T requests. We should
check that we finished the previous H2C (received buffer reclaim
notification from kernel) before sending the next H2C.
This patch adds a new ordering bit indicating the described case
and 2 fields to nvme_tcp_req to store the values from the last R2T
request which will be applied when send ack is received.
Change-Id: Iaa5ad846712ca18a8382680baa02413c18c4eb37
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4203
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If the capacity of scsi bdev has changed, vhost-scsi should
notify the guest to handle this change.
Change-Id: I1087b28cdb719f6b727ff0ae486cee6a0719bb0c
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4124
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently, the scsi bdev only supports the hotremove event,
and the scsi library uses the deprecated `spdk_bdev_open` function.
In this patch, add the resize event support, so the upper layer
could do more actions, like vhost-scsi could notify the guest os.
For the scsi compatibility, add _ext suffix for some public api.
Change-Id: I3254d4570142893f953f7f42da31efb5a3685033
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4353
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This was accidentally bumped twice (3 to 4 to 5) since
v20.07 release.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia5bda3349fa5c1fce37166fe4b640ff722bb7e3a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4421
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It should be 16 but not 6. For example, it will have 16 priorities
when configuring ADQ with Intel's 100G NIC.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iebdf7b379c15f3b5fd16dba2ad87ec55af04577f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4235
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We have some RPCs defined in the bdev library itself,
others in a separate bdev_rpc library. There's no need
for the separate library - just move them all into the
bdev library.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I298eedb88924197e64eb315369efb10f402903a5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4364
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There is no need to have the application-level RPCs
defined separately from the event library itself
(which defines the application framework).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic264ed761f5ec1a40d604e63395c5740af4be1a6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4363
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The log_rpc library serves little (if any) use in
isolation. It makes more sense to just include
this code in the event library. The event library
already depends on and uses the log library, and it
is natural to just enable these RPCs directly in
that library instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie39b8598ce0c06729a13d188ce00da44a996accc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4362
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This RPC was originally put into the app_rpc library,
but the log_rpc library is a better home for it, since
other log-related RPCs are already there.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7ba5ac6cdeb57fb4219244690590c8fabbc3f59a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4361
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Meanwhile, to verify an issue about git push unittest failure.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: Idac60e5832390eb8bdce68aee639be2e9ac6cff6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4373
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add spdk_nvme_ns_get_ana_group_id() and spdk_nvme_ns_get_ana_state()
to getthe ANA group ID and the ANA state of the given namespace,
respectively.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id5f1f7ee488a1eb2a7a77f9986a3bb89146628e0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4354
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add ana_state and ana_group_id to struct spdk_nvme_ns and keep
them up-to-date by updating when spdk_nvme_ctrlr is created or
ANA change notice is received asynchronously. For both cases,
struct spdk_nvme_ctrlr holds the latest ANA state.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I311fe1c8015c8b8ac9659c38661244706c04b3e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4287
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add an internal API nvme_ctrlr_parse_ana_log_page() to parse an ANA
log page and execute the specified callback function for each
ANA group descriptor in the ANA log page.
We will be able to copy the ANA group descriptor to the caller instead.
To do that, we will need to inform the size of the descriptor first,
but the size will not be constant.
Passing parser to the API will be more convenient.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifd8fda30a83965948017fb8ad992c0d889197cde
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4279
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When creating a controller, allocate a buffer to the controller
and read ANA log page into the buffer.
When receiving ANA change notice, read ANA log page into the buffer
to keep the contents up to date.
The next patch will provide a public API to get the contents of
ANA log page the controller holds.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If5c653f4e80d157e5120bb754e6660250b2b8fa1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4233
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add an internal API nvme_wait_for_completion_robust_mutex_lock_timeout()
and related internal APIs just call it with adjusting parameters.
nvme_wait_for_completion_robust_lock_timeout() will be usable for
the current use cases of nvme_wait_for_completion_robust_lock() and
future use cases.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2b499643930256a39ebe279f56a399f20a7a2fde
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4217
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
No longer required to allocate from shared memory. No tools
use this anymore.
This removes the final call to the event library from iscsi,
so we also drop that dependency.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I41a6877b782cb927d9ac7d206ccd36a8195efc42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4346
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This was not used by anything. It was intended for use by user-space
TCP stacks.
Change-Id: I416589e421784882c693bcc5b03fe1dbcc4b1bd3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4297
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In the both normal and exceptional case, the mutex
will need to be destroyed.
Change-Id: I39c815f2adffbd3786b45a938c476dcbb66a438f
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4339
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: JinYu <jin.yu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It may have been a long time since the thread last executed
so ensure this time is accurate.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Iaa4c35b50cdc05ebb41724ed9946c5232d242ee3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4321
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
value
If the user passes NULL for the thread, just use the current thread
to get the last tsc.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I1a2b61d9765e1ef59927ffec7c49f2a2b62590f6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4320
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Fix some spelling and make the message clearer
Change-Id: Ib291542a9735d6409db84f16c530e78567123f67
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4249
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Unlike ADMIN and IO commands, the FABRIC command is only processed
in the ctrlr.c file.
Change-Id: Ic4e01c7f81c98631a2c7cb603343b301f8ba63e1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4307
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
With the introduction of namespace types, the identify command has
gained an additional parameter: Command Set Identifier (CSI).
This parameter is similar to the existing parameters NSID and CNTID,
and is not used by all CNS values.
Most notably, the CSI parameter is not used for the existing CNS
values 00h (ID NS) and 01h (ID CTRL).
There are new CNS values, e.g. 05h (ID IOCS specific NS), and
06h (ID IOCS specific CTRL), which do take the new CSI parameter.
The new CNS values instead return Command Set Specific data structures,
which is basically an additional data structure. Therefore, the CNS
values 00h and 01h are very much still in use.
(Even the NVM Command Set has a Command Set Specific data structure,
even though all fields in that data structure are currently reserved.)
Since the CSI parameter is unused by all the existing calls to
nvme_ctrlr_cmd_identify() (since none of the calls send in a CNS value
that uses CSI), simply send in 0 for all existing calls.
No functional change intended.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia2b2324393a0707152b2f8511f0a22ad4a12bd46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4309
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The poller is now created internally to the library whenever a target
is constructed. Applications are not expected to poll for connections
any longer.
Change-Id: I523eb6adcc042c1ba2ed41b1cb41256b8bf63772
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3583
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
There are two different is_active() functions.
spdk_nvme_ctrlr_is_active_ns() which iterates through the active_ns_list,
and spdk_nvme_ns_is_active(), which simply checks the nsdata.
There is an event callback that refreshes active_ns_list when a relevant
events has occured.
In nvme_ns_construct(), nvme_ctrlr_identify_ns() has just been called,
so we know that nsdata is as fresh as possible.
Hence, there is no reason to iterate through a less fresh active_ns_list.
Since we know that the nvme_ctrlr_identify_ns() call was done through the
same controller, we also know that the active/inactive is from the
perspective of the correct controller, so that is not a reason to use the
less efficient is_active() function.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I185f59b53e16e70163e33a3909f4b55ebf631cc4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4293
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since the command set identifier might be accessed at several
different states in the nvme state machine, cache it so that
we don't need to loop through the ns id desc list every time.
This is similar to how other identify fields are cached using
nvme_ns_set_identify_data().
None of the identifiers in the desc list (including the new CSI)
can change over the life time of a namespace, so caching them
should be safe.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ie06180a4b3750dfa1a42f47afe0f7f9e3ec04ba9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4266
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If the nvme completion was an error, the function will return,
so there is no reason for an else statement.
In fact, the else statement in nvme_ctrlr_identify_ns_async_done()
differs from the coding style used in other nvme_ctrlr_identify_*
functions, and arguably makes the code harder to read.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: If76b823b7ca04ab98abb2912927c344ee9f12314
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4265
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Clear the ns id desc list in nvme_ns_destruct().
Without this, someone can get stale data by calling e.g.
spdk_nvme_ns_get_uuid() on a destructed namespace.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I965dd4cd6101d3a77eddbd582b9618b3436d39c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4263
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When we disconnect a qpair, part of the code path is
calling _nvme_qpair_abort_queued_reqs. This takes
care of aborting any requests that were queued waiting
for slots to open on the submission queue.
It walks the STAILQ one by one and manually completes
them with ABORT status back to the caller.
But if the callback path submits another request, this
request may also get queued to the end of the queued_req
TAILQ. This can result in an infinite loop.
The solution is to use an STAILQ_SWAP to a local, empty
STAILQ. Then we ensure we only abort the requests that
were queued when _nvme_qpair_abort_queued_reqs() started
executing.
Fixes issue #1588.
I used the multipath.sh test to reproduce this on my local
system. If it ever dropped into the STAILQ loop in this
function, we would hit the infinite loop. With this patch,
I confirmed locally that now we safely avoid the infinite
loop and the test passes.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I657db23efe5983bd8613c870ad62695a7fc7f689
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4284
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Remove some of the boilerplate code from each case and
replace with just an spdk_msg_fn assignment.
This also reduces the size of an upcoming change needed
in this function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia209073cfb66032f2cca6bb44a09e1984ef2110c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4257
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When the vhost-scsi target needs live recovery, check the inflight share memory,
and resubmit the inflight io.
Signed-off-by: Li Feng <fengli@smartx.com>
Change-Id: I785476c8835053a4e8d4f1d692437feaf3a9ace1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4092
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Rename ordering bit r2t_recv to h2c_send_waiting_ack, that is more
descriptive name.
Change-Id: I6d6143ff4c1cccc74e11226b7974706808092f9a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4202
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This makes it easier to zerofy ordering bits.
Change-Id: If5696bfedfff1bf75e41c1449eac7fccb469e98b
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4201
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The issue happens when SPDK RDMA initiator is connected to a remote
target and this target reports rather small (or zero) ICD and we try
to send several SGL descriptors.
Since SGL descriptors are located in ICD, we should check that their
total length fits into ICD. In other case sending such a command
will cause RDMA errors (local length error)
Change-Id: I8c0e8375dae799bc442ed2fab249cad2c4ccce51
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4131
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
uint32_t supports at most 2TB at most, we need to handle
the larger blobstores, fix this overflow problem.
Signed-off-by: Sochin Jiang <jiangxiaoqing.sochin@bytedance.com>
Change-Id: I27950eb759e9cb9ad48fa4aa8dd1976b4e852832
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4075
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In C language, we cannot use constant at compile time. Hence the
local array _ana_desc[] is not a fixed size array but a variable
length array.
We can avoid using variable length array by changing const variable
to macro constant.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7333a8078d3102c4bd5088f56f6530846854c85f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4093
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add an new RPC, nvmf_subsystem_listener_set_ana_state.
Find the specified subsystem listener, and then set the ANA state
of the listener by calling nvmf_subsystem_listener_set_ana_state().
By adding a string and an enum to the existing context structure,
nvmf_rpc_listener_ctx, and adding an operation type to the existng
enum, nvmf_rpc_listen_op, reuse the existing code and data as much
as possible.
Besides, insert line break into a few long lines and fix wrong
error log.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6fb2dfbb1f9c5f56848eba21d2a733fbed802614
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4080
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add an internal API nvmf_subsystem_set_ana_state() to change the
ANA state of the subsystem listener whose trid matches.
ANA optimized state, ANA non-optimized state, and ANA inaccessible
state are supported. ANA change state is not used and ANA persistent
loss state is not supported.
After changing the ANA state of the subsystem listener, on each poll
group, controllers, whose the subsystem listener match, send ANA
change notice.
Initiators query ANA log page anyway if they receive ANA change
notification. False positive notification should be avoided but is
acceptable.
To avoid any concurrency conflict, simply compare ctrlr->listener and
the passed listener.
It may be better to execute nvmf_subsystem_set_ana_state() on the
subsystem thread but currently the RPC thread adds and removes a
listener to and from the subsystem, respectively, and the subsystem
has been suspended while executing nvmf_subsystem_set_ana_state().
Hence we keep this as a future enhancement.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If1910b79dd33d904114e258ae2c5e868947cdc52
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4079
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
If the ANA reporting feature is enabled for the subsystem,
- set ANA Change Notice of Asynchronous Event Configuration to 1
- set ANA Change Notice of Optional Asynchronus Event Supported to 1
- set ANA Non-Optimized state and ANA Inaccessible state of ANA
Capability to 1.
ANA Change state is not used and ANA Persistent Loss state is not
supported for now.
The next patch will actually support ANA Change Notice using an new
RPC.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4db2e33dd2879cdf995adcab41ef53728b27a201
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4087
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In cases where the SPDK nvme driver is being used as a validation/test
vehicle, users may need to allocate a currently unused qid that can be
used for creating queues using the raw interfaces. One example would be
testing N:1 SQ:CQ mappings which are supported by PCIe controllers but
not through the standard SPDK nvme driver APIs.
These new functions fulfill this purpose, and ensure that the allocated
qid will not be used by the SPDK driver for any future queues allocated
through the spdk_nvme_ctrlr_alloc_io_qpair API.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I21c33596ec415c2816728a600972b242da9d971b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3896
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If we are already in the desired state,
just call the callback directly from the
subsystem_state_change function. That way
we save a lot of message passing.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I6cf8563524610d9125d53266e3c0e179e064bf63
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3760
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is important to avoid doubling up on state changes
and hitting asserts.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: If8797ea13a5c224cee85e53e9b2542012423b37f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3759
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We still need to be able to explicitly set specific
bits in the cluster array during initialization and
loading (especially recovery), so we use a bit_array
during load, and then convert it to a bit_pool just
before calling the user's cmopletion callback.
This gives a roughly 300% improvement over baseline
on a benchmark which does continuous resize operations.
The benefit is primarily from saving the lowest free
bit rather than having to always start at bit 0. We
may be able to further improve this by saving extents
in the bit pool as well, although after this patch,
the benchmark shows other hot spots different from the
bit search.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idb1d75d8348bc50560b1f42d49dbe4d79d024619
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3975
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_bit_pool is a wrapper around spdk_bit_array with the
intentions of providing much better performance for allocating
from a fragmented bit array. The cost of searching a large bit
array for a cleared bit can become expensive so the spdk_bit_pool
will provide an ability to track extents of recently cleared
bits.
This initial commit does not adding the tracking yet - it is strictly
a wrapper around spdk_bit_array with enough functionality to replace
the use of spdk_bit_pool in SPDK blobstore with equivalent performance.
This will allow us to switch blobstore to use this minimal
wrapper first, and then iteratively improve spdk_bit_pool to provide
the better performance.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I95d0d12db47eac73e0641eb7f94fa5df43d42e45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3974
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
The ctx allocation was duplicated after both bs_alloc
calling sites, so this reduces the code a bit. This change
also enables some future changes involving the used_clusters
bit array.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4ea98f079dbe385654e9cb9c0c58a1926a990c9e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3973
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This will allow for some additional simplifications
in future patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie421ad35f8c0efbb775fbe6bf85799af515264ef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3972
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This prepares for some future patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If63c83f76e839b796c58200ddb0ca2137fbc4288
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3971
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Have it both find and set the lowest available cluster
bit index.
This will temporarily hurt the performance for cluster
allocation, since it will always search starting at
bit index 0. But upcoming changes in this patch set
will fix that again by using a new spdk_bit_pool object
that will do allocations much more efficiently than the
current implementation here.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad199c9166b82cb9a31597a080f5a28823849e60
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3970
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently if we create a blob of 128 clusters, we
lock/unlock the used_cluster_mutex 128 times - once
for each cluster. Same when those clusters are released
when the blob is deleted. Batching these lock/unlock
operations is very easy and gives a noticeable
efficiency improvement.
My local benchmark (1GiB ramdisk, 4KB cluster size,
128 clusters/blob) creates enough blobs to fill the
blobstore and then continuously deletes and recreates
them. Performance increases 20% on that benchmark
with this patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic503accf1ca1ab1af7254b4067771d956f52014d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4069
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This may happen when resetting a controller, if the ADMIN queue failed
to reconnect, the controller is set to failed state, so for this case
we don't need to loop until timeout, just exit.
Change-Id: I2b37af5453086cd64f3609c41eb8f6475da55fd4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4143
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: JinYu <jin.yu@intel.com>
There is no need for this interface to be async.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I1f21b53e90b7d165b6b5fb2e1226ce7591966b58
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4181
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It was introduced for the purpose of executing fabric cmds when
subsystem and qpairs are not active. It was rather workaround than
solution for transport type like vfio-user. spdk_nvmf_request_exec
is a preferred way of passing request obj into nvmf layer.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I4f989de27bfd494c744017599909c2e200f0f233
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4180
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If ctrlr->cdata.cmic.ana_reporting is 1, set the corresponding
field to true.
Then use its API in the identify application.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4e74bc4c114883e4aecdbee7a6f1a02027db23a5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4156
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch is used to enable placement_id getting
in sock layer and also add the rpc support.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I70de57b0ed392a0aefce9d3ff1f61ef924015a87
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4146
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add support for getting the Command Set Identifier for a given namespace.
The SPDK_NVME_CAP_CSS_IOCS feature can be implemented on top of an old NVMe
specification. If the feature is set, retrieve the NS ID Descriptor List
regardless of the NVMe specification version. The quirk is still respected.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I7b257115ecb0d813ba75201c0f48960c7070dcc9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4085
Community-CI: Broadcom CI
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Warn if found UUID descriptor length differs from NIDL for NIDT_UUID.
This will help identify non-compliant NVMe controllers.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Idf0daff9996147f38413318d1cd7fc3f929c5ce4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4138
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add an new RPC, nvmf_subsystem_get_listeners.
ANA state is per listener and per subsystem, and is stored in
subsystem listener. We can return ANA state by the existing
nvmf_get_subsystems RPC but it's confusing that listen addresses
have ANA states.
To change ANA state, we will provide a RPC to change ANA state of
only one selected subsystem listener.
To query ANA state, it will be convenient to get ANA states of all
listeners of one selected subsystem.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic3baad6eac65d7af6e0cab2c4059e1458d41e6e2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4059
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Data structure and macro constants for multiple listen addresses
and namespaces are not used anywhere in nvmf_rpc.c
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idd8bc61e22f9e9918a88f017a024cab239ff5e53
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4060
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add an new RPC, nvmf_subsystem_get_qpairs to retrieve the list of
qpairs of an NVMe-oF subsystem.
This RPC will be usable to verify if NVMe ANA works.
Pause and resume the subsystem to access the qpairs safely.
One subtle issue remains. The JSON RPC returns success even if
resuming the subsystem fails. Write FIXME to address this.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9d90a01b1117dee00d85b2e21b4f4d02d80db531
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4050
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Some of the functions were only referenced directly.
There is no need to use void* or pass any bserrno,
in some cases.
Let's be explicit.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib26dda7068965838f38dad856ea1e456fd87a655
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4061
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This looks like a major omission on persist path.
Especially visible for cases where blobstore was not
reloaded between blob creations/deletion.
Added writing out zeroes to md_pages that contained
truncated extents (resized down).
After zeroes are writen out, md_pages for those extents
are released. In case of blob deletion, extents are
resized down to 0 so all extent pages are released.
Fixes#1590
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I9a2a1190e3f1f3b5d1bb806191c1fe4d27df7780
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4051
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Usage of spdk_thread_get_count is wrong since there might be many
threads allocated by other modules. Transport buffers are used by
transport poll groups, their number is equal to the number of cores.
Change-Id: I4bc748e93c3b204bf3b3ec73f17257b927a7f428
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3882
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When we try to evenly divide transport buffers between poll grouos,
e.g. when we run spdk_tgt on 8 cores, set num_shared_buffers=32768
and pg buf_cache_size=4096, the last pg can't retrieve enough
buffers to fill cache. In my case if only got 4040 buffers out of
4096. Missing 56 buffers were cached by previous poll groups.
That occurred due to mempool has per lcore cache of 512 elements
and when it becomes empty, the cache is refilled. It seems that
each poll group cached extra 8 buffers.
The issue doesn't occur when we use mempool_get_bulk.
Change-Id: I866d58aa03986a3cffe27402b12f9a2519097f83
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3881
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In production environments, there could be large numbers
of uevents other than nvme hotplug events. We want to
ensure we never lose an nvme uevent due to ENOBUFS
(i.e. overflow). So allocate a bigger receive buffer
for the netlink socket to ensure we never lose any events.
We only allocate one netlink socket per SPDK application,
so the extra memory consumption is not really a concern.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I663fbb093516a01a8980a1517245f92d8c76f7aa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4070
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There are two bugs:
1, When the target response 0, it means target does't
support keep alive.
2, Change the interval time to us so when the keep alive
timeout is 1ms then the interval is 500us.
Fix github issue: #1565
Change-Id: I75707ab0e4e639209a9c50ef326492fae213044d
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4077
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Factor out the internal of rpc_nvmf_subsystem_get_controllers() into
a function rpc_nvmf_subsystem_query() to use it for the upcoming RPC,
nvmf_subsystem_get_qpairs.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibe62bcfadf6b33ef26c018a3667f280b6fcd8fdf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4049
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For nsid, use SPDK_NVME_GLOBAL_NS_TAG rather than raw number
0xffffffff wherever possible.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I23e989786263172e13bab40c011cf58beb06fabf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4055
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This can happen and we should make a best effort to return
the subsystem to a coherent state when it does.
maybe fixes: issue #1416
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic3d0376984733e6664295305be82fca678c515b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3437
Community-CI: Broadcom CI
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This can happen and we should be prepared for it.
Maybe fixes: issue #1416
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I77f48dbcabf702f88df56ad7e866bbcb830fc239
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3393
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
And modify test/env/vtophys to resolve linking errors.
SPDK_PRINTF() and SPDK_ERRLOG() use spdk_log() procedure which is
customizable and redirectable, so it is preffered over printf()
In case of test/env/vtophys/ program,
we have to make it an app first to avoid linking errors.
Change-Id: Id806ec3bb235745316063bbdf6b5a15a9d5dc2d9
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1944
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
After a submission queue is deleted, the device is supposed
to post completions for every command to the completion queue.
Previously, we never looked and completed all commands with
an ABORTED status. Instead, complete any commands in the
completion queue with the status the drive gave them.
Change-Id: If851a365d4f305cf4390454b6b26dd0f7c5b82ac
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3875
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
For I/O commands, block them if ANA state is inaccessible, persistent
loss, or change.
For Identify command, clear capacity field (nuse) to 0 if ANA state
is inaccessible or persistent loss.
For Get Features command, block features, error recovery, write
atomicity normal, reservation notification mask, and reservation
persistence if ANA state is inaccessible, persistent loss, or change.
For Get Log Page command, error information page does not return
any data yet, and hence there is no change.
For Set Features command, if ANA state is inaccessible or change,
block the command if NSID is 0xFFFFFFFF or if feature is error recovery,
write atomicity normal, reservation notification mask, or reservation
persistence, or if ANA state is persistent loss, block the command.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I15dd593227e451aa2247c53da42b6acad1757907
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4043
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add ANA state to struct spdk_nvmf_subsystem_listener and initialize
it to optimized.
Then ctrlr->listener->ana_state is referred when creating ANA log page.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I978424e51d3f23ca72dee30192bc2693abfe203d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4012
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We will have ANA state per listener and per subsystem. On the other hand,
NVMe specification defines ANA state per controller.
However, it is possible that I/O qpair and admin qpair are different
listeners on a single controller.
Let's check if I/O qpair is on the same listener as admin qpair if
ANA reporting is enabled.
The case that I/O qpair is on a different listener from admin qpair
is not usual and so the purpose of this check is just to guard SPDK
from any unexpected behavior.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idb8d255de7f998e45a59a120c2ed5803258873f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4026
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Find the subsystem listener whose trid matches req->port->trid when
creating a controller, and store it in the controller.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iea343b8d8ae827b554df2245b67aed113469c592
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4010
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add trid to struct spdk_nvmf_qpair and initialize it at initialization.
admin_qpair->trid will be used to get the corresponding
subsystem_listener via nvmf_subsystem_find_listener() and add it to
struct spdk_nvmf_ctrlr in the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0d1a41aede60de88747eff16c7e04f63d0702596
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4009
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The new function () will be used in the following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I788cfb38d75c3f1f64e1754912b776a80f0f1be8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4007
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
nblocks is zero based, so read path was missing the increment.
NVMe device that cuse represents can be of any block_size,
so rather than hardcoding 512 - actually verify it.
Both paths didn't request enough of a buffer from cuse.
Reported-by: Niklas Cassel
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I228dc2572bc94ecbcb913e950d912a7ab5be9434
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4037
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch does not alter functionality, just moves
around where cuse_device and block_size is determined.
Next patch will fix both paths.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5a827b5b4ab080b2aa0f76f5cdcbcb177b38b474
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4036
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Socket message VHOST_USER_SET_VRING_ENABLE will use number of
vring entries as input parameter to indicate the vring is
enabled or not, previously the flag in vhost-user library
wasn't checked before commit d0fcc38f5
"vhost: improve device readiness notifications", so here
we also use correct filed set in SPDK.
Fix issue #1583.
Change-Id: If5ac8a4ba31bdecbb5a64b736346c99e4be0f4b6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3989
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We stopped the poller to early, so we were not able to
reap all completions on ibv CQ, so RDMA qpair was not freed.
This patch stops the poller when all references to poll group
are released (all qpairs are destroyed)
Fixes#1578
Change-Id: I15c1697db13aef9da7567c7312476306c3ee1d62
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3962
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When tested on Linux 5.8 kernel and configure spdk
with debug mode (--enable-debug), and test SPDK NVMe-oF
tcp transport, and we see the coredump in sock_map_release
with the following statements:
assert(entry->ref > 0);
After debug, I can confirm that the placement_id value got
from the following function (sock->net_impl->get_placement_id)
changes.
It means that: When the sock is added into the poll group
(spdk_sock_group_add_sock), we get the placement_id (named as
Value(begin)); and when the sock is removed from the poll group
(spdk_sock_group_remove_sock), we get the plaemednt_id on
the same sock (named as Vaule(end)). I found that
Value(begin) ! = Value(end).
So our solution is for a socket, we will get placement_id once,
then we can solve this issue.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ia1d0cf39247b53410260561aca5af38130cc0abb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3983
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We will use it earlier in this file in a future patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I554f2073185d466bd0b4e98bdeec721f763c1b44
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3969
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
When claiming clusters as part of blobstore initialization
or recovery, just call spdk_bit_array_set directly rather
than going through the bs_claim_cluster function. We will
be modifying how runtime cluster allocation works so need
to separate the two use cases. This code is very small so
inlining it has minimal code impact.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaaa1c817e57b4a2eea62eb4683407364bac1fcc0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3966
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
These functions were added during FTL development and
are more efficient than the roll-your-own implementations
blobstore had previously.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie09e5c305e6e171af0258e805f2aac3b88822b5e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3965
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Allow toggling log timestamps on and off by adding new RPC call.
Change-Id: I34c84bf89fae352ade266fbf7fd20594ff67bced
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2024
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Remove assert and add exit codes instead. That in non-debug mode, these
could lead coredump. We don't want the vhost target be crashed after
recieved invalid commands.
fixes issue: #1575
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: Ifef6d8f9c32150213bc2c80787e92d428d4c49c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3951
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: JinYu <jin.yu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
cpumask can be changed by spdk_thread_set_cpumask()
during the time that event takes before it arrives
on _schedule_thread() function, which would make the
function assert(false), even though that is ~ok~.
Currently, that can happen right after thread is created
or between two successive calls to spdk_thread_set_cpumask().
But most importantly, it will constantly happen if we
introduce rescheduler.
This patch just disables the check for now.
Change-Id: Ie6dfe22d6eff2c908c367d1311436cc6769a6960
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3905
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When the PDU receive handler processes the header of the logout request PDU,
conn->is_logged_out is set to true.
However, if conn->is_logged_out is true, conn->pdu_recv_state is set to ERROR
before the PDU receive handler completes processing the logout request PDU.
Then if conn->pdu_recv_state is ERROR, conn->state is set to EXITING
after returning from the PDU receive handler.
Response PDUs are sent asynchronously now and may not be sent even after
returning from the PDU receive handler.
On the other hand, outside the PDU receive handler, the current connection
is closed if conn->state is EXITING.
Hence logout response PDU may not be sent to the initiator.
For the case that the initiator logs out and then reconnects when receiving
asynchronous logout request, missing logout response is critical
because initiator waits until receiving logout request and gets timeout.
This patch moves the check if PDU comes after logout to the place
just after getting a PDU header.
At the new location, data segment of the PDU is not received yet. But
logout request PDU does not have data segment and initiator will not
send additional PDU after sending logout request PDU, and by this patch,
iSCSI target will continue to stop receiving any new PDU after processing
logout request. Furthermore, even if there is any remaining data in the
kernel buffer, the kernel will discard or flush it when closing the socket.
Fixes issue #1571
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9554f4d54f3db80bf86abd6bffe81bac8c234531
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3928
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
ANA transition time shall be non-zero if controller supports ANA
reporting. Linux NVMe host sets this value to 10, and we don't
have any reason to change from that.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I61396695dacf47fad40e3cea3311e555729d9e3e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3909
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Registration macro now generates function based on driver's name.
It allows to have multiple registration within single source file.
Similar pattern is used e.g. by SPDK_NVMF_TRANSPORT_REGISTER.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ied0887e8dae7fe9ca1517313be5eff8f218b7e98
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3895
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will be used in another place later.
This patch is part of a series aimed at improving recovery
when we are fail to change the subsystem state.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I24bfbeb3d006584003164540d6ede540dbcafa86
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3392
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
The loop here was counting the bytes in the cpus array,
but the lcores are represented by bits.
While here, add a unit test that exposes this bug and
demonstrates it is now fixed with this patch.
Fixes#1570.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3a1fc48a8085254f41587e3b3d5d732154b90134
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3931
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will allow applications to understand why
they were unable to connect.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic04c7e72098c6ec1823de7d6a07d90150ef5ac20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3836
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add a bdev_examine_allowlist_free function, which releases the members
in g_bdev_examine_allowlist. Invoke it in bdev_mgr_unregister_cb.
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: I47faf6959066da6679716b2f2abfab8ac8b8dd79
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3880
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently when the uevent processing code finds a non-uio/vfio
uevent, it just stops its loops and returns. This means that if
there are a lot of non-uio/vfio uevents, the netlink socket buffer
can build up until its full because only one non-uio/vfio event
gets drained per spdk_nvme_probe() call (which may be very
infrequently).
So modify parse_event so that it does not indicate error when
a non-uio/vfio event is found.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic8a40f71ee89d597ce46129eac889fe5b7ef5171
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3876
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This ensures we don't send a nopin immediately after
a connection is established, in case the nopin poller
fires before the connection reaches full feature phase.
Fixes#1441.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ieba9476bec0e9b7f85e60b9113ae8364eda5bda3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3902
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The md_page alignment is not really required for md_page
buffers.
Allocating 4k aligned buffers all the time, causes memory
to be heavily fragmented. Due to DPDK keeping track of the
allocation in the same DMA region as the allocation themselves.
Removing this alignment requirement will help DPDK when searching
for the right part of memory in the heap.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reported-by: Mike Cui
Change-Id: If2f4ca2be38d432d5740f6145b5e0ff46237806b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3853
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We should make nvme_tcp_ctrlr_connect_qpair always return
negative value if this function fails.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I457e704e39d7a3acd298fd48e89e8ea51e2ed4ad
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3809
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The specification says it will return INVALID FIELD if the NS
is in inactive state.
Fix issue #1551.
Change-Id: I1b32f023ed665d410f4705e439068699e2b2f8de
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3860
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Failed qpair will be destroyed on generic nvmf layer during handling
of error code returned from spdk_nvmf_poll_group_add.
The current approach leads to heap-use-after-free.
Change-Id: I99331150fa36a3c3c18176589afb973dee449b3a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3538
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The one large global mempool was a waste of memory for apps that
don't use the accel framework as its always allocated a pool sized
to handle a heavy load with multiple threads.
Instead move to a per channel list of just 1024 tasks greatly
decreasing the memory footprint but still able to scale as more
threads are added.
Also renamed all accel_req to acccel_taak and simply task to
accel_task as this was being touched anyways and not consistent.
fixes issue #1510
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I0e93ca6270323e2df4b739711c5d9b667a52e1eb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3740
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently rdma acceptor handles only one ibv event per poll
Taking into account the default acceptor poll rate (10ms), it can
take a long time to handle e.g. LAST_WQE_REACHED events when we
close huge amount of qpairs at the same time.
This patch allows to handle up to 32 ibv events per acceptor poll.
Change-Id: Ic2884dfc5b54c6aec0655aaa547b491a9934a386
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3821
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This is an oversight that can cause issues with looping
through the list if we end up allocating the same qpair
twice.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I513ea35398f4b724366c21be144531fbfbdb4347
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3835
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We only create one spdk_blob object for a given blob, and just
increase the ref_count if it is opened multiple times. bs_open_blob
would do the lookup for existing opened blobs.
But if the blob is opened again, before the previous open operation
has completed, we would end up with two spdk_blob objects for the same
blob.
Solution is to do another lookup when the open operation completes.
If we find the blob, free the one we just finished opening and return
the existing one instead.
Also added unit test that failed on the existing code but passes now
with this patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Reported-by: Mike Cui
Change-Id: I00c3a913b413deddf06f0b63f7a669efb2b5658f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3855
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
pctrlr->cmb.mem_register_addr and pctrlr->cmb.mem_register_size
are assigned after spdk_mem_register.
if spdk_mem_register is failed , ctrlr_map_cmb hasn't been executed.
they are not be used.
So remove them.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I3d1996eee8b5260b79c4c3e0a2e1d376da2343b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3856
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
SPDK poller uses microsecond as the input parameter, so we need to
change the correct value when opts.association_timeout is expressed
by millisecond.
Change-Id: Ia674f0115ea176b998e4c0c70b8ce75b28984701
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3861
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
After supporting ANA reporting by default, Linux kernel 5.3 reported
error when parsing NVMe ANA log. The newer kernel fixed the issue
but we should optionalize ANA reporting feature to avoid error for
Linux kernel 5.3 or before.
Add a bool variable ana_reporting to struct spdk_nvmf_subsystem
and disable ANA reporting and initialization of related variables
if it is false. We can expose MNAN (Maximum Number of Allowed
Namespaces) even if ANA reporting is disabled. But MNAN is not
required if ANA reporting is disabled. So do not set MNAN if it is
false too.
Add a public API spdk_nvmf_subsystem_set_ana_reporting() to set
ana_reporting by the nvmf_create_subssytem RPC.
The next patch will add ana_reporting to nvmf_create_subsystem RPC.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Icc77773b4c9513daba2f1a9fdaf951d80574f379
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3850
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add an new RPC, nvmf_subsystem_get_controllers to retrieve the list
of NVMe-oF controllers of an NVMe-oF subsystem.
One of the main use cases will be to get identification information
of NVMe-oF controllers to configure their ANA states dynamically.
Pause and resume the subsystem to access the controllers safely.
One subtle issue remains. The JSON RPC returns success even if
resuming the subsystem fails. Write FIXME explicitly to address this.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibf8d1cf56850a705e343b86022d101b4c7204199
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3848
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A subsystem RPC is not transitioned to a paused state when there
are ios outstanding (tracked by subsystem poll group).
In general AERs, are not tracked as outstanding IOs. However,
there are 3 paths in nvmf_ctrlr_async_event_request which do not
adjust the outstanding io count.
If we get into any of these 3 paths, the subsystem pause can hang
forever.
The issue was reproduced with hot plug stress testing under load.
We can get into the second path (SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE)
under these circumstances:
- An AER completion is sent to the initiator due to a namespace change
(e.g. hot remove/add)
- In this case, type is set to SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE
- The initiator sends a new AER admin command, hitting the second path
where we return without adjusting the outstanding ios.
Fixes: 1552
Change-Id: I45f781966cc1e9a601b2305c7985a21154d802e8
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3854
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: JinYu <jin.yu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There is a fatal bug that could easily cause data corruption when using
thin-provisioned blobs. In blob_request_submit_rw_iov(), we first get
lba by calling blob_calculate_lba_and_lba_count(),
blob_calculate_lba_and_lba_count() calculates different lbas according to
the return of bs_io_unit_is_allocated(). Later, we call bs_io_unit_is_allocated()
again to judge whether the specific cluster is allocated, the problem is it may
have be allocated here while not be allocated when calling blob_calculate_lba_and_lba_count()
before. To ensure the correctness of lba, we can do lba recalculation when
bs_io_unit_is_allocated() returns true, or make
blob_calculate_lba_and_lba_count() return the result of
bs_io_unit_is_allocated(), use the second solution in this patch.
By configuring more than one cpu core, md thread will run in a separate
SPDK thread, this data corruption scenario could be easily reproduced
by running fio verify in VMs using thin-provisioned Lvols as block
devices.
Signed-off-by: Sochin Jiang <jiangxiaoqing.sochin@bytedance.com>
Change-Id: I099865ff291ea42d5d49b693cc53f64b60881684
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3318
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
According to page35 in recent NVMe-oF spec (
NVMe-over-Fabrics-1.1-2019.10.22-Ratified), ioccsz is used
to restrict the incapsule size of I/O command, so do not
restrict the NVMe-oF OPC command and also the admin command.
We accidently trigger an bug in kernel since we do not send
the fabrics command with the incapsule and make the kernel
coredump, though the kernel has bugs.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I869a2c8ab7b9c2ac1e5cc5b603920662591c2c64
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3837
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The bdev_examine_bdev api will examine a bdev explicitly. After
disabling the auto_examine feature, a user could call
bdev_examine_bdev to examine a specific bdev he/she wants.
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: Ifbbfb6f667287669ddf6175b8208efee39762933
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3219
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If the transport is broken, we should set errno code in
spdk_nvme_ctrlr_process_admin_completions instead of keeping silence.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ie73763e1329e12a8c82a0223d360991f86c39be3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3773
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This allows for much more granular control over the timeout.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ib23de21e60eec4207c55320579699edf284f4e16
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3794
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Generally, this patch did the following work:
Remove the destruct poller. I think that we do not need this,
the destruct poller is specially for Softwaare RoCE case.
Since SoftRoCE will not have IBV_EVENT_QP_LAST_WQE_REACHED event,
we will not wait the last_wqe_reached flag when srq is enabled.
So we can avoid using the poller.
And the purpose of this patch is to solve the coredump issue.
For example, if we run rdma local test such as, e.g.,
test/nvmf/host/bdevperf.sh --transport=rdma
The coredump reason: the qpair is freed twice. Because for RDMA transport,
we do not really remove the qpair from the group if the upper layer
does it.
The first time is called by nvmf_rdma_destroy_drained_qpair in nvmf_rdma_poller_poll,
and the second time is called by nvmf_rdma_qpair_reject_connection in
in nvme_rdma_close_qpair. Since nvme_rdma_close_qpair will always called,
so we need make sure that the qpair will be close after calling this function.
Otherwise we will have the double free qpair. So our approach here is add a flag
("to_close")in rqpair structure and make sure the rqpair be freed after the
"to_close" is set nvme_rdma_close_qpair
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I6f97debbcd29bbb7c6e3f9725907b4102a1d2892
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3661
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Add MaxR2TPerConnection to iSCSI global options and make it configurable
by JSON RPC.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ida95e5c7dac301a22520656709e1aa4d611f31ef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3777
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
By the recent refactoring, we have no static size array for outstanding
R2Ts per connection. It looks that we do not have any critical reason
to prohibit us from making max outstanding R2Ts per connection configurable.
There are some use cases to use large write I/O intensively (e.g. 128KB).
Let such use cases change the value of max R2Ts per connection by their
responsibility to do performance tuning.
Maximum outstanding R2Ts per task are defined both for iSCSI target
and NVMe-TCP target but maximum outstanding R2Ts per connection is
unique for iSCSI target.
The next patch will add the corresponding iSCSI option.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4f6fd3c750a9a0a99bcf23064fe43a3389829aa9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It is likely that the raw number 8 in the macro NUM_PDU_PER_CONNECTION
means 2 * DEFAULT_MAXR2T and the raw number 2 means R2T and Data Out, but
is not certain.
On the other hand, the next patch will make the max number of outstanding
R2Ts per connection configurable.
As a preparation to the next patch, add 2 * DEFAULT_MAXR2T explicitly
to the macro NUM_PDU_PER_CONNECTION.
The next patch will replace DEFAULT_MAXR2T by an new variable.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I8a3be14d53c0abf11d7aade401386601d8fe6c11
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3783
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Other count variables in iSCSI library have used uint32_t rather
than int.
Change the type of spdk_iscsi_conn::pending_r2t from int to uint32_t
and add assert to check if pending_r2t is not negative.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9bd296c0142b0808ae822952277c9ecc133e5f62
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3775
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add MaxLargeDataInPerConnection to iSCSI global options and make
it configurable by JSON RPC.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibcd16da2eac64241217bedeb89a7929bbdc67871
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3756
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For some use case that there is heavy large read I/O, the performance
bottleneck due to MAX_LARGE_DATAIN_PER_CONNECTION was reported.
The following assumes that all I/Os are large read.
Large read primary task whose I/O size is more than
SPDK_BDEV_LARGE_BUF_MAX_SIZE (=64KB) is split into multiple
read subtasks.
spdk_iscsi_globals::MaxQueueDepth limits maximum number of outstanding
read primary tasks, and MAX_LARGE_DATAIN_PER_CONNECTION (=64)
limits maximum number of outstanding read subtasks.
MAX_LARGE_DATAIN_PER_CONNECTION is also used to calculate PDU pool.
To remove the performance bottleneck, change the macro constant
MAX_LARGE_DATAIN_PER_CONNECTION to a global variable
spdk_iscsi_globals::MaxLargeDataInPerConnection.
We don't see any negative side effect if we set
spdk_iscsi_globals::MaxLargeDataInPerConnection to 64.
The use case that reported the performance issue will change the
value of spdk_iscsi_globals::MaxLargeDataInPerConnection by its own
responsibility.
The next patch will add the value of
spdk_iscsi_globals::MaxLargeDataInPerConnection to iSCSI options,
and make it configurable by JSON RPC.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifc30cdb8e00d50f4d3755ff399263cf5d0b681b6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3755
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Helps us avoid adding a new I/O qpair while the ctrlr
is being destroyed.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I3bf9318b075125b9d432b885fa9f6f2f44d422d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3686
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For the login redirection feature, the current implementation works
only if a portal is redirected from an initial portal to a redirect
portal. However, the login redirection feature should work even if a
portal is redirected from one redirect portal to another redirect
portal.
A public portal group knows only a redirect portal and does not know
the portal group of the redirect portal.
Moreover, it is very likely that an initial portal and a redirect portal
exist in different SPDK iSCSI target applications.
To cover all these concerns, add an new iscsi_target_node_request_logout
RPC to request connections whose portal group tag match for the target
node.
To cover potential use cases, make the second parameter portal group
tag optional.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I612672490722fb22fd4eba055998b7408ab84ca5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3780
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
As written in doc/iscsi.md, typically the login redirection feature
will be used in scale out iSCSI target system, which runs multiple
SPDK iSCSI target applications.
In scale out iSCSI target system, the initial portal, the current
redirect portal, and the next redirect portal are likely to be in
different SPDK iSCSI target applications.
In this case, asynchronous logout request should be sent independently
from the iSCSI target application which has the current redirect portal.
However, we had added asynchronous logout request into the iSCSI target
application which has the next redirect portal. This idea works only
for the case that login is redirected from the initial portal to a
redirect portal.
We remove asynchronous logout request from iscsi_target_node_redirect()
in this patch, and update the corresponding help documents.
The next patch will add an new RPC to send asynchronous logout
request to all connections to the specified portal group and the
specified target.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib0ac72e8cdad7e8c64e446b7495e572fac4b5bae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3779
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This data structure is not used.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I143fb9256f692d7bd9bb5e14cdc479f64ddcef45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3746
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
1. Retrieve actual IBV state when we receive WC with bad status
2. Don't log an error if WC status is IBV_WC_WR_FLUSH_ERR. This
means that we are performing qpair cleanup and this WC is expected.
Change-Id: Id23634092f537861e66ca0f83ab79db9e052507b
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3736
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
From the time a shutdown is initiated the controller shall disable
Keep Alive timer.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Id499dabce1913b9da2f0b3fd961fdfc8b621afa9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3462
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
After CC.EN transitions to ‘0’ (due to shutdown or reset), the
association between the host and controller shall be preserved for at
least 2 minutes. After this time, the association may be removed if
the controller has not been re-enabled.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Change-Id: I4734600067fd4b7306b46f1325fdd5031e81c079
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2984
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This call can be made directly now that
spdk_nvmf_qpair_disconnect is thread safe. It's
actually better that we do it this way, because
the qp destruct call is guaranteed to block until
the ib events associated with it are acknowledged.
this means that by processing the disconnect before
we ack the event, we will have valid memory to do
the atomic checks.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: If6882b7dc568fe4c35f4a35375769634326e9d76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3681
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We should use this function as the synchronization point
for all qpair disconnects.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic685ac3481765190cc56eeec3ee24dad52e336c9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3675
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This function should be the synchronization point for all
disconnects regardless of whether they begin on the transport,
from an RPC, or in response to application termination.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: If3553ab3a9e265b0938c84832cb9f774852d7565
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3674
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
SPDK NVMe-oF controller creates a ANA group for each namespace,
ANA group ID matches namespace ID, and default ANA state of ANA group
is optimized, and the MNAN field is set equal to the NN field.
If a ANA log page contains multiple ANA group descriptors, it has
one or more descriptors will not be 8 bytes aligned. Hence we create
one descriptor and copy it to the ANA log page at a time.
Change count will be supported later.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I56ba6aa78983480caa3dfbf22aefc9aeabfd5405
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2920
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
FTL core poller should return SPDK_POLLER_BUSY flag only
when some writes operations were processed.
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Change-Id: I50e2b536fbec819887148cc045d76c5c5d78beb2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3619
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
There are 2 messages passed between when
_nvmf_ctrlr_free_from_qpair is executed and when
nvmf_ctrlr_destruct is executed. That leaves time
when the controller->qpair_mask is not a valid
pointer, but it is still in the subsystem
controllers list.
The purpose of this patch is to close that hole.
It is part of a larger series aimed at cleaning up
the controller destruct path.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I0c0199c8392ee278f36df56f599beb10e7a46948
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3685
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This API differs from spdk_nvmf_tranpsort_stop_listen in
that it also disconnects the qpairs associated with
that listener.
Change-Id: Iadfc6d2debc0ef8f1a8cd5db4f20168aeae8264d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3279
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If the portal group map of the target has a redirect portal,
iscsi_tgt_node_is_moved() fills the buffer by the redirected address
and returns true.
iscsi_op_login_check_target() calls iscsi_tgt_node_is_redirected() before
calling iscsi_tgt_node_access() because login redirection can be
checked before any or after all security check.
If iscsi_tgt_node_is_redirected() returns true, notify login redirection
to the corresponding initiator.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4573a69c0a32eafcfe48080a033c135e127da321
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3221
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
iscsi_tgt_node_redirect() updates redirect portal of the initial
portal iin a primary portal group for the target node.
Check if the specified portal group is a public portal group and is
mapped to the target node first.
Then if the passed IP address-port pair is NULL, clear the current
redirect setting. Public portal group and private portal group are
clearly separated and redirect portal must be chosen from a private
portal group. Hence this clear method is intuitive and simple.
If the passed IP address-port pair is not NULL, check if they are
valid, and are not in the specified portal group. Then update a
redirect portal of the portal group map.
Finally, send asynchronous logout request to all corresponding
initiators.
Besides, change allocating pg_map from malloc to calloc to initialize
redirect portal.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I79d826663f4c3d5a117add286f133adeb1ce07f5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3222
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>