There are more transport on the way and we don't want to add
all their various opts into the single, generic structure.
We'll pass the JSON structure to transports instead. Then
the transport code can custom pull from the JSON any param
it wants.
To complement that, transports will now also have their own
JSON config dump callback. This was only done in the generic
nvmf.c so far, with conditions for RDMA and TCP.
Change-Id: I33115a8d56cec829b1c51311a318e0333cc64920
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Signed-off-by: jiaqizho <jiaqi.zhou@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2761
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For example:
Got JSON-RPC error response
response:
{
"code": -32602,
"message": "Invalid transport type 'rdma'\n"
}
The \n here is redundant.
Change-Id: I30a22f93f2be2550fdbe2af2d90eaa1c381dc7ae
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4655
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This will disconnect all connections to a subsystem from a given
host identified by HOSTNQN.
Change-Id: Ibc9cea1f08a58a05dbac3a0bb47df8d8a58e7c10
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4556
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
There is nothing left here, so remove it.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib947d42bc577dbebb4650b1be885e05a80f8f8cf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4541
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI
Add an new API spdk_nvmf_subsystem_add_ns_ext() to pass not bdev but
bdev_name to fix the race condition due to the time gap between
spdk_bdev_get_by_name() and spdk_bdev_open(). A pointer to a bdev is
valid only while the bdev is opened.
spdk_bdev_open() has been replaced by spdk_bdev_open_ext() but the
issue still existed.
Update the corresponding unit tests accordingly.
Then replace the internal of spdk_nvmf_subsystem_add_ns() by
spdk_nvmf_subsystem_add_ns_ext() call.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifcaa2121129ef22d5e61c9a8f7c640ff37a64485
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4485
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There are operations on nvmf which depends on proper values of qpair
attributes which can be intepreted as internal state.
e.g.
nvmf_ctrlr_process_fabrics_cmd execution relies on qpair->ctrlr
spdk_nvmf_qpair_disconnect relies on qpair->disconnect_started
As poll group add is like a registration of qpair into nvmf lets try
to initialize it to a defined and expected state.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I10494e7f70ff58ec5460cab1de8a52fd21cc4a48
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4479
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch removes the string from register component.
Removed are all instances in libs or hardcoded in apps.
Starting with this patch literal passed to register,
serves as name for the flag.
All instances of SPDK_LOG_* were replaced with just *
in lowercase.
No actual name change for flags occur in this patch.
Affected are SPDK_LOG_REGISTER_COMPONENT() and
SPDK_*LOG() macros.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I002b232fde57ecf9c6777726b181fc0341f1bb17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4495
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Mellanox Build Bot
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
a pause
This now also takes a lock instead of requiring a pause of the whole
subsystem.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I7de174f3f56d2b3767e723387c4f2257107d8b19
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4581
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The list of allowed hosts is only checked during handling of CONNECT
commands - not in the main I/O path. Protect that list with a mutex
instead of requiring a full pause of the subsystem to allow
dynamic management of the allowed hosts without impacting any
active I/O.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3f7e87cc1fa6de200c422928c07153fc60fab28c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4555
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Pack all of the hot data into the first cache line. The first cache line
covers everything up to and including the ctrlrs TAILQ.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I184520661743aec91b3bb3d81e53fe8610c9383e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4554
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This saves 2 bytes and allows it to pack nicely with the
changing state bool (which must remember separate for atomic
operations).
Change-Id: Ibb92ae3c74306e60385ae23d0aaf877f33a69095
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4553
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Without this change nvmf_ctrlr_create() will fail to lookup
the subsystem listener matching this qpair.
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Change-Id: I855baa16e996737b60dbd745ce84f8c0bc024cf1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4450
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
LOCAL_SYS_LIBS is meant to define *direct* system
library dependencies for a given library. libuuid
is directly used by the SPDK util library and then
other SPDK libraries use uuid indirectly through
util.
So only the util library should include uuid in
LOCAL_SYS_LIBS.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia0d2d63f48e6f89891164cf2f9dc4c7a6476d4e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4366
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It should be 16 but not 6. For example, it will have 16 priorities
when configuring ADQ with Intel's 100G NIC.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iebdf7b379c15f3b5fd16dba2ad87ec55af04577f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4235
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In the both normal and exceptional case, the mutex
will need to be destroyed.
Change-Id: I39c815f2adffbd3786b45a938c476dcbb66a438f
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4339
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: JinYu <jin.yu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Unlike ADMIN and IO commands, the FABRIC command is only processed
in the ctrlr.c file.
Change-Id: Ic4e01c7f81c98631a2c7cb603343b301f8ba63e1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4307
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The poller is now created internally to the library whenever a target
is constructed. Applications are not expected to poll for connections
any longer.
Change-Id: I523eb6adcc042c1ba2ed41b1cb41256b8bf63772
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3583
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Remove some of the boilerplate code from each case and
replace with just an spdk_msg_fn assignment.
This also reduces the size of an upcoming change needed
in this function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia209073cfb66032f2cca6bb44a09e1984ef2110c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4257
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In C language, we cannot use constant at compile time. Hence the
local array _ana_desc[] is not a fixed size array but a variable
length array.
We can avoid using variable length array by changing const variable
to macro constant.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7333a8078d3102c4bd5088f56f6530846854c85f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4093
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add an new RPC, nvmf_subsystem_listener_set_ana_state.
Find the specified subsystem listener, and then set the ANA state
of the listener by calling nvmf_subsystem_listener_set_ana_state().
By adding a string and an enum to the existing context structure,
nvmf_rpc_listener_ctx, and adding an operation type to the existng
enum, nvmf_rpc_listen_op, reuse the existing code and data as much
as possible.
Besides, insert line break into a few long lines and fix wrong
error log.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6fb2dfbb1f9c5f56848eba21d2a733fbed802614
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4080
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add an internal API nvmf_subsystem_set_ana_state() to change the
ANA state of the subsystem listener whose trid matches.
ANA optimized state, ANA non-optimized state, and ANA inaccessible
state are supported. ANA change state is not used and ANA persistent
loss state is not supported.
After changing the ANA state of the subsystem listener, on each poll
group, controllers, whose the subsystem listener match, send ANA
change notice.
Initiators query ANA log page anyway if they receive ANA change
notification. False positive notification should be avoided but is
acceptable.
To avoid any concurrency conflict, simply compare ctrlr->listener and
the passed listener.
It may be better to execute nvmf_subsystem_set_ana_state() on the
subsystem thread but currently the RPC thread adds and removes a
listener to and from the subsystem, respectively, and the subsystem
has been suspended while executing nvmf_subsystem_set_ana_state().
Hence we keep this as a future enhancement.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If1910b79dd33d904114e258ae2c5e868947cdc52
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4079
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
If the ANA reporting feature is enabled for the subsystem,
- set ANA Change Notice of Asynchronous Event Configuration to 1
- set ANA Change Notice of Optional Asynchronus Event Supported to 1
- set ANA Non-Optimized state and ANA Inaccessible state of ANA
Capability to 1.
ANA Change state is not used and ANA Persistent Loss state is not
supported for now.
The next patch will actually support ANA Change Notice using an new
RPC.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4db2e33dd2879cdf995adcab41ef53728b27a201
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4087
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If we are already in the desired state,
just call the callback directly from the
subsystem_state_change function. That way
we save a lot of message passing.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I6cf8563524610d9125d53266e3c0e179e064bf63
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3760
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is important to avoid doubling up on state changes
and hitting asserts.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: If8797ea13a5c224cee85e53e9b2542012423b37f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3759
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There is no need for this interface to be async.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I1f21b53e90b7d165b6b5fb2e1226ce7591966b58
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4181
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It was introduced for the purpose of executing fabric cmds when
subsystem and qpairs are not active. It was rather workaround than
solution for transport type like vfio-user. spdk_nvmf_request_exec
is a preferred way of passing request obj into nvmf layer.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I4f989de27bfd494c744017599909c2e200f0f233
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4180
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add an new RPC, nvmf_subsystem_get_listeners.
ANA state is per listener and per subsystem, and is stored in
subsystem listener. We can return ANA state by the existing
nvmf_get_subsystems RPC but it's confusing that listen addresses
have ANA states.
To change ANA state, we will provide a RPC to change ANA state of
only one selected subsystem listener.
To query ANA state, it will be convenient to get ANA states of all
listeners of one selected subsystem.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic3baad6eac65d7af6e0cab2c4059e1458d41e6e2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4059
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Data structure and macro constants for multiple listen addresses
and namespaces are not used anywhere in nvmf_rpc.c
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idd8bc61e22f9e9918a88f017a024cab239ff5e53
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4060
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add an new RPC, nvmf_subsystem_get_qpairs to retrieve the list of
qpairs of an NVMe-oF subsystem.
This RPC will be usable to verify if NVMe ANA works.
Pause and resume the subsystem to access the qpairs safely.
One subtle issue remains. The JSON RPC returns success even if
resuming the subsystem fails. Write FIXME to address this.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9d90a01b1117dee00d85b2e21b4f4d02d80db531
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4050
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Usage of spdk_thread_get_count is wrong since there might be many
threads allocated by other modules. Transport buffers are used by
transport poll groups, their number is equal to the number of cores.
Change-Id: I4bc748e93c3b204bf3b3ec73f17257b927a7f428
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3882
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When we try to evenly divide transport buffers between poll grouos,
e.g. when we run spdk_tgt on 8 cores, set num_shared_buffers=32768
and pg buf_cache_size=4096, the last pg can't retrieve enough
buffers to fill cache. In my case if only got 4040 buffers out of
4096. Missing 56 buffers were cached by previous poll groups.
That occurred due to mempool has per lcore cache of 512 elements
and when it becomes empty, the cache is refilled. It seems that
each poll group cached extra 8 buffers.
The issue doesn't occur when we use mempool_get_bulk.
Change-Id: I866d58aa03986a3cffe27402b12f9a2519097f83
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3881
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Factor out the internal of rpc_nvmf_subsystem_get_controllers() into
a function rpc_nvmf_subsystem_query() to use it for the upcoming RPC,
nvmf_subsystem_get_qpairs.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibe62bcfadf6b33ef26c018a3667f280b6fcd8fdf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4049
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For nsid, use SPDK_NVME_GLOBAL_NS_TAG rather than raw number
0xffffffff wherever possible.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I23e989786263172e13bab40c011cf58beb06fabf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4055
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This can happen and we should make a best effort to return
the subsystem to a coherent state when it does.
maybe fixes: issue #1416
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic3d0376984733e6664295305be82fca678c515b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3437
Community-CI: Broadcom CI
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This can happen and we should be prepared for it.
Maybe fixes: issue #1416
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I77f48dbcabf702f88df56ad7e866bbcb830fc239
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3393
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For I/O commands, block them if ANA state is inaccessible, persistent
loss, or change.
For Identify command, clear capacity field (nuse) to 0 if ANA state
is inaccessible or persistent loss.
For Get Features command, block features, error recovery, write
atomicity normal, reservation notification mask, and reservation
persistence if ANA state is inaccessible, persistent loss, or change.
For Get Log Page command, error information page does not return
any data yet, and hence there is no change.
For Set Features command, if ANA state is inaccessible or change,
block the command if NSID is 0xFFFFFFFF or if feature is error recovery,
write atomicity normal, reservation notification mask, or reservation
persistence, or if ANA state is persistent loss, block the command.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I15dd593227e451aa2247c53da42b6acad1757907
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4043
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add ANA state to struct spdk_nvmf_subsystem_listener and initialize
it to optimized.
Then ctrlr->listener->ana_state is referred when creating ANA log page.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I978424e51d3f23ca72dee30192bc2693abfe203d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4012
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We will have ANA state per listener and per subsystem. On the other hand,
NVMe specification defines ANA state per controller.
However, it is possible that I/O qpair and admin qpair are different
listeners on a single controller.
Let's check if I/O qpair is on the same listener as admin qpair if
ANA reporting is enabled.
The case that I/O qpair is on a different listener from admin qpair
is not usual and so the purpose of this check is just to guard SPDK
from any unexpected behavior.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idb8d255de7f998e45a59a120c2ed5803258873f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4026
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Find the subsystem listener whose trid matches req->port->trid when
creating a controller, and store it in the controller.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iea343b8d8ae827b554df2245b67aed113469c592
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4010
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add trid to struct spdk_nvmf_qpair and initialize it at initialization.
admin_qpair->trid will be used to get the corresponding
subsystem_listener via nvmf_subsystem_find_listener() and add it to
struct spdk_nvmf_ctrlr in the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0d1a41aede60de88747eff16c7e04f63d0702596
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4009
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The new function () will be used in the following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I788cfb38d75c3f1f64e1754912b776a80f0f1be8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4007
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We stopped the poller to early, so we were not able to
reap all completions on ibv CQ, so RDMA qpair was not freed.
This patch stops the poller when all references to poll group
are released (all qpairs are destroyed)
Fixes#1578
Change-Id: I15c1697db13aef9da7567c7312476306c3ee1d62
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3962
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
ANA transition time shall be non-zero if controller supports ANA
reporting. Linux NVMe host sets this value to 10, and we don't
have any reason to change from that.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I61396695dacf47fad40e3cea3311e555729d9e3e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3909
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This will be used in another place later.
This patch is part of a series aimed at improving recovery
when we are fail to change the subsystem state.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I24bfbeb3d006584003164540d6ede540dbcafa86
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3392
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Failed qpair will be destroyed on generic nvmf layer during handling
of error code returned from spdk_nvmf_poll_group_add.
The current approach leads to heap-use-after-free.
Change-Id: I99331150fa36a3c3c18176589afb973dee449b3a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3538
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Currently rdma acceptor handles only one ibv event per poll
Taking into account the default acceptor poll rate (10ms), it can
take a long time to handle e.g. LAST_WQE_REACHED events when we
close huge amount of qpairs at the same time.
This patch allows to handle up to 32 ibv events per acceptor poll.
Change-Id: Ic2884dfc5b54c6aec0655aaa547b491a9934a386
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3821
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
SPDK poller uses microsecond as the input parameter, so we need to
change the correct value when opts.association_timeout is expressed
by millisecond.
Change-Id: Ia674f0115ea176b998e4c0c70b8ce75b28984701
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3861
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
After supporting ANA reporting by default, Linux kernel 5.3 reported
error when parsing NVMe ANA log. The newer kernel fixed the issue
but we should optionalize ANA reporting feature to avoid error for
Linux kernel 5.3 or before.
Add a bool variable ana_reporting to struct spdk_nvmf_subsystem
and disable ANA reporting and initialization of related variables
if it is false. We can expose MNAN (Maximum Number of Allowed
Namespaces) even if ANA reporting is disabled. But MNAN is not
required if ANA reporting is disabled. So do not set MNAN if it is
false too.
Add a public API spdk_nvmf_subsystem_set_ana_reporting() to set
ana_reporting by the nvmf_create_subssytem RPC.
The next patch will add ana_reporting to nvmf_create_subsystem RPC.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Icc77773b4c9513daba2f1a9fdaf951d80574f379
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3850
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add an new RPC, nvmf_subsystem_get_controllers to retrieve the list
of NVMe-oF controllers of an NVMe-oF subsystem.
One of the main use cases will be to get identification information
of NVMe-oF controllers to configure their ANA states dynamically.
Pause and resume the subsystem to access the controllers safely.
One subtle issue remains. The JSON RPC returns success even if
resuming the subsystem fails. Write FIXME explicitly to address this.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibf8d1cf56850a705e343b86022d101b4c7204199
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3848
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A subsystem RPC is not transitioned to a paused state when there
are ios outstanding (tracked by subsystem poll group).
In general AERs, are not tracked as outstanding IOs. However,
there are 3 paths in nvmf_ctrlr_async_event_request which do not
adjust the outstanding io count.
If we get into any of these 3 paths, the subsystem pause can hang
forever.
The issue was reproduced with hot plug stress testing under load.
We can get into the second path (SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE)
under these circumstances:
- An AER completion is sent to the initiator due to a namespace change
(e.g. hot remove/add)
- In this case, type is set to SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE
- The initiator sends a new AER admin command, hitting the second path
where we return without adjusting the outstanding ios.
Fixes: 1552
Change-Id: I45f781966cc1e9a601b2305c7985a21154d802e8
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3854
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: JinYu <jin.yu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Generally, this patch did the following work:
Remove the destruct poller. I think that we do not need this,
the destruct poller is specially for Softwaare RoCE case.
Since SoftRoCE will not have IBV_EVENT_QP_LAST_WQE_REACHED event,
we will not wait the last_wqe_reached flag when srq is enabled.
So we can avoid using the poller.
And the purpose of this patch is to solve the coredump issue.
For example, if we run rdma local test such as, e.g.,
test/nvmf/host/bdevperf.sh --transport=rdma
The coredump reason: the qpair is freed twice. Because for RDMA transport,
we do not really remove the qpair from the group if the upper layer
does it.
The first time is called by nvmf_rdma_destroy_drained_qpair in nvmf_rdma_poller_poll,
and the second time is called by nvmf_rdma_qpair_reject_connection in
in nvme_rdma_close_qpair. Since nvme_rdma_close_qpair will always called,
so we need make sure that the qpair will be close after calling this function.
Otherwise we will have the double free qpair. So our approach here is add a flag
("to_close")in rqpair structure and make sure the rqpair be freed after the
"to_close" is set nvme_rdma_close_qpair
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I6f97debbcd29bbb7c6e3f9725907b4102a1d2892
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3661
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Helps us avoid adding a new I/O qpair while the ctrlr
is being destroyed.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I3bf9318b075125b9d432b885fa9f6f2f44d422d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3686
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This data structure is not used.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I143fb9256f692d7bd9bb5e14cdc479f64ddcef45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3746
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
1. Retrieve actual IBV state when we receive WC with bad status
2. Don't log an error if WC status is IBV_WC_WR_FLUSH_ERR. This
means that we are performing qpair cleanup and this WC is expected.
Change-Id: Id23634092f537861e66ca0f83ab79db9e052507b
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3736
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
From the time a shutdown is initiated the controller shall disable
Keep Alive timer.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Id499dabce1913b9da2f0b3fd961fdfc8b621afa9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3462
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
After CC.EN transitions to ‘0’ (due to shutdown or reset), the
association between the host and controller shall be preserved for at
least 2 minutes. After this time, the association may be removed if
the controller has not been re-enabled.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Change-Id: I4734600067fd4b7306b46f1325fdd5031e81c079
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2984
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This call can be made directly now that
spdk_nvmf_qpair_disconnect is thread safe. It's
actually better that we do it this way, because
the qp destruct call is guaranteed to block until
the ib events associated with it are acknowledged.
this means that by processing the disconnect before
we ack the event, we will have valid memory to do
the atomic checks.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: If6882b7dc568fe4c35f4a35375769634326e9d76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3681
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We should use this function as the synchronization point
for all qpair disconnects.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic685ac3481765190cc56eeec3ee24dad52e336c9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3675
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This function should be the synchronization point for all
disconnects regardless of whether they begin on the transport,
from an RPC, or in response to application termination.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: If3553ab3a9e265b0938c84832cb9f774852d7565
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3674
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
SPDK NVMe-oF controller creates a ANA group for each namespace,
ANA group ID matches namespace ID, and default ANA state of ANA group
is optimized, and the MNAN field is set equal to the NN field.
If a ANA log page contains multiple ANA group descriptors, it has
one or more descriptors will not be 8 bytes aligned. Hence we create
one descriptor and copy it to the ANA log page at a time.
Change count will be supported later.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I56ba6aa78983480caa3dfbf22aefc9aeabfd5405
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2920
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There are 2 messages passed between when
_nvmf_ctrlr_free_from_qpair is executed and when
nvmf_ctrlr_destruct is executed. That leaves time
when the controller->qpair_mask is not a valid
pointer, but it is still in the subsystem
controllers list.
The purpose of this patch is to close that hole.
It is part of a larger series aimed at cleaning up
the controller destruct path.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I0c0199c8392ee278f36df56f599beb10e7a46948
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3685
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This API differs from spdk_nvmf_tranpsort_stop_listen in
that it also disconnects the qpairs associated with
that listener.
Change-Id: Iadfc6d2debc0ef8f1a8cd5db4f20168aeae8264d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3279
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Purpose: To make the pdu management consistent with other PDUs, then
we can easily adapt our code into some hardware offloading solution.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ic4a2847fd1b6cacda4cbaa52ff12c338f0394805
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3588
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
RDMA target can't handle bidirectional xfer type, in debug build
it throws an assert in nvmf_rdma_setup_wr function. NVMF controller
performs checks od opcodes, but the failure happens before this
check. Add similar validation in TCP transport.
Change-Id: I14400b9c301295c0ae1d35a4330189d38aeee723
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3436
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
CC.EN, CSTS.RDY should not be modified during shutdown.
It doesn't make much sense (against nvme spec) and nvmf spec 1.1
doesn't mentioned it (4.6) either.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I7014b10b0217db61c3d380d5c0843808e54577cd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3477
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Currently we don't resubmit receive request associated with AER
request to SRQ. This leads to reducing of SRQ elements and may
lead to non responsive NVMF target.
Fixes#1507
Change-Id: Ie96f8c4be0202ae973e561ebe5ea28688a6a3b72
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3558
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Since rqpair->qpair.group is set to NULL when we remove the
qpair from poll group, we fail to send event to qpair's thread.
This patch adds a pointer to io_chaneel to spdk_nvmf_rdma_qpair
structure and a function to handle poll_group_remove transport
operation. In this function we get io_channel from nvmf_tgt,
this channel will be used to get a thread for sending
async event notification. This also guarantees that the thread
will be alive while we are destroying qpair.
Change-Id: I1222be9f9004304ba0a90edf6d56d316d014efda
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3475
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The new abort functionality doesn't take custom admin cmd
handlers into account.
This commit allows setting a custom admin cmd handler
for abort that provides the ability to influence the
bdev lookup to which the abort is sent to.
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Change-Id: I3a66c6f863f5ee4d89cb2194dffdc6855945fa8a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3485
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
struct spdk_nvmf_request holds req_to_abort and so passing req_to_abort
separately is not really necessary now. The internal API
nvmf_ctrlr_abort_request() was added at the stage of prototyping.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9ef2467d6f92422f044650c62a0777b95c0fc1ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3488
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
1 Change the default factor from 4 to 8, which can be used
to improve the performance.
2 Change the base buffer size in nvme_tcp.c,
we should not use sizeof(struct spdk_nvme_tcp_cmd),
it is 72 bytes. Normally, the initiator will receive
C2h pdus and R2T Pdus by most, so set the size of using
sizeof(struct spdk_nvme_tcp_c2h_data_hdr) is enough.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I384f4cb026cb8d83e75b639f7256ee8cb8ed1df1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3283
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There is no reason to continue processing these requests if the
qpair is not still active. We should complete them and free
any resources they are still holding.
Also, not doing so can cause issues with trying to access pointers
in the qpair after they are invalid. See issue #1460.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I6e570a576983dfedf726dc4a9a83316209403e00
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3451
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Make the abort execution timeout value as optional.
Zero is acceptable and means immediate timeout.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia4b03c65b8bd15899f48be9476ee657446147581
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3104
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If the state of the request is TRANSFERRING_HOST_TO_CONTROLLER,
we cannot abort it now but may be able to abort it when its state
is EXECUTING. Hence wait until its state is EXECUTING, and then
retry aborting.
The following patch will make the timeout value configurable as
an new transport option.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia4b43e79c3b0d9c53ed04b01a9eaa9b117b32d81
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3013
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If the request is queued and is not in completing, we can abort
it safely.
If the state of the request is NEED_BUFFERING, the request is
queued to tqpair->group->group.pending_buf_queue.
If the state of the request is DATA_TRANSFER_TO_CONTROLLER_PENDING,
the request is queued to rqpair->pending_rdma_read_queue.
If the state of the request is DATA_TRANSFER_TO_HOST_PENDING,
the request is queued to rqpair->pending_rdma_write_queue.
According to the current state, dequeue from the corresponding
queue, and then call an new helper function
nvmf_rdma_request_set_abort_status().
Using helper function will be easier to read.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id0327f4d2c4728a11b3b6bbc7c2252f0b35263cf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3012
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Call nvmf_ctrlr_abort_request() if the request whose CID matches
is found and its state is executing.
nvmf_rdma_qpair_abort_request() returns immediately if rc is
SPDK_NVMF_REQUEST_EXEC_STATUS_ASYNCHRONOUS, or calls
spdk_nvmf_request_complete() otherwise.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1462a21db7270f3d63f8f293ad4be61d52e74da1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3011
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
If the state of the request is TRANSFERRING_HOST_TO_CONTROLLER,
we cannot abort it now but may be able to abort it when its state
is EXECUTING. Hence wait until its state is EXECUTING, and then
retry aborting.
The following patch will make the timeout value configurable as
an new transport option.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I98347b68e8b6b4a804c47894964cb81eae215aaa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3010
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
If the request is queued and is not in completing, we can abort
it safely.
If the state of the request is NEED_BUFFERING, the request is
queued to both tqpair->group->group.pending_buf_queue and
the queue per state.
If the state is AWAITING_R2T_ACK, the request is queued to the
queue per state.
Dequeueing from the queue per state is done in
nvmf_tcp_req_set_state(). Hence explicit dequeuing only when the
state of the request is NEED_BUFFERING.
Most abort operation is common between two cases. We can use fallthrough
in switch-case but factor out the common operation into a helper
function nvmf_tcp_req_set_abort_status() instead because we may use
the helper function in future and using helper function is easier to
read than fallthrough.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1695b084d5d1f2537fbdd512bc3cd136e0f6a65b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3009
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Call nvmf_ctrlr_abort_request() if the request whose CID matches
is found and its state is executing.
nvmf_tcp_qpair_abort_request() returns immediately if rc is
SPDK_NVMF_REQUEST_EXEC_STATUS_ASYNCHRONOUS or calls
spdk_nvmf_request_complete() otherwise.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1abceecc211ee79d8ac18a82dc63b13d313a6f27
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3008
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
State machine is different among NVMe-oF transports and is
encapsulated to the transport neutral NVMe-oF controller and
NVMe-oF qpair.
To implement abort operation for each NVMe-oF transport,
add a function pointer qpair_abort_request to struct spdk_nvmf_transport_ops
and a stub nvmf_transport_qpair_abort_request() to encapsulate
which transport is used.
The following patches will implement qpair_abort_request for each
transport. Each qpair_abort_request() is responsible to call
spdk_nvmf_request_complete() for the abort request.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2beac959ed428c5108cf33691226b7fae5cd24d6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3007
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Factor out abort operation on the specific qpair into a helper
function nvmf_qpair_abort_request().
After this refactoring, nvmf_ctrlr_abort_done() calls
_nvmf_request_complete() only if the passed status is zero.
If the passed status is not zero, nvmf_qpair_abort() is responsible
for calling _nvmf_request_complete() instead.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4828c0e21cc7650210675661d6e1c0fd54c7a2cb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2991
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Description is not clear but according to the NVMe specification,
always set the completion status to success and differentiate only
the bit 0 of CDW0 between success and failure for abort command.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0195e72fe1d7fcc2592f47e9dcf92ac56912282c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1965
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Logic error Dereference of null pointer ctrlr.c
nvmf_ctrlr_async_event_request 1522
Dereference of null pointer is not possible if sgroup obtained using
ctrlr obj. Adding corresponding asserts suppresses the warning.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I78b32fadd5449ee9b533f65193c70e55cf9a8f1c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3251
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This allows users to configure the number of
connection requests outstanding to an rdma port
at once.
RPC included.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I8a2bb86b2fb7565cb10288088d39af763b778703
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3097
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Poller should return status > 0 when it did some work
(CPU was used for some time) marking its call as busy
CPU time.
Active pollers should return BUSY status only if they
did any meangful work besides checking some conditions
(e.g. processing requests, do some complicated operations).
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Id4636a0997489b129cecfe785592cc97b50992ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2164
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It should return "NVME_TCP_PDU_FATAL". I think that
this issue is introduced after we move the data
copy from tcp transport layer to the socket
layer. And it should return "NVME_TCP_PDU_FATAL now",
and it will be consistent with the logic in the same
function.
With this patch, it will fix the big I/O size write
from the initiator.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ide018adb603eb13d002fc98886258dd1e2424f7c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3122
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
10 is kind of unreasonable. You could easily start
seeing failures if you had just 3 intiators trying to
connect with 4 io qpairs at once.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I9985ffa3b03ebb33880eb5934b60eaafab57c82d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3096
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
rdma_qp may not be initialized when qpair is not fully
created. When such a qpair is being destroyed we may pass
a NULL pointer to spdk_rdma_qp_disconnect or spdk_rdma_qp_destroy
and hit an assert. This patch fixes this problem for NVMEoF
target and initiator.
Change-Id: I84787dc1b1211293c2a19f59d47727eaecd9d5a1
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3050
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
SPDK NVMe driver had processed ACL as 1's based value by mistake,
and SPDK NVMe-oF target sets ACL to 0. Hence If NVMe driver connects
to SPDK NVMe-oF transport, spdk_nvme_ctrlr_cmd_abort() always queued
abort request.
Fix this bug to process ACL as 0's based value in
spdk_nvme_ctrlr_cmd_abort(). Besides, initialize ACL explicitly to
0 in spdk_nvmf_ctrlr_identify_ctrlr() for clarification.
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id4f3a469776cdab88bcc6f41e7893885a7b78d8c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2513
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
This API is a wrapper for rdma_accept which allows
to remove spdk_rdma_qp_init_attr::initiator_side.
Change-Id: Iba2be5e74e537c498fb11c939c922b2bbda95309
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
These dependencies were removed in patches that added
RDMA provider. It was incorrect change since it causes
SEGFAULT when SPDK is built with shared libraries
Change-Id: I15f4ff86a75b3d080e1c7c89d75af4959c4ed989
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2900
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Recently support for multiple AERs was added to lib/nvmf, but the fc
transport was not updated.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I1755fdc4a6db2c6ce751c34fe5ad1d43e30298f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2912
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Calling spdk_nvmf_tgt_accept() now automatically assigns new qpairs
to the best available poll group.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3df2a2c5a28dba45c5ba0cbd1e8c28dd7e56cf9e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2813
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When read I/O to the controller failed, clear
rdma_req->num_outstanding_data_wr to zero to avoid
rqpair->current_send_depth goes to negative after completing data
transfer to the host.
This bug was apparent when the outstanding read I/O was aborted.
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I62e439f946ef81ea0b5de5280670ed97fe8d977f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2341
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
struct spdk_nvmf_fc_rq_buf_ls_request does not need to be marked
as packed. The static assert that immediately follows will catch
any inserted padding.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Id11ac9865a9c60f6d147d0f829d272ca791d0336
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2683
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
These returns at the end of functions, so remove them.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I1f54d812956727e1871f4879c8bc9a526d7d14c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2848
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The plan is to push the logic that assigns qpairs to poll groups down
into the nvmf library. To do that, we'll need to have a list of the poll
groups.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Iea59ac1a439dbd1bcae68fb2977a47a855884a15
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2811
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Include completion operation into nvmf_qpair_abort() and rename
it by nvmf_qpair_abort_aer() for clarification. Update unit test
accordingly.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I763cc7d24b979e27e8775f4e69730466a2351bdf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2712
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
nvmf_ctrlr_abort_aer() has not been used anywhere, and so we may be
able to remove it, but let's update it to set the completion status
of the aborted AERs to ABORTED for future potential use cases, and
then rename it by nvmf_ctrlr_abort_all_aer() to avoid name conflict
with the next patch.
Setting the completion status to SUCCESS is not good in this case
anyway.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie49936429b82dd05724cf8f10a1417e9c5304635
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2709
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Some cases need to redirect spdk_nvmf_request_complete() to the
thread pointed by qpair->group->thread. Abort command will be
included such cases but abort command is executed in the different
file ctrlr_bdev.c. For the convenience, change spdk_nvmf_request_complete()
to include the redirection and extract the core operation of
spdk_nvmf_request_complete() into a private function
_nvmf_request_complete(). In ctrlr.c, call _nvmf_request_complete()
for non-redirected cases.
Besides, locate the definition of _nvmf_request_complete() under the
definition of spdk_nvmf_request_complete() in a file to trick the format
check tool to avoid false positive, and fix a spell error together.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I85546c80e99e01686c9470653e0ddafcf7c6a391
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2115
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Having that transport can decide about particular ctrlr attributes not
globally but per ctrlr.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ia3fb0d4e576cb9f8ce6df75f775e2fd5727d7f48
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2757
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Useful for transport specific layer to inform that SGLs are not
supported or to adjust settings.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ia849f5af206538408664fd20ea4e7dcb9da6f6f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2756
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This parameter describes the number of admin and IO
qpairs while admin qpair always exists and should not
be configured explicitly.
Introduce a new parameter `max_io_qpairs_per_ctrlr`
which configures the number of IO qpairs.
Internal structure of NVMF transport is not changed,
both RPC parameters configure the same nvmf transport parameter.
Deprecate max_qpairs_per_ctrlr in spdkcli as well
Side change: update dif_insert_or_strip description -
it can be used by TCP and RDMA transports
Config files parsing is not changed since it is deprecated
Fixes#1378
Change-Id: I8403ee6fcf090bb5e86a32e4868fea5924daed23
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2279
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
I missed a few files in this library the first time.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I2ad55355e6348eaa10384a148dd45deb9f68fc2b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2442
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Remove inclusion of spdk/event.h and spdk_internal/event.h from
SPDK nvmf library. Their dependency had been removed.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibc6c52ab41555d9b29afc3e16c1c3fd0bf5fc63a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2687
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add the AER events in nvmf target from one to four.
Change-Id: Ie31988b49d68bdbc28ab2e09c783e681d3017e2b
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2548
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Using an inline function to decrease the aer notification code.
It also can benefit from adding new async events later.
Change-Id: Id0f7c57fd66ade303c625d1d1af10d41dead8994
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2547
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Fix potential bug. In nvmf_ctrlr_abort_on_pg(), when run into
the request_complete it can trigger this bug because in nvmf_qpair_abort
the ctrlr->aer_req was set to NULL.
Change-Id: I37a7e3c76c6616ca4ecbc6d9d2776ee32449c796
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2545
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When we don't use SRQ and close a qpair, we will receive completions
with error status for all posted ibv_recv_wrs. This is expected and
we don't want to log an error, use debug severity in this case.
Fixes#1387
Change-Id: Ia8efb8f24628880454fab1d7f6188e7b31cffb67
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2451
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Also, while we are here, consolidate setting SO_SUFFIX to one spot.
Previously, it was possible for a library to slip through
without an SO version.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I4db5fa5839502d266c6259892e5719b05134518c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2361
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The new RDMA provider can be enabled by passing
--with-rdma=mlx5_dv parameter to configure script
This provider uses "externally created qpair"
functionality of rdma cm - it must move a qpair
to RTS state manually
Change-Id: I72484f6edd1f4dad15430e2c8d36b65d1975e8a2
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1658
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This is a wrapper over RDMA CM rdma_disconnect function
The wrapper is needed since in Mellanox Direct Verbs
(aka DV) we must move qpair to error state manually
before calling rdma_disconnect
Change-Id: Ia8623c6989e7679591f2da56bafa7f4262eeebf9
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1975
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
This patch updates NVMF target to use RDMA provider API
to create and destroy qpairs.
Makefiles have been updated with new dependency on RDMA lib
Change-Id: Iae35aea601380f8d1a6453a7fd6115f781e126f5
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1656
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There were 50 functions and 5 variables that were previously
exported in the library which are actually private symbols.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I08c47255cdefff5948c914b0782e872c28c27130
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2291
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When using spdk_nvmf_request_exec_fabrics() to connect one IO
queue, the qpair->ctrlr is null before this function, and the
qpair->ctrlr will be set to associate controller after the
connect, so it will check the sgroup->io_outstanding count,
then we can hit the assertion.
Change-Id: I747cbfd0541cd12286dab549cd02245aac54f2db
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2062
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Use 20.04 reference build instead of 20.01.
Also updatethe NVMe-oF Makefile to reflect a change to the
ABI since 20.04 was released. This has to be done in the same
patch to keep the build from failing.
Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Change-Id: I3201f698ecb441021964debda760866dbbc01a64
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2171
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
In SPDK NVMe/TCP target, when initializing the socket, the low
watermark is set to sizeof(struct spdk_nvme_tcp_common_pdu_hdr),
which is 24 bytes. In our testing, some times there might be very
small data packet (as small as 16 bytes) be sent to wire. After
this, if there is no more data sent to the same socket, this small
data packet won’t be received by NVMe/TCP controller qpair thread
because the size hasn’t reached the low watermark. Because of this,
the qpair thread is waiting for more data come in and the initiator
is waiting for the IO request to be completed. Hence the delay
happens.
As the minimum data that allows target to determine the PDU type is
sizeof(struct spdk_nvme_tcp_common_pdu_hdr), which is 8 bytes, we
changed low watermark setting as below. With the change, the problem
was gone immediately.
Change-Id: I14ccc4c84b77e33a617726e7455304aca29d5d57
Signed-off-by: Wenhua Liu <liuw@vmware.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2138
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Useful for transport specific layer to inform that Keep Alive is not
supported or to adjust granularity.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I636fda3eadcb96cd8a4b79570fc4e3cc6a58fe93
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1545
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Virtual controller capabilities can be overridden on transport
specific layer. The current behavior shall be preserved.
This can be useful to limit or extend the default based on transport
type.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I754f0d957a46f219adc1e55f792e79c7546ddb43
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1274
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Purpose: To set the priority of the NVMe-oF connection especially
for TCP connection.
For example, the previous example can be:
trtype:TCP adrfam:IPv4 traddr:10.67.110.181 trsvcid:4420
With the change, it could be:
trtype:TCP adrfam:IPv4 traddr:10.67.110.181 trsvcid:4420 priority:2
The priority is optional. We try to change
spdk_nvme_transport_id but not in spdk_nvme_ctrlr_opts since
the opts in spdk_nvme_ctrlr_opts will reflect in every nvme ctrlr,
this is short of flexibility.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Change-Id: I1ba364c714a95f2dbeab2b3fcc832b0222b48a15
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1875
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Since the related feature is already contained in
spdk_sock_listen and spdk_sock_connect functions,
we no longer need this function.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I1eafff0d139fa266a355fbee2bf0fc3947db69fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1876
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Hit assert when completion is done is response to internal connect
cmd. Fix by outstanding counter incr, add missing err handling.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I2ba71f83432831fc043f2b6f6a5cba024045068b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1902
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This also lets us get rid of the set_ibv_state function
since rdma_disconnect does the transition to the error
state for us.
rdma_disconnect triggers an RDMA_CM_EVENT_DISCONNECT on the
initiator side. This is important so we know when qpair ids
can be reused on the initiator side.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I87b80797c84ffa1d2cbe94cf37316b1b674b6c74
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1878
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
As soon as we disconnect the qpair, the initiator can submit an
additional request to connect a qpair with the same qid as the
one we connected.
This series is aimed at making sure we don't acknowledge a disconnect
until we have cleared that bit.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I76d9312448a9740911465c146a195996cc567370
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1880
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Historically, we have immediately called the qpair
fini function directly from _spdk_nvmf_qpair_destroy.
However, to fix a race with initiators where we can get
a duplicate QID from the initiator, we will have to clear
the QID bit before calling qpair_fini. Clearing the QID bit
must be done on the controller thread, so we have to pass at
least two messages between calling _spdk_nvmf_qpair_destroy
and qpair_fini. This leaves a gap where we can poll the poll group
and reap completions for the qpair before we call qpair_fini and drain
all of the requests related with the qpair.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I337a0608f88fdda132b8a629a508c0263d2261c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1881
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We will be create fine name for each poller but it will need large
effort. Replacing spdk_poller_register by the macro SPDK_POLLER_REGISTER
will provide better name than function address with minimum effort.
Following patches may improve function name for clarification.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If862a274c5879065c3f7cb04dcb5ca7844523e68
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1781
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Community-CI: Broadcom CI
The code used to do this but it was removed when the buffering was
shifted down to the posix layer. Add a way for users of sockets
to still properly size the buffers.
This also means that by default, the receive buffering is not enabled
on sockets. That matches the behavior of the previous release.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I20ce875be2efd841fe3a900047b4655a317d7799
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1560
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It's actually faster to process them until you run out of data.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I9e81babdb9bdc405a8dbf03b2f701fe50bcc70f6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1559
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add FC listen address to target listeners table before attaching the
same to the subsystem listener list. Without this fix, subsystem allow
any listener provision which is very important to FC-NVMe is broken.
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Change-Id: I06436c0a73e65cb5f7bb3280658fcf200b975989
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1443
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Currently SPDK rejects Connect command when subsystem is not active.
This change allows to queue Connect command and execute it when
the subsystem goes back to active state. To queue the command we
should know subsystem_poll_group, in current implementation
this poll_group is known only when controller is already created.
To get the poll_group for Connect command we can retrive subsystem
subnqn, find subsystem and get poll_group by subsystem->id.
Increment subsystem_poll_group->io_outstanding even for Connect
cmd in order to prevent subsystem change state during the
connection process. Update spdk_nvmf_request_complete -
decrement io_outstanding for Connect cmd.
Fixes#1256
Change-Id: I724abb911696d7234a9c9d27458eba24739b26fd
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1273
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We callocated the listener, but if we to execute the listening
behaviors which got failed, and all the other resources have been released,
only left listener.
Fixes issue #1326.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I020b509caed79e9880d07b01888ed389630ff67f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1595
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_jsonrpc_begin_result() was changed to always return non-NULL
a while ago. Most of the checks were removed from SPDK, but this
one must have been omitted because to its unusual structure.
Change-Id: If305d7a27d8e6bd4dae1aa8221efabba50db1982
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1330
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
the nvme, nvmf, and thread libraries have all had public APIs
removed or changed since the API was changed to 2.0 and
backported to 20.01.1 we should rev these so versions to make
that distinction obvious.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Id48454b8d0451794abad4db452b5c4e337b23c0b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1269
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will allow us to keep track of compatibility issues on a
per-library basis.
Change-Id: Ib0c796adb1efe1570212a503ed660bef6f142b6e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1067
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The previous patch ce6b8a1313
added a wrong assumption that every WC of RDMA_WR_TYPE_DATA
type must point to rdma_req with IBV_WC_RDMA_READ opcode since
RDMA_WRITE operations are non-signaled. However it is wrong
since in the case of error all WRs will have WCs. Revert part
of the problematic patch.
Change-Id: I8d270c5313ebfe1ec44a338820a62f085996eb8f
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1334
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The next patch will create poll group threads dynamically for
NVMe-oF target, and will need to wait for completion of poll group and
I/O channel destroy. This is a preparation for the next patch.
Add callback function and its argument to spdk_nvmf_poll_group_destroy(),
and to struct spdk_nvmf_poll_group, respectively.
The callback has not only cb_arg but also status as its parameters even
if the next patch always sets the status to zero. The reason is to follow
spdk_nvmf_tgt_destroy's callback and to process any case that the status
is nonzero in future.
spdk_nvmf_poll_group_destroy() sets the passed callback to the passed
poll group.
Then spdk_nvmf_tgt_destroy_poll_group() calls the held callback in the
end.
This change will ensure all pollers are being unregistered and
all I/O channels are being released.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifb854066a5259a6029d55b88de358e3346c63f18
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/495
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The current implementation doesn't log WC status. Remove independent
logs by WR type, use common error log. Besides that add a minor cleanup
in checking of WC opcode for RDMA_WR_TYPE_DATA type - it is always
RDMA_READ
Change-Id: Ifdfc295e804ab3246ec54327afc8a5a37aca1d55
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1160
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This reverts commit ea5ad0b286.
This code is moving from the nvmf target to the posix sock
layer in this series.
Change-Id: I333bdf325848e726ab82a9e6916e1bbdcd34009c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/446
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This reverts commit d50736776c.
This code is moving from the nvmf target to the posix sock
layer in this series.
Change-Id: I7cf477333f2a3fa4a0089394d5fa28142b262a7f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/445
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This reverts commit 5e7b8d18f3.
This code is moving from the nvmf target down into the posix
sock layer in this series.
Change-Id: Iea9a7cef5bedd6a34edf7b4c87825897279829c3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It was probably miss-interpretation of description from discovery log
page which refers to min admin max sq size.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I575bf7fd6beb904b3a38a07616b76a34f8365643
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1222
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
subsystems
This is optional and most transports will not implement it.
Change-Id: I51e0f1289b0e61a8bdb9a719e0a2aae51ecb451c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/629
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
In the next patch a new transport call will be added to notify
transports of subsystem->listener associations. The operations the
transports may need to perform there are likely asynchronous, so make
this top level call asynchronous here in preparation.
Change-Id: I7674f56dc3ec0d127ce1026f980d436b4269cb56
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/628
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This was recently made asynchronous to support virtualized transports.
However, we're moving to add a new call to associated a listener with a
subsystem to transports and the operation that needed to be asynchronous
will actually be performed there. For simplicity, make this synchronous
again.
Change-Id: Ie98136a19c58f0f9bba0d140476de3bbb38e12d7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/881
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Fix Segmentation fault on the target side.
Issue:
rdma.c:2752:spdk_nvmf_rdma_listen: *NOTICE*: *** NVMe/RDMA Target Listening on 192.168.35.11 port 4420 ***
rdma.c: 789:nvmf_rdma_resources_create: *ERROR*: Unable to allocate sufficient memory for RDMA queue.
rdma.c:3385:spdk_nvmf_rdma_poll_group_create: *ERROR*: Unable to allocate resources for shared receive queue.
Segmentation fault (core dumped)
GDB:
Program terminated with signal 11, Segmentation fault.
736 if (resources->cmds_mr) {
(gdb) bt
736 if (resources->cmds_mr) {
(gdb) bt
0 nvmf_rdma_resources_destroy (resources=0x0) at rdma.c:736
1 0x0000000000497516 in spdk_nvmf_rdma_poll_group_destroy (group=group@entry=0x2fe1300) at rdma.c:3489
2 0x00000000004978bb in spdk_nvmf_rdma_poll_group_create (transport=0x2fe11d0) at rdma.c:3371
3 0x000000000048df70 in spdk_nvmf_transport_poll_group_create (transport=0x2fe11d0) at transport.c:267
4 0x000000000048a450 in spdk_nvmf_poll_group_add_transport (group=0x2f49af0, transport=<optimized out>) at nvmf.c:941
5 0x000000000048a6cb in spdk_nvmf_tgt_create_poll_group (io_device=0x2fce600, ctx_buf=0x2f49af0) at nvmf.c:122
6 0x00000000004a0492 in spdk_get_io_channel (io_device=0x2fce600) at thread.c:1324
7 0x000000000048a0e9 in spdk_nvmf_poll_group_create (tgt=<optimized out>) at nvmf.c:723
8 0x000000000047f230 in nvmf_tgt_create_poll_group (ctx=<optimized out>) at nvmf_tgt.c:356
9 0x000000000049f92b in spdk_on_thread (ctx=0x2f81b20) at thread.c:1065
10 0x000000000049f17d in _spdk_msg_queue_run_batch (max_msgs=<optimized out>, thread=0x1e67e90) at thread.c:554
11 spdk_thread_poll (thread=thread@entry=0x1e67e90, max_msgs=max_msgs@entry=0, now=now@entry=947267017376702) at thread.c:623
12 0x000000000049af86 in _spdk_reactor_run (arg=0x1e678c0) at reactor.c:342
13 0x000000000049b3a9 in spdk_reactors_start () at reactor.c:448
14 0x0000000000499a00 in spdk_app_start (opts=opts@entry=0x7ffc2a5e0ce0, start_fn=start_fn@entry=0x40aa80 <nvmf_tgt_started>,
arg1=arg1@entry=0x0) at app.c:690
15 0x0000000000408237 in main (argc=5, argv=0x7ffc2a5e0e98) at nvmf_main.c:75
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: Id9bf081964d0cf3575757e80fc7582b80776d554
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1073
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
To avoid the strange formatting, typedef has been used. But this
comment is hard to get the meaning. So stop breaking after return
type for this case.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia03d6ec50610c395007fe172018b890733dce599
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1052
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
There is a warning triggered when holding ref to const obj and passing
to these getters.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I2c7b4ea0d325d84d66923fc524273ea44a3a311b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/997
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
According to the spec, if the CC.EN bit is transitioned from 1 to 0,
then all of the I/O qpairs shall be disconnected and the csts.rdy bit
shall be set to zero.
Change-Id: I871170e79e08a9fab8286f9c135c7b3316f58ace
Signed-off-by: Seth Howell <seth.howell@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/423
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add a way for the transport to query the value of the controller
registers.
Change-Id: Id365ff088989f6f8e74e26ff6f3d435f35bee2f4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/422
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The first 4 or last 4 bytes of an 8 byte property can now
be written independently.
Change-Id: I894f8349be836511c18c380262eae46951060766
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/421
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It is memory optimisation as transport id is 'heavy'. As a side effect
simpler handling of listen and stop_listen on transport specific layer.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I4e9d0e0c5eee2d570ec4ac9079270c32d5afb8db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/626
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
qpair might be deleted with incomplete requests (e.g. when NIC
is removed or when huge amount of qpair are being destroyed
simultaneously), this reduces the capacity of the transport
buffers pool. Check that qpair qd is nonzero and process
requests whose state is not FREE. Processing of requests
when qpair is being deleted leads to their release.
Change-Id: I0e42b5cb78f35add9f37942db77781db72c1e59c
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/676
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Destroy subystem if spdk_nvmf_subsystem_set_sn or spdk_nvmf_subsystem_set_mn
failed. Check status in spdk_rpc_nvmf_subsystem_started callback, destroy
subsystem and report an error on error.
Fixes#1192
Change-Id: Id6bdfe4705b5f4677118f94e04652c2457a3fdcc
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/832
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is meaningless for network devices, but will be useful
when emulating the more complete register state of local devices.
Change-Id: I37052e514101c298a1f66cc72135a8c3dd669003
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/420
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This doesn't do anything for a network fabric, but it doesn't
hurt to allow these commands to set the emulated register
values for AQA. This will be more useful when emulating a
physical NVMe device.
Change-Id: I2891d7a07a5dceff50c6d66a8ce0b6b7c22a79f8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/419
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The custom command handlers are registered by outside software.
Move the implementation from lib/nvmf to the nvmf_tgt application
to match the intended usage.
Change-Id: Iedb7ae5356f195dfb5bb465975808c8749d16f32
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/416
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is a public header that needs to be accessible to
code outside of the SPDK project. The spdk_internal/
directory does not end up getting packaged - it's just for
headers used by multiple libraries within SPDK.
Change-Id: I14e1ab4fda4b0ee779203d190a266240b10be6ae
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/413
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This defines the official interface that NVMe-oF target
transports may use. For now, all code is just copied
from elsewhere. Eventually we'll want to add doxygen
comments.
Change-Id: I0cd9368607544be18c7c49188d071e38ceb59b8f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/412
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
ibv_query_qp can return nonzero value if e.g. we received
IBV_EVENT_DEVICE_FATAL. Remove assertion not to break SPDK
in debug mode
Change-Id: I00b3bef448a69e2f43ee90e5466b2d78b55d8a08
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/659
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This event can occur for either qpair or listening device. The
current implementation assumes that every event refers to a qpair
which is wrong. Fix: check if the event refers to a device and
disconnect all qpairs associated with the device and stop all
listeners.
Update spdk_nvmf_process_cm_event - break iteration if
rdma_get_cm_event returns a nonzero value to reduce the
indentation depth
Fixes#1184
Change-Id: I8c4244d030109ab33223057513674af69dcf2be2
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It fixes memory leak e.g. when add_listener rpc called twice with the
same trid on the same subsystem (ref = 2). In such case kill or
remove_listener decrements ref only once.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ib19f2e50838feff1c9108957ee82a42da66e54a2
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482446
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Some functions performed incorrect header/data digest
support check, align it with NVMEoF spec. Use a table
to check if PDU supports digest depending on its type.
Change-Id: I6170dd19ace017f37fda0a923f604732799460b9
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483375
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It allows to property set (e.g. cc) when subsystem and qpair are not
active.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I0b0d150fbdac5bdf0d20762337f0a811f4d6d243
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481494
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This internal interface allows to create nvmf ctrlr and connect io
qpairs on add listener rpc request (i.e. when subsystem is stopped
and listener is not yet on subsystem's list).
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I998cb72ed773094faacc6668cf069ba9e2a6bf50
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481409
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Had to remove one part of a unit test because the null
checking is moved to a different function.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I0a95d0a9a9a5708416fdc7efefb36e17b1ffe010
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/480008
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
The fuse command value is a two byte value, but we were only checking to
see if the fuse value was equal to SPDK_NVME_CMD_FUSE_FIRST or
SPDK_NVME_CMD_FUSE_SECOND in spdk_nvmf_ctrlr_process_io_fused_cmd. If a
haywire initiator sent a command with a fused value equal to
SPDK_NVME_CMD_FUSE_MASK, that would result in us skipping all checks and
dereferencing a null pointer in
spdk_nvmf_bdev_ctrlr_compare_and_write_cmd.
To fix this, add an extra condition to validate the cuse field.
Change-Id: I1ec4169ff5637562effd694f7046c6e3389627f1
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483123
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
When a command arrives and no requests are available, the socket
recv state machine sits in the RECV_STATE_AWAIT_REQ state until another
network event occurs. If this I/O was the last one sent, this leaves the
target hung. To fix this, when a request is completed, kick the state
machine to make forward progress.
In practice, this can only occur once the pdu send acknowledgements are
asynchronous relative to arriving commands. That only begins happening
with the use of MSG_ZEROCOPY. When MSG_ZEROCOPY is turned on, it's
possible receive the next PDU in a chain for a command prior to seeing
the acknowledgement that the response that triggered that PDU actually
sent.
Change-Id: I556f31ad56970d36aa3538cfde375d35f3d4e551
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/480002
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously, the R2T was sent and if an H2C arrived prior
to seeing the R2T ack, it was processed anyway. Serialize
this process.
In practice, if the H2C arrives with a correctly functioning
initiator, that means the R2T already made it to the initiator.
But because the PDU hasn't been released yet, immediately processing the
PDU requires an extra PDU associated with the request. Basically, making
this change halves the worst-case number of PDUs required per
connection.
In the current sock layer implementations, it's not actually possible
for the R2T send ack to occur after that H2C arrives. But with the
upcoming addition of MSG_ZEROCOPY and other sock implementations, it's
best to fix this now.
Change-Id: Ifefaf48fcf2ff1dcc75e1686bbb9229b7ae3c219
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479906
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This function was only called from one spot.
Change-Id: I856f564d3ef6c6157be7a32a2cd812c702516a8d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482003
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This seems like a more descriptive name
Change-Id: Ia616865b3fb36d8f9ccc5fb2ca6185bdd8543cf8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482002
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
With our target design, there's no advantage to sending
multiple R2T PDUs per nvme command. This patch starts by
setting up the math so that at most 1 R2T PDU is required
per request. This can be guaranteed because the maximum
data transfer size (MDTS) is pre-negotiated in NVMe-oF
to a reasonable size at start up.
It then proceeds to simplify all of the logic around mapping
requests to PDUs. It turns out that the mapping is now always
1:1. There are two additional cases where there is no request
object at all but a PDU is still needed - the connection response
and termination request. Put an extra PDU on the queue object
for that purpose.
This is a major simplification.
Change-Id: I8d41f9bf95e70c354ece8fb786793624bec757ea
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479905
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
We can always accept up to the maximum I/O size in an H2C,
so eliminate the #define.
Change-Id: I349dab5f9b6ec482a7c580b1396e03c8d30a250b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482278
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The resources allocated to a queue pair do not need to be directly
correlated to the queue size requested by the initiator in NVMe-oF, as
long as enough resources are present. The RDMA transport, for instance,
does complex pooling of the resources behind the scenes when using a
shared receive queue.
Simplify the resource allocation for a TCP qpair to just always allocate
the max allowed queue size right away. This is a configurable parameter,
so system administrators can adjust for their needs. The initiator may
then request a queue size less than or equal to that, which will only be
enforced by queue depth counting and not impact the actual number of
resources allocated on the target.
This change relies on the MaxC2HSize being equal to the Maximum Data
Transfer Size (MDTS) reported. That is the default configuration, but
MDTS is configurable. Changing the MDTS with this patch to a value
larger than 128k will cause the target to break. This is addressed in
the next patch in this series.
Change-Id: Ibd4723785c6a4d8d444f9b7bbfa89f98de2320f5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479733
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
These values do not need to be negative.
Change-Id: Id9f798cf1c9da354448f9c6fbb90e599f877bb32
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482277
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
By releasing the just-completed PDU prior to calling the callback,
for flows that immediately submit another PDU inside the callback,
the just-released PDU can be immediately reused. This reduces the number
of PDUs required in the pool to continue forward progress to half of the
previous value, while also making it more CPU cache friendly.
Change-Id: I8031b8f9f57ac05f261d96433d9899fe5e31d318
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479904
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
For ACWU we always set value 1 because bdev holds
information specific for namespace only. This value
actually does not matter because we also set NACWU
which makes ACWU irrelevant. We set ACWU because
NVMe specs requires ACWU != 0 if fused commands
are supported.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ida4357026d3b32677fc824b3cd878e7ad8ef2680
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477915
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add call for spdk_nvmf_bdev_ctrlr_compare_and_write_cmd
function in spdk_nvmf_ctrlr_process_io_cmd function
when fused command is discovered.
This patch also removes redundant defines for fused flags.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I61971a56577ab32b52e1fde1e572f718a9a2d9aa
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476621
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Move fused cmd related code from spdk_nvmf_ctrlr_process_io_cmd
to separate function.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ic662a968b054f05db7f6e1cf4fa9aa13f6fb7c40
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481942
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch introduces new spdk_nvmf_bdev_ctrlr_compare_cmd
function which implements support for compare operation.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iadf402a6441a78ea0e6468f1066c6b0e10e63b9b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477782
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch introduces new function that is a part of
upcoming support for fused commands.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I019c587bee7fd0f745ec17c141baf4cb7bf86645
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476611
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This change fixes a merge incompatibility between commits
50cb6a04ac and
708ed4fb6e.
Change-Id: I5bc71a3c214667f01de66857cf61b9eb25f6cf6b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482586
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is a spdk_nvmf_tgt_listen() which opens a port for specified
transport (trid) which opens possibility to accept new connections
from initiators. However there is no counterpart of this function
(i.e. spdk_nvmf_tgt_stop_listen()), which would stop listening.
Instead the current code relies on spdk_nvmf_subsystem_destroy()
to stop the listener, which seems to be wrong.
Fixes#1129
Change-Id: I6e73d8c234dc451f0fee8394132eae34cd4f4756
Signed-off-by: Jan Kryl <jan.kryl@mayadata.io>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479873
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This commit sets the optional admin command flags
in the identify structure based on whether handlers
for these optional admin commands are specified.
Change-Id: If4aa36a414b0811dafaadbc1094e6c2628d21b39
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479446
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
This commit provides the capability to install a
custom admin command handler for NVMF.
It can be used to implement or replace NVMe admin commands that
are currently not handled by the NVMF subsystem.
The handler implementation is pretty generic and the handler function
has to figure out what to do with the command based on the bdevs
that are configured for the subsystem.
In cases where admin commands need to be forwarded to an NVMe bdev,
the commit provides functions that allow access to the underlying bdev.
There is an example handler in lib/nvmf/custom_cmd_hdlr.c.
Change-Id: I4f9d538c53669c176a836e8bdd379db0070a87dc
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479167
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <jacek.kalwas@intel.com>
This was calling a callback for another function which
attempted to release the request. The code only worked because
in the r2t case the cb_arg was set to NULL, and that makes
the request free function do nothing.
Change-Id: Id9ec30ceb0eaa41deb67aa995da5d6f786d9b9f0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479903
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This wasn't actually used. Every PDU only had a single reference.
Change-Id: I8adaa7edeca5fe175aa853c156df741170d76c10
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479902
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This would allow to respond for add listener rpc request even
when there are async calls in transport specific function.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I94a9f45b7ba9e8d46a60ae3785953cea12554732
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479511
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Make common code as part of successful return.
In rdma check if already listening first.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ib0c87ac11db7daff00dc4042c9e0ab20eb7ffd0f
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478721
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Purpose: With this patch,
(1)We can support using different sock implementations in
one application together.
(2)For one IP address managed by kernel, we can use different method
to listen/connect, e.g., posix, or uring. With this patch, we can
designate the specified sock implementation if impl_name is not NULL
and valid. Otherwise, spdk_sock_listen/connect will try to use the sock
implementations in the list by order if impl_name is NULL.
Without this patch, the app will always use the same type of sock implementation
if the order is fixed. For example, if we have posix and uring together,
the first one will always be uring.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ic49563f5025085471d356798e522ff7ab748f586
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478140
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function previously accepted a trtype enum, but needs to be able
to accept a string to support custom transports.
Change-Id: I931aed30ca3be65468552ffa1bb1ef3f91275fda
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479601
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The trtype should be stored as both an enum and string. This is intended to
help pave the way for pluggable NVMe-oF transports.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I6af658d7a17c405e191ff401b80ab704c65497e7
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478744
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
This commit exposes some internal functions and enums
in preparation for the custom admin cmd handler functionality.
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Change-Id: Iec15c1f3d9cba5db267f6e43f3d929cf382ca8f4
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476800
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It is faster for the kernel to pin memory in hugepages, so allocate
the pdu pool from hugepages. This will help more
with upcoming changes to leverage MSG_ZEROCOPY.
Change-Id: I9ce581acca9c6edb71bd8119258966e3b405db77
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475801
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>
All transitions to the EXITING state go through the disconnect function
now
Change-Id: Ia55816351b2998bfef26130b6ffdc4a1010567a1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470533
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function can't actually return NULL. It aborts if we get
our math wrong.
Change-Id: Iaf77112addc3c14c70755a56043c5dba3427890d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478911
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We can use spdk_min to get the copy_len in spdk_nvmf_tcp_send_c2h_term_req.
It confirms copy_len it's not larger than SPDK_NVME_TCP_TERM_REQ_ERROR_DATA_MAX_SIZE
Signed-off-by: dongx.yi <dongx.yi@intel.com>
Change-Id: Id343928e1911e4ab77fca7463f3f0cc55889db30
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479118
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The only reason the qpair ever uses the port is to get to
the device attributes so skip the middle man.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ib14a97ceaa0c49176027d6c35c5cb2787a845cc1
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478961
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Features are not saveable by the controller as indicated
by ONCS field of the Identify Controller data.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I7e5bc2c701f5857e2c1481e8370b070089f88111
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479128
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It is useful to have sct and sc on request completion explicitly.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: If429123589694ba36111330699fee22238f8a557
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478602
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Minor optimisation done by code analysis, both cmd and dif are
overridden in TCP_REQUEST_STATE_NEW.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I6bae4ddae175035d029c0693f7e4351b95a296ab
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478604
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The discovery target support the keep alive timeout so
it should also support the keep alive cmd.
Change-Id: I08bd3312c17962c97c96fdd1469246fe97d5e8e7
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478016
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: yidong0635 <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Recent effort to unify NVMe NQN length macro replaced
'FCNVME_ASSOC_HOSTNQN_LEN' with 'SPDK_NVME_NQN_FIELD_SIZE' in
include/spdk/nvmf_fc_spec.h. This change updates the downstream files
which are also affected by the change.
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Change-Id: I44e1de50067e11fabacbb69cf1c42331f3339bc8
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478769
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
It's not wrong, just to keep consistency with other functions.
So remove these.
Signed-off-by: dongx.yi <dongx.yi@intel.com>
Change-Id: I833211ea8ee6c6b02c874ea340a3f936a0c4c00f
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478684
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The call to update_ibv_state could result in a segfault if the other
thread had already freed the qp and was just spinning on handling the
rdma event. By not updating the qpair state here, I don't think that we
lose any information about the qpair state. Especiallyy since just a
little bit later we update the qpair state to be in error.
There is nothing in the man pages about the cm events changing the ib
state although I imagine they are closely related. I just say that
because I believe that's why the update was originally in that spot.
fixes: GitHub issue #1110
Change-Id: I3f87ff009bc2019464ed7c6920dd71e2b286b3fd
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477877
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It will be the expected behavior when the error message will
printed if we use asynchrounous I/O. And the real
error message for not getting the tcp_req is located in
spdk_nvmf_tcp_capsule_cmd_hdr_handle.
Change-Id: I1a608fbd3a04050eacb6cb68eafd50e5128925ab
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477872
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
TCP_REQUEST_STATE_NEW is already set in spdk_nvmf_tcp_req_get.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ia835f3763cd74ef9b504901c719d9954317f49af
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476164
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This eliminates the flushing logic, simplifying the tcp
transport.
This also happens to greatly improve performance, especially
on random read tests. The batching done in spdk_sock_writev_async seems
to be more effectively than the previous batching logic in the tcp
transport.
Change-Id: Id980ac6073e380dc75f95df3f69cb224f50fb01b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470532
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
RDMA qpair might be destroyed by defunct timer, so it can have
active recv elements in incoming_queue. This queue is cleaned
incorrectly, so recv element for the destroyed qpair still may
be presented in the queue and be processed later. That leads
to undefined behaviour.
Fixes#1086
Change-Id: Ieae186b2d2dce4ec88ab886b26165f6ef98e8d05
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477957
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
This function is already declared in the public header which this
function includes.
Change-Id: Iff3f85dc166b7bd4d949b9b099a6bf05dec7dec8
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477868
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Request can be freed by transport_req_complete. In such case req
or req->cmd dereference might result in heap-use-after-free.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I2280d3978f1f183a250828aab7d2ca49ef1800ec
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476929
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The following scenario might occur when nvmf_tgt is stopped:
1. nvmf_tgt receives SIGINT, changes state to NVMF_TGT_FINI_STOP_SUBSYSTEMS
2. In this state nvmf_tgt stops all subsystems and disconnects associated qpairs
3. In the case of RDMA qpair, its state will be changed to IBV_QPS_ERR.
Once qpair changes the state to IBV_QPS_ERR, RDMA device generates
LAST_WQE_REACHED event when there are no more WQE that can be sonsumed
from the SRQ by this qpair.
4. When all subsystems are stopped, some of qpair may still be alive since they
haven't received LAST_WQE_REACHED event yet.
5. nvmf_tgt stops all poll groups and forcefully destroyes any qpairs linked to them.
6. At this moment LAST_WQE_REACHED event might be generated and received in another thread.
Handler of this event sends a message with a pointer to qpair. The qpair itself may already
be destroyed.
7. Thread that owned qpair receives a message (LAST_WQE_REACHED) with a pointer to alredy destroyed qpair and
destroyes it for the second time when all pointer are invalid.
ibv events related to qpair should be handled by the thread that
owns this qpair. This commit adds a new structure that describes
ibv event, helper functions for sending the event and a list
of events per rdma qpair; add syncronization for LAST_WQE_REACHED event
Fixes#1075
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I22bff89741708df2518760934ecb4e33fad49473
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476712
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
NVMF_REQ_MAX_BUFFERS is defined in nvmf_internal.h and not used in rdma
source code.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I50f24c8fc4ea773378418f7803361d8592f961ae
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475529
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
It is valuable to have more detail status instead
SPDK_NVME_SC_INTERNAL_DEVICE_ERROR.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ifd003b490a7ae9af017645c97636ceaf2f93d4b0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476634
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There are some new features defined in NVMe 1.4 specification also need
the data buffer, but for now we only listed 3 features, we will check
the data length field anyway, so this piece of code can be removed.
Change-Id: I22204cf53077073434bb0ed73693bcf72883f084
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475953
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reservation Acquire Action(rqcqa) is used for reservation acquire command, since
we have replaced cdw10 with specific commands fields, so split the two cases to
make the code looks more clear.
Change-Id: Ib2a2458dd0fe8345f1c33341dab1c4fe3b82829f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476838
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Destroy in poll_group_add results in heap-use-after-free because
upper layer calls qpair_fini in case poll_group_add returns
error.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I3e921a21b7ab5f7c15c80bc5919cb97cbda0b5d2
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475858
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This issue was found by code inspection.
It only occurs when pdk_nvmf_transport_poll_group_create()
itself calls spdk_nvmf_transport_poll_group_destroy().
Other transports at this time do not show this issue.
spdk_nvmf_rdma_poll_group_destroy() depends on
rgroup being assigned a transport, which is only being
done on generic nvmf layer using
spdk_nvmf_transport_poll_group_create()
after successful spdk_nvmf_rdma_poll_group_create().
When failure occurs during create, such assignment was not
performed so any references to rtransport will segfault.
Reported-by: Jacek Kalwas <jacek.kalwas@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id54482d562bd6d7c71371306cf1de93bc05f4e8a
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475002
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Also need to update the spdk_nvmf_tcp_poll_group_poll.
Since if the tqpair recv state in wait_for_req,
we may already received the data, and there could be
not epoll event.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9c5a202e47e57aaba63da143f954a20c135a98ae
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473626
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The NVMe-oF target requires 128-bit identifiers for reservations,
so the extended report format must be used. If the user issues
a reservation report command without the extended format bit set,
the specification says to fail it with the HOSTID_INCONSISTENT_FORMAT
error code.
Change-Id: I2382af4f69167322d8e2c3f06cf8d9042830a70c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/474131
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Purpose: But if we use asynchronous writev
for pdu sending, the call_back of writev may occur
after the new data coming. So it means that the
free tcp request may not be available.
So we use the strategy to check the request status
in TCP_REQUEST_STATE_TRANSFERRING_CONTROLLER_TO_HOST.
So the strategy is checking the state_cntr of all the
reqs in TCP_REQUEST_STATE_TRANSFERRING_CONTROLLER_TO_HOST
state.
1 If the state_cntr > 0, we should queue
the new request.
2 If the statec_cntr == 0, it means that
there is no available slot for the new tcp request
, i.e., the new nvme command comming from the initiator.
If we receive this, it means that the initiator sends more
requests,and we should reject it.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ifbeb510e669082cb7b80faf2e7987075af31d176
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472912
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
To avoid the allocation of ttransport in the sub functions,
and it makes the code much efficient.
Change-Id: Ie4c5a1755ddbecf10dc364ff811f74a7af5f9c3b
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473003
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When we use async writev (e.g., lib io_uring), we find that
the callback of writev is executed after recving the new
data from the initiator, and this is possible.
For example, if the NVMe-oF TCP target receives the ic_req from the
initiator, and sendout the ic_resp, the state of tqpair will change from
invalid to running until the callback is executed. And the data of ic_resp
is already sent to the initiator, and we receive the new command later. However,
we may still not get the call back function executed
(i.e, spdk_nvmf_tcp_send_icresp_complete). And it is possible
for using lib io_uring, I faced this issue when using lib uring.
And this patch can fix this issue.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I7f4332522866d475e106ac6d36a8ec715133f0dc
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472770
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
The loop is intended to accept multiple socks when
available, but once accept returns NULL, there's no
reason to keep trying.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I896908d276da35bc3fff172c1c17e22abd2a5343
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473234
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It's important to be able to recover full context from just
the PDU in the future.
Change-Id: I3d1f3c326299b1237b42dbe33d340a282c3bc5bb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470531
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
This is always the request pointer, so rename it for clarity.
Change-Id: Ifbda7db7787c65f0deb190a1e94f0676b2c0d99a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470530
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Use whatever size the socket layer thinks is best. In practice,
this is the same size as before.
Change-Id: I4820e16d8da6e566d1f8f078a75d345399f64ab5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470511
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
This change allows setting of the NVMe completion queue
CDW0 in spdk_bdev_io_complete_nvme_status.
Before that change, handling of vendor specific NVMe IO
commands was limited since there wasn't a way to return
command specific info back to the initiator.
Change-Id: I250d5df3bd1e62ddb89a011503d42bd4c8390f9b
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470678
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since we use big buffer to read the data, so the incoming
data may already be read when the req is waiting for the buffer.
So if we use the orginalstatement machine, there will be no
read event will be generated again.
The quick solution is to restore the original code, since
for req which has incapsule data, we not need to wait for the
buffer from the shared buffer pool.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ib195d57cc2969235203c34664115c3322d1c9eae
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472047
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch adds the following:
1. Change signature of nvmf_rdma_fill_wr_sge - pass ibv_send_wr ** in
order to update caller's variable, add a pointer to the number of extra WRs
2. Add a check for the number of requested WRs to nvmf_request_alloc_wrs
3. Add a function to update remote address offset
Change-Id: I26f6567211b3ebfdb4981a7499f6df25e32cbb3a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470475
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
RDMA WR has a predefined number of SGL entries (16) and with new logic added to
support DIF the number of entries required to process the request may exceed
this limit since IO unit buffers might be divided into several parts to remove
metadata chunks from the transmition. This change calculates and allocates
required number of WRs.
Error handling section in spdk_nvmf_rdma_request_fill_iovs has been updated
to free allocated WRs
Change-Id: Ie5d659d8305a454949827d1f4aff6d871b7e825d
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470474
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This change helps to avoid passing the number of additional WRs to
spdk_nvmf_rdma_request_fill_iovs in the following commits and minimizes
changes in spdk_nvmf_rdma_request_parse_sgl
Change-Id: Id530d0996af661051d94930e3f776bad2dcc5771
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470473
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It can be useful for passing additional information about nvmf
target to a handler for new nvmf connections. Context can be
stored in globals as it is currently done in nvmf code. However
in case of multiple targets or languages where accessing global
state is challenging (i.e. Rust), this becomes inconvenient.
Change-Id: Ia6a2fdba4601531822b3e5fda7ac5ab89d46f6c5
Signed-off-by: Jan Kryl <jan.kryl@mayadata.io>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469263
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Enables access control to allow subsystem access on one or more FC ports.
If a subsystem definition does not have a single valid "Listen" directive,
then the subsystem will allow dynamic listen address binding. For such
subsystems,
* When a FC port comes online, FC transport will add port's address
to subsystem's listen list.
* When a FC port goes offline, FC transport will remove port's address
from subsystem's listen list.
If subsystem definition has 1 or more valid "Listen" directives then FC ports
coming online or going offline will not affect subsystem's listen address
whitelist.
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Change-Id: Ic7fed76e4bf8d1df0aeeae1d4fa5e6d207afccf3
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471025
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use this function in nvmf_request_alloc_wrs and nvmf_rdma_setup_request
to eliminate code duplication
Change-Id: I771e0e899043049c90298e050d4353145e36f0d1
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471034
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Delete a pointer to spdk_nvme_cmd as it is not used directly
Change-Id: I36a6a6d95c0707f446a0797a55a9e60c62f9503c
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470472
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Return -1 from spdk_nvmf_rdma_request_parse_sgl in the case of -EINVAL since 0 value
returned by spdk_nvmf_rdma_request_parse_sgl is treated as no memory case
Change-Id: I536592260a3bb658a1a4bc3a79c5b37fbacd3edc
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470471
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
hwqp->fgroup is valid only for IO queues and this particular function
deals with pending requests for IO queues. Check hwqp->fgroup and
bailout if called in LS queue context.
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Change-Id: I40bc9d3c576abd145bd6b296c07dbd64fd3dabd1
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470897
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In-capsule data transfer can only be supported by NVME drives with SGL memory layout
Add test to examine new behaviour
Change-Id: Iaef6564c8e5c96c1c5af16ab41d6e3827f6a82b6
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470469
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Run the following command:
pahole ./app/nvmf_tgt/nvmf_tgt -R -C spdk_nvmf_tcp_req
It tells me change the bool definition location
of dif_insert_or_strip.
Change-Id: Ia43ab62bcc223a07e6415b2c769fe4af2b097f18
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470401
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The type of iovcnt of struct spdk_nvmf_request is uint32_t, and so
change the type of iovpos of struct spdk_nvmf_rdma_request from int
to uint32_t.
iovpos of struct spdk_nvmf_rdma_request is only incremented and
accessed. It is not used for comparison.
So to avoid rerunning CI, this fix is appended to the patch series.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I65fc5dfb7067f6e8f7cb1e555f010b246a72ec32
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469660
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
By passing the pointer to struct spdk_nvmf_transport_poll_group
to spdk_nvmf_tcp_req_parse_sgl(), we can remove spdk_nvmf_tcp_req_fill_iovs()
and inline spdk_nvmf_request_get_buffers() into spdk_nvmf_tcp_req_parse_sgl().
Pointers to struct spdk_nvmf_request are used in many lines of
spdk_nvmf_tcp_req_parse_sgl(). Caching and using them simplifies and
improves readability a little for spdk_nvmf_tcp_req_parse_sgl().
We can pass pointer to not struct spdk_nvmf_tcp_transport but struct
spdk_nvmf_transport to spdk_nvmf_tcp_req_parse_sgl().
Ordering the pointer to struct spdk_nvmf_tcp_req first in parameters
of spdk_nvmf_tcp_req_parse_sgl() matches the function name.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9f0d33b48383800c3b0a738eb24b11ffed7e6e60
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469640
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
nvmf_rdma_fill_wr_sge() gets pointer to iovec at its head, but
nvmf_rdma_fill_wr_sgl() can pass it to nvmf_rdma_fill_wr_sge()
simply.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I16176d5d36ca9daf57640bfcbc49dfbf997afe54
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469639
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Pointers to struct spdk_nvmf_request and struct ibv_send_wr are
used in many lines of spdk_nvmf_rdma_request_parse_sgl().
Caching and using them simplifies and improves readability a little
for spdk_nvmf_rdma_request_parse_sgl().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib000c9d4e7fb7bb415f4ac4622b32b12cc787c80
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We can merge two loops of req->buffers and req->iov into a single
loop and merge two variables, req->num_buffers and req->iovcnt into
a single variable. For the latter, use req->iovcnt because it is
also used for in-capsule data.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia164f2054b98bbcb00308791774e3ffa4fc70baf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469489
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_nvmf_request_get_buffers()/_multi() may return not only -ENOMEM
but also -EINVAL, but spdk_nvmf_rdma_request_fill_iovs() and
nvmf_rdma_request_fill_iovs_multi_sgl() had returned -ENOMEM
regardless of the actual return value. Fix them in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic19593ffa9c0731f63d198d4ae16feb3bb47f57c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469378
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is the end of the effort to unify buffer allocation
among NVMe-oF transports.
This patch aggregates multiple calls of spdk_nvmf_request_get_buffers()
into a single spdk_nvmf_request_get_buffers_multi().
As a side effect, we can move zeroing req->iovcnt into
spdk_nvmf_request_get_buffers() and spdk_nvmf_request_get_buffers_multi()
and do it in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I728bd330a1f533019957d58e06831a79fc17e382
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469206
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is close to the end of the effort to unify buffer allocation
among NVMe-oF transports.
Merge each transport's fill_buffers() into common
spdk_nvmf_request_get_buffers() of the generic NVMe-oF transport.
One noticeable change is to set req->data_from_pool to true not in
each specific transport but in the generic transport.
The next patch will add spdk_nvmf_request_get_multi_buffers() for
multi SGL case of RDMA transport.
This relatively long patch series is a preparation to support
zcopy APIs in NVMe-oF target.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Icb04e3a1fa4f5a360b1b26d2ab7c67606ca7c9a0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469205
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch merges nvmf_rdma_fill_wr_sgl_with_md_interleave()
into nvmf_rdma_fill_wr_sge(), and then removes
nvmf_rdma_fill_wr_sgl_with_md_interleave().
In nvmf_rdma_fill_wr_sgl(), pass DIF context, remaining data block
size, and offset to nvmf_rdma_fill_wr_sge() in the while loop.
For non DIF case, initialize all of them by zero.
In nvmf_rdma_fill_wr_sge(), classify non-DIF case and DIF case
by checking if DIF context is NULL.
As a minor change of wording, remaining is sufficiently descriptive
and simpler than remaining_io_buffer_length and so use remaining.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I55ed749c540ef34b9a328dca7fd3b4694e669bfe
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469350
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch separates filling wr->sg_list from filling req->iov
in nvmf_rdma_fill_buffers_with_md_interleave() and create an new helper function
nvmf_rdma_fill_wr_sgl_with_md_interleave() to fill wr->sg_list by adding iovcnt to
struct spdk_nvmf_rdma_request.
The subsequent patches will merge nvmf_rdma_fill_buffers() into
spdk_nvmf_request_get_buffers().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I03206895e37cf385fb8bd7498f2f4a24797c7ce1
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469204
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch separates filling wr->sg_list from filling req->iov
in nvmf_rdma_fill_buffers() and create an new helper function
nvmf_rdma_fill_wr_sgl() to fill wr->sg_list by adding iovcnt to
struct spdk_nvmf_rdma_request.
The next patch will do the same change for
nvmf_rdma_fill_buffers_with_md_interleave().
The subsequent patches will merge nvmf_rdma_fill_buffers() into
spdk_nvmf_request_get_buffers().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia4cdf134df39997deb06522cbcb6af6666712ccc
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469203
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When buffer replacement succeeds, only iov_base has to be updated.
This change is small but will be helpful to disaggregate buffer
allocation and filling WR SGL.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifc72fd783b515dfaecac04939c183097f939e29b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469202
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Factor out setup WR operation from nvmf_rdma_fillbuffers_with_md_interleave()
into a function nvmf_rdma_fill_wr_with_md_interleave().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I92689daa7dcc93aaa68ecf5706d4e1b75d7fabae
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469066
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch
- applies nvmf_rdma_get_lkey(),
- changes pointer to struct iovec from iovec to iov,
- changes pointer to ibv_sge from sg_list to sg_ele, and
- passes DIF context instead of decoded data block size and metadata size
- use cached pointer to nvmf_request to call
- change the ordering of operations to setup sg_ele slightly
for nvmf_rdma_fill_buffers_with_md_interleave().
Name changes are from the previous patch.
They are for consistency with nvmf_rdma_fill_buffers() and a
preparation for the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I942fb9d07db52b9ef9f43fdfa8235a9e864964c0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469201
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This reduces the diff in the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I85dccdc1a1a5a51777934121f50a6af97feda5a5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469480
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
SPDK_ERRLOG lists the function name, so remove old references that
assume it doesn't and reprint the function name.
Change-Id: I69da6ca0a25bf0eda07d8dad52bcfadf964ac715
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469487
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Factor out getting lkey and checking translation length in
nvmf_rdma_fill_wr_sge() into a function nvmf_rdma_get_lkey().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I495ba9ae4a48b4aa7dc35a0bd72708753846dfdc
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469349
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Cache pointers to iovec and ibv_sge at the head of the function
and use them throughout.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I493759bf3989ced4390d077280cd44c122847d08
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469348
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Factor out setup WR operation from nvmf_rdma_fill_buffers() into a
function nvmf_rdma_fill_wr_sge().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I813f156b83b6e1773ea76d0d1ed8684b1e267691
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468945
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
The subsequent patches unifies getting buffers, filling iovecs, and
filling WRs in a single API. This is a preparation.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I077c4ea8957dcb3c7e4f4181f18b04b343e9927d
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468953
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
This is a preparation to unify getting buffers, filling iovecs,
and filling WRs in a single API in RDMA transport and then to unify
it among RDMA, TCP, and FC transport.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia69d4409c8cccaf8d7298706d61cd4e2d35e4406
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468944
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
This patch makes multi SGL case possible to call spdk_nvmf_request_get_buffers()
per WR.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I977ebb0c6b2a67218c9b6fc20dc26a93a6ec770b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468943
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
This patch makes multi SGL case possible to call spdk_nvmf_request_get_buffers()
per WR.
This patch has an unrelated fix to clear req->iovcnt in
reset_nvmf_rdma_request() in UT. We can do the fix in a separate patch
but include it in this patch because it is very small.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If6e5af0505fb199c95ef5d0522b579242a7cef29
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468942
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch matches the ordering of single SGL case and multi SGL
case for parsing SGL.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iea026b48e8957e140b71db7afaf8aca88634dc33
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468941
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
In nvmf_rdma_requst_fill_iovs_multi_sgl(), length of descriptors
are accumulated into req->length. However, req->length was not cleared
when nvmf_rdma_fill_buffers() fails in the middle. This patch fixes it.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I80a55d90d09c8af46d570e017d342afd69f41996
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469199
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
wr->num_sge has to be used in spdk_nvmf_rdma_request_fill_iovs(),
and memset() can be used instead of clearing each variable.
Besides, holding cached pointer to the current WR simplifies the
code a little and so is done together in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iebda158f85e3a0e3046686f76991217fa7297c24
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469198
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
With this change we only check the subsystem state once.
Previously it did it twice, and with a different order
(once PAUSED || INACTIVE, the other INACTIVE || PAUSED).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idef44accc69dccb9d161b8f04b9d5d3bbbf9e037
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469285
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The previous version of this function precluded one target name from
being a leading substring of another. i.e. if "nvmf_tgt_1" was already
used as a name "nvmf_tgt_11" could not be used subsequently.
Just an odd quirk that shouldn't be the case.
Change-Id: Iea59b6757512f01070e48074e35a11d942e399bb
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468522
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Also update the changelog for the previous few changes.
Change-Id: I79ac330b4992ccc3e41fd1643b09128c6de6c86d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468391
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The current connection scheduling mechanism (RoundRobin) doesn't take into account the qpair type and assigns each new qpair to the next poll group. As a side effect there might occur a disbalance when some poll group handles more IO qpairs than others. In RDMA transport it is possible to get the qpair type before the controller creation using a private data from the rdma_cm event, this allows to schedule admin and IO qpairs in the balanced way.
Change-Id: I90c368a41c4cd0f5347a83cab7511e4494f05b29
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468993
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Operations with poll groups list must be protected by rtransport->lock.
Make rtranposrt->lock recursive to avoid unnecessary mutex operations when
the poll group is being destroyed within spdk_nvmf_rdma_poll_group_create
Change-Id: If0856429c10ad3bfcc9942da613796cc86d68d8d
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468992
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
On a DIF verification error, fail the read command with a status code
of APPLICATION_TAG_CHECK_ERROR, GUARD_CHECK_ERROR, or
REFERENCE_TAG_CHECK_ERROR and a status code type of SCT_MEDIA_ERROR.
The state of the request is TCP_REQUEST_STATE_TRANSFERRING_CONTROLLER_TO_HOST
when a DIF verification error is detected. So dequeue the request
from C2H data queue, return the response PDU, and then send the command
response.
This was an item on the TODO list. RDMA transport do this right
behavior from the start and so TCP transport follows it by this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I102bbd253cc8c1379d0937c9536bf2bfe04cbf6a
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468911
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
tcp_req->orig_length had been set just before I/O submission but
the value is already fixed in spdk_nvmf_tcp_req_parse_sgl().
Hence move setting tcp_req->orig_length accordingly.
This follows the good practice of RDMA transport.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I99f6e266d8f7027bce810864314f3ee24a1af10c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468910
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
used for creating a new spdk_nvmf_tgt structure in the application.
Change-Id: Ib0182ea6d935b84b4fe4fcad79e173cb46859669
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468387
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Functions added in this patch:
spdk_nvmf_tgt_get_name - get human readable name from target.
spdk_nvmf_get_first_tgt - start iterating over global list of targets
spdk_nvmf_get_next_tgt - get next target in iteration
These functions will facilitate the following RPC
nvmf_get_targets - get the names of all active NVMe-oF targets.
In this series, I will also add two more RPCs, nvmf_create_target, and
nvmf_destroy_target, as wrappers around the create and destroy
functions. Since all of these changes are pretty minor and closely
related, I will just do one big changelog entry at the end.
Change-Id: Ia9f1248fbf9726fa3889998a169211fb25e724f2
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468386
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This issue can be reproduced on fedora30.
Add assert here is enough to fix this kind of warning.
Error log:
rdma.c:3070:20: warning: Access to field 'data_buf_pool' results in a
dereference of a null pointer (loaded from field 'transport')
spdk_mempool_put(group->transport->data_buf_pool, buf);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
This is to fix issue #965.
Change-Id: Ifb742ab914ee9a0381dca0bb769ba8aa564c816f
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Update transaction length wrt to medata size
Change buffers handling in the case of enabled DIF - add function nvmf_rdma_fill_buffer_with_md_interleave to split SGL into several parts with metadata blocks between them in order to perform RDMA operation with appropriate offsets
Add DIF generation before executing bdev IO operation
Add parsing of DifInsertOrStrip config parameter.
Since there is a limitation on the number of entries in SG list (16), the current approach has a limitation on the max transaction size which depends on the data block size. E.g. if data block size is 512 bytes then the maximum transaction size will be 512 * 16 = 8192 bytes.
In adiition, the size of IO buffer (IOUnitSize conf param) must be aligned to metadata size for better perfromance since metadata is treated as part of this buffer. E.g. if the initiator uses transaction size = 4096, data block size on nvme disk is 512 then IO buffer size should be aligned to (512 + 8) which is 4160. In other case an extra IO buffer will be consumed which will increase the number of entries in SGL and in iov.
Change-Id: I7ad2270fe9dcceb114ece34675eac44e5783a0d5
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465248
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Now code always return 0 , do this like nvme_rdma_mr_map_notify.
That callback can get the right return.
Change-Id: Ief2924e14321b2062f6001e7ae3f50d507206594
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468663
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It is a very rare thing for a buffer to be split over two memory
regions. In fact, it is only possible in dpdk versions where
--match-allocations is not passed as a startup parameter to dpdk but
dynamic memory allocation is enabled.
By adding a small helper function, we avoid failing an I/O because it
was assigned one of these improperly aligned buffers. Also, we try to
remove the buffer from circulation so that it doesn't get picked up
again by another request.
Also, add a unit test to catch this case.
Change-Id: Ia09865c2f77160a960571665b29c4533b11758ae
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467446
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Just cleaning up a few things like variable names and ordering to make
the whole function more readable.
Change-Id: I1503cdb43ddd73e063d6e57e9ff0cf2a06e79728
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467445
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
IB Architecture Specification vol.1 rel.13. in ch.10.3.1 "QUEUE PAIR
AND EE CONTEXT STATES" suggests the following destroy procedure for
QPs associated with SRQ:
- Put the QP in the Error State;
- wait for the Affiliated Asynchronous Last WQE Reached Event;
- either:
* drain the CQ by invoking the Poll CQ verb and either wait for CQ
to be empty or the number of Poll CQ operations has exceeded CQ
capacity size; or
* post another WR that completes on the same CQ and wait for this WR
to return as a WC;
- and then invoke a Destroy QP or Reset QP.
Without the drain step it is possible that LAST_WQE_REACHED event is
received and QP is destroyed before the last receive WR completion is
polled from the CQ.
In SPDK there is no risk of resource leakage in this case. So, instead
of draining we can destroy QP and then just ignore receive completions
without QP and post receive WRs back to SRQ.
Fixes#903
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ice6d3d5afc205c489f768e3b51c6cda8809bee9a
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465747
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The default behavior is to set it to 2MB, so this isn't
required anymore.
Change-Id: I62d7605cd4d5bc41347128f32f9a1aa373a15744
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466993
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Since we use pdu->data_iovcnt to
build the iov in nvme_tcp_build_iovs, so
send out pdu has the maximal iov number
equals to: 2 + pdu->data_iovcnt,
so we change the comparison.
This makes sure that we can handle all the data
owned by one pdu.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I2b9258cc5716d706c0fa38af609726c439708768
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467207
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This unifies buffer management among transports further and is a
preparation to make buffer allocation asynchronous.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I8c588eeac4081f50fe32605feb7352f72c628d95
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466847
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
I/O buffer cache is per transport_poll_group now. Hence moving
pending_data_buf_queue from struct spdk_nvmf_fc_conn to struct
spdk_nvmf_fc_poll_group is reasonable and do it in this patch.
This change is based on RDMA and TCP transport.
Further unification among transports will be done in subsequent
patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic857046be8da238cb3ff9e89b83cdac5f6349bcf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466844
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The pointer to transport is set to struct nvmf_transport_poll_group
in nvmf_transport_poll_group_create() after returning
nvmf_fc_poll_group_create(). Hence use it and remove ftransport pointer
from struct nvmf_fc_poll_group.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9f2b2ade77afa18d0e97949fc0c2403eb000cdad
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467060
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
RDMA transport have used rtransport and TCP transport have used
ttransport, respectively. So FC transport changes to use ftransport
instead of fc_transport.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7d98eb2f6efbae7e2b4784f31b9de5e1a81bc2ac
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467059
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Both RDMA and TCP transport have uesd group for such case. Hence
FC transport changes to use group instead of tp_poll_group.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic4b401179da506bb204c3ec48650db87f91fe72a
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466843
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The pointer to nvmf_poll_group is set in nvmf_transport_poll_group_create()
after returning nvmf_fc_poll_group_create(). Hence holding it into
struct spdk_nvmf_fc_poll_group is duplicated and can be removed.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7087c5cdb94b0b0c5f51b0b63b631c08266c90d0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466842
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
RDMA transport have used rgroup and TCP transport have used tgroup
for such case. Hence FC transport changes to use fgroup instead of
fc_poll_group.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I91b7ad6a1c6e45caf92801b0635b18d48b3c9810
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466841
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Keeping a global discovery log page was meant to be a time saving
mechanism, but in the current implementation, it doesn't work properly,
and can cause undesirable behavior and potential crashes. There are two
main problems with keeping a global log page.
1. Admin qpairs can be assigned to any SPDK thread. This means that when
multiple initiators connect to the host and request the discovery log,
they can both be running through the spdk_nvmf_ctrlr_get_log_page
function at the same time. In the event that the discovery generation
counter is incremented while these accesses are occurring, it can cause
one or both of the threads to update the log at the same time. This
results in both logs trying to free the old log page (double free) and
set their log as the new one (possible memory leak).
2. The second problem is that each host is supposed to get a unique
discovery log based on the subsystems to which they have access.
Currently the code relies on whether the discovery log page offset in
the request is equal to 0 to determine if it should load a new discovery
log page or use the cached one. This is inherently faulty because it
relies on initiator provided value to determine what information to
provide from the log page. An initiator could easily send a discovery
request with an offset greater than 0 on purpose to procure most of a
log page provided to another host.
Overall, I think it's safest to not cache the log page at all anymore
and rely on a thread local fresh log page each time.
Reported-by: Curt Bruns <curt.e.bruns@intel.com>
Change-Id: Ib048e26f139927d888fed7019e0deec346359582
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466839
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Most variables related with I/O buffer are in struct spdk_nvmf_request
now. So we can pass nvmf_request instead of nvmf_rdma_request to
nvmf_rdma_request_fill_buffers and do it in this patch.
Additionally, we use the cached pointer to nvmf_request in
spdk_nvmf_rdma_request_fill_iovs which is the caller to
nvmf_rdma_request_fill_buffers in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia7664e9688bd9fa157504b4f5075f79759d0e489
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466212
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Most variables related with I/O buffer are in struct spdk_nvmf_request
now. So we can pass nvmf_request instead of nvmf_tcp_req to
nvmf_tcp_req_fill_buffers and do it in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I00eff578a98891e99fcb9a3aafa3d99126d6f1c1
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466089
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Most variables related with I/O buffer are in struct spdk_nvmf_request
now. So we can pass nvmf_request instead of nvmf_fc_request to
nvmf_fc_request_fill_buffers and do it in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibe87e7641e5c364b20a6d877ce7928c612b0b83a
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466088
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This is a small performance optimization and an effor to unify I/O
buffer management further among transports.
it is ensured that the request is the first of STAILQ when
nvmf_fc_request_execute() completes successfully.
Hence change TAILQ_REMOVE to STAILQ_REMOVE_HEAD for the case.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If982842bf53ba00426a854a18eaadf8a1b8d642d
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466676
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This is an effort to unify I/O buffer management further among
transports. RDMA and TCP transport have named pending_queue
pending_data_buf_queue. So FC transport follows RDMA and TCP transport.
The next patch will change pending_data_buf_queue to use STAILQ
instead of TAILQ.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I57c3c678a1e92ec262eb8940418529a62b6768c3
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466675
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This is a small performance optimization and an effort to unify
I/O buffer management further among transports.
It is ensured that the request is the first of STAILQ when
spdk_nvmf_tcp_send_c2h_data() is called or the case
TCP_REQUEST_STATE_NEED_BUFFER is executed in spdk_nvmf_tcp_req_process().
Hence change TAILQ_REMOVE to STAILQ_REMOVE_HEAD for these two cases.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0b195874ac22a8d5ecfb283a9865d2615b7d5912
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466637
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In this patch, we directly point the hdr_p
to the memory owned by the pdu_recv_buf to avoid
memory copy.
Change-Id: Iee0dd98058928f429bf7ad22103cd4826226400f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465158
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
RDMA transport executes spdk_nvmf_rdma_request_parse_sgl() only if
the request is the first of the pending requests in the case
RDMA_REQUEST_STATE_NEED_BUFFER in the state machine
spdk_nvmf_rdma_requests_process().
This made RDMA transport possible to use STAILQ for pending requests
because STAILQ_REMOVE parses from head and is slow when the target is in
the middle of STAILQ.
On the other hand, TCP transport executes spdk_nvmf_tcp_req_parse_sgl()
even if the request is in the middle of the pending request in the case
TCP_REQUEST_STATE_NEED_BUFFER in the state machine
spdk_nvmf_tcp_req_process() if the request has in-capsule data.
Hence TCP transport have used TAILQ for pending requests.
This patch removes the condition if the request has in-capsule data
from the case TCP_REQUEST_STATE_NEED_BUFFER.
The purpose of this patch is to unify I/O buffer management further.
Performance degradation was not observed even after this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idc97fe20f7013ca66fd58587773edb81ef7cbbfc
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466636
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Use spdk_nvmf_request_get_buffers() and spdk_nvmf_request_free_buffers(),
and then remove nvmf_fc_request_free_buffers() and nvmf_fc_request_get_buffers().
Set fc_req->data_from_pool to false after spdk_nvmf_request_free_buffers().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I046a642156411da3935bc2fa2c2816fc2e025147
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465877
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Use spdk_nvmf_request_get_buffers() and spdk_nvmf_request_free_buffers(),
and then remove spdk_nvmf_tcp_request_free_buffers() and
spdk_nvmf_tcp_request_get_buffers().
Set tcp_req->data_from_pool to false after spdk_nvmf_request_free_buffers().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I286b48149530c93784a4865b7215b5a33a4dd3c3
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465876
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Use spdk_nvmf_request_get_buffers() and spdk_nvmf_request_free_buffers(),
and then remove spdk_nvmf_rdma_request_free_buffers() and
nvmf_rdma_request_get_buffers().
Set rdma_req->data_from_pool to false after
spdk_nvmf_request_free_buffers().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie1fc4c261c3197c8299761655bf3138eebcea3bc
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465875
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds new APIs spdk_nvmf_request_get_buffers() and
spdk_nvmf_request_free_buffers() to be used among transports.
Subsequent patches will replace transport specific APIs by them.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib153e2c5806b7276915a0aa91179fe9dbcb2a1f0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a prepration to unify buffer management among transports.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6b1c208207ae3679619239db4e6e9a77b33291d0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466002
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a preparation to unify buffer management among transports.
struct spdk_nvmf_request already has SPDK_NVMF_MAX_SGL_ENTRIES (16) * 2
iovecs. Hence incresing the number of buffers twice will be no problem.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idb525abbf35dc9f4b8547b785b5dfa77d106d8c9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465873
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
One of stop conditions in data WR release function was wrong. This
can cause release of uncompleted data WRs. Release of WRs that are
not yet completed leads to different side-effects, up to data
corruption.
The issue was introduced with send WR batching feature in commit
9d63933b7f.
This patch fixes stop condition and contains some refactoring to
simplify WR release function.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie79f64da345e38038f16a0210bef240f63af325b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466029
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Purpose: Reduce the recv/readv system call.
Method: Use a big recv buffer to conduct the read.
Though it will introduce addtional buffer copy,
we hope that the overhead introduced by buffer copy will
be smaller compared with frequent recv/readv system call overhead.
And the design is to make a trade off between them.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9286fd9cec0b512cea8e3f2c335c5bf862b98573
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464842
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Purpose: Prepare the further optimnization in the
target side whening receving pdu headers, we expect
to use zero copy.
Change-Id: Iae7f9106844736d7160d39d0af1f5941084422ec
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465380
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This follows the practice of RDMA transport and is a preparation to
unify buffer allocation among transports.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib85625f2a0eca01ef4028685dd838d6c41faad7b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465872
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This follows the practice of RDMA transport and a preparation to
unify buffer management among transports.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4e9b81b2bec813935064a6d49109b6a0365cb950
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465871
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a preparation to the next patch to use spdk_mempool_get_bulk.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I28a5ad941004f139c9032d85c2ef92680081f1ce
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465870
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This follows the practice of RDMA transport and is a preparation to
unify buffer allocation among transports.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I3cd4377ae31e47bbde697837be2d9bc1b1b582f1
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465869
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
FC transport can use buffer cache as same as RDMA and TCP transport
now. The next patch will factor out getting buffers and filling
buffers to iovs in nvmf_fc_request_alloc_buffers().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0d7b4552f6ba053ba8fb5b3ca8fe7657b86f9984
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465868
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is a preparation to the next patch to use buffer cache in
FC transport.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I116b064ea0b0a437f9a3293a6f3d46a0e5fc8ecf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465867
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This follows the practice of RDMA transport and a preparation to
unify buffer management among transport types.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic7dc8e6b826baf7f471d192630e8a048a35056ac
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465866
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
NVMe-oF FC transport have used its own buffer pool and have not used
common buffer pool yet.
It looks that there is no particular reason to prevent FC transport
from using the common buffer pool.
This patch removes FC transport specific buffer pool and changes
FC transport to use common buffer pool instead. Add transport
as a parameter to nvmf_fc_request_free_buffers() because similar
APIs of RDMA and TCP transport do that.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iae3a117466c21eaddbe78a8e8023d80ef37bb3e9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465865
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
NVMe-oF FC transport have used its own buffer pool and have not used
common buffer pool yet.
It looks that there is no particular reason to prevent FC transport
from using the common buffer pool.
This patch extract checking fc_req->data_from_pool from
nvmf_fc_request_free_buffers() to make the transition easier.
fc_req->req.iovcnt and fc_req->req.data should be cleared regardless
of fc_req->data_from_pool. Hence extract them into callees.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I36420f0e573d1ec3f9f3a75f6b2ced82ade89dd3
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465864
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
NVMe-oF FC transport have used its own buffer pool and have not used
common buffer pool yet.
It looks that there is no particular reason to prevent FC transport
from using the common buffer pool.
This patch adjust the setting of the FC transport specific buffer pool
to the common buffer pool to make the transition easier.
Large alignment requirement consumes more memory but is acceptable.
Cache size calculation looks dated.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id3224b65f39187c4d8e99c00cf54b1cfdd902250
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465863
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
All of the RPCs in lib/nvmf/nvmf_rpc.c rely on knowing which nvmf_tgt
they should work with. They have historically relied on the assumption
that there will only be a single target in a given application. This is
true for the example application in the spdk repo, but it is not
necessarily true generally,
By adding an option tgt_name parameter to the RPCs we enable them for
multi-target NVMe-oF applications. We also further reduce the coupling
between the library and the example application.
Change-Id: I03b6695da05a42af3024842ed87d2ce2c296f33f
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465442
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
There are one or two RPCs that deal with application specific
configuration. We can leave these there for now.
Change-Id: I9c40aa3403d32d3e2214c8c904fb1c414ad99967
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465365
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function will allow applications (and RPCs)
to obtain an spdk_nvmf_tgt pointer by name.
Change-Id: I82792e06a819e06d9fddb5429830008653d92cd1
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465349
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will provide a unique identifier which can be used to provide get
and set methods within the RPCs.
Change-Id: Idd144e99e49b8d26530f60530d2e908b18fa251b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465330
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is necessary to allow the spdk_nvmf_tgt structure to evolve over
time without having to further change the target API.
Change-Id: Ib0f0f9b1f190913feff0229c96df4e84b1bf35f7
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465363
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
As part of moving the nvmf rpc code to the library, we will need to make
it more inclusive of use cases outside of the example spdk nvmf_tgt
application. That application only supports a single nvmf target
structure. As such, many of the RPCs have this assumption built into
them.
In order to enable the multi-target use case, we need to configure a way
to translate between user supplied RPCs and actual target objects in the
library.
Change-Id: I5d3745afe9c2ca1c33f6e1a1bcc2b8bb3196ccd6
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465329
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Purpose: To reduce the duplicated code.
And one minor fix: add an empty line between two functions
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I12c9ddba6526c094cd2bd945e14f9d8bf5209adf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464504
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
From rdma_cma.h "Users must destroy any QP associated with an
rdma_cm_id before destroying the ID."
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I5ed0c25221c5401cdde8b31a4e217b9d79e7caaa
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464290
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Purpose:
1 Do not caculated the psh_len every time.
2 Small fix, for ch_valid_bypes, and psh_valid_bytes,
we do not need to use uin32_t.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9b643da4b0ebabdfe50f30e9e0a738fe95beb159
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464253
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We should be checking directly against the base of the iov when doing
memory map translations. The current behavior is to check against the
starting address of the buffer which is a close address, but not exactly
the same.
Change-Id: I7f65224a6836a814708438f2866d84ae22882216
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463893
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: <jiandong.zheng@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In case of failure during resource allocation within poll_group_create
there is a lack of return statement which could lead to NULL ptr
dereference.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I84abe64a1843117d76b97e62656bdfc4fe2b35d8
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463195
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_sock_group_poll() and spdk_sock_group_poll_count() had returned
0 on success. The implementation didn't match the specification
described in the header file, and couldn't be used to collect stats
correctly because 0 means idle.
This patch fixes the return value of spdk_sock_group_poll() and
spdk_sock_group_poll_count() to return number of events and
the callers not to overwrite the return value by 0.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7e2a17187fc74ea44d3acf2f35d63f5e5a254eda
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463710
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
NVMf statistics functions use spdk_get_io_channel function to get a
poll group. It increases reference counter in io channel and causes
problems on application exit. spdk_put_io_channel calls were added to
release the channel.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I832d1eae346c3bc3858ed0ed063ff7a7a897a2f5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463389
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
- New files and updates to existing SPDK files to add the NVMf-FC transport.
- Depends on an existing low level driver library. This driver is not part of SPDK repository.
- Makefile updates to build FC transport (using CONFIG_FC)
- Update configure script for FC build.
- New FC unit test for FC-LS commands.
- Update unittest.sh to run FC unit test (when built).
Signed-off-by: John Barnard <john.barnard@broadcom.com>
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Change-Id: If31d4d25feab76c2dbe90a7faf71d465c2c3a354
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450077
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Phenomenon:
Test case: Using the following command to test
./test/nvmf/target/shutdown.sh --iso --transport=tcp
without this patch, it will cause coredump.
The error is that the NVMe/TCP request in data buffer
waiting list has "FREE" state.
We do not need call this function in
spdk_nvmf_tcp_qpair_flush_pdus_internal, it causes the
bug during shutdown test since it will call the function
recursively, and it does not work for the shutdown path.
There are two possible recursive calls:
(1)spdk_nvmf_tcp_qpair_flush_pdus_internal ->
spdk_nvmf_tcp_qpair_process_pending ->
spdk_nvmf_tcp_qpair_flush_pdus_internal ->
>..
(2) spdk_nvmf_tcp_qpair_flush_pdus_internal->
pdu completion (pdu->cb)
->..
-> spdk_nvmf_tcp_qpair_flush_pdus_internal.
And we need to move the processing for NVMe/TCP requests
which are waiting buffer in another function to handle
in order to avoid the complicated possbile recursive
function calls. (Previously, we found the simliar
issue in spdk_nvmf_tcp_qpair_flush_pdus_internal for
pdu sending handling)
But we cannot remove this feature,
otherwise, the initiator will hang for waiting the
I/O. So we add the same functionality in spdk_nvmf_tcp_poll_group_poll
function.
Purpose: To fix the NVMe/TCP shutdown issue.
And this patch also reables the test for shutdown and bdevio.
Change-Id: Ifa193faa3f685429dcba7557df5b311bd566e297
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462658
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch adds measurement of time request spends from the moment it
was polled till completion.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I1fcda68735f2210c5365dd06f26c10162e4ddf33
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/445291
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds statistics for pending state in NVMf RDMA subsytem
which may help to detect lack of resources and adjust configuration
correctly.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I9560d931c0dfb469659be42e13b8302c52912420
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452300
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
RDMA polling statistics: number of polls and number of completion
entries returned.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: Iabcf2cb6f6a35f595b89b58cdfcd177a637dda13
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/445289
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds transport part to nvmf_get_stats RPC method and basic
infrastructure to report NVMf transport specific statistics.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: Ie83b34f4ed932dd5f6d6e37897cf45228114bd88
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452299
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Global tgt->discovery_log_page may contain old hostnqn log
page, so we will update the discovery log page if the offset
is zero.
Change-Id: Iba24409b16626d157d2782c6813fe5a0c27f1082
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463123
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <shahar.salzman@kaminario.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Initiator can use `nvme discover` command to display all
the subsystem's information, because we don't check
the allowed HOSTNQN for Discovery service, so here
adding this feature so that only return the log pages
to the allowed hosts.
Fix issue #576.
Change-Id: I51e6770bd67ea0b41caf9de3a8899923377e6255
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462440
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: yidong0635 <dongx.yi@intel.com>
When creating a new controller in the NVMe-oF target, hostnqn is
a must parameter, so we save the hostnqn to controller data
structure, and it can be used to verify the access right of
Discovery service.
Change-Id: I86a6f50d3209d5bbb8ac85508288173d826ea216
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462439
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: yidong0635 <dongx.yi@intel.com>
qpair structure is freed and an error code is returned to the caller in the case of failed qpair initialization in function spdk_nvmf_rdma_qpair_initialize (e.g. bad return value of rdma_create_qp).
The return code is handled by nvmf_tgt_poll_group_add function which destroys the qpair for the second time.
This patch fixes#857
Change-Id: I0773652ecccbbd634ad272106e0a93c1e591d7d2
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462011
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Lorne Li <lorneli@163.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Func spdk_nvmf_rdma_destroy_defunct_qpair is a "last chance option"
to destroy qp manually if some driver/hardware doesn't drain qp's
failed wr as expected.
There's a probability that ibv_poll_cq polls wr of the destoryed qp
after spdk_nvmf_rdma_destroy_defunct_qpair's execution. Although in
practice the risk of this situation is minimal(if not non-existent),
add a log here so that we could detect this situation easily.
Change-Id: Ifa9534397513bcea34c18fbb8168eef8f53599c1
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462441
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently rqpair will be destroyed directly in ibv_poll_cq path
if it has been drained, regardless of whether there are outstanding
I/Os issued to bdev layer. So after outstanding I/Os completing,
spdk_nvmf_rdma_close_qpair will be called from nvmf layer, accessing
a destroyed qp.
This path defers qp destruction in nvmf_rdma_destroy_drained_qpair
func until nvmf layer closes qp.
Fixes 851
Change-Id: I8bcce66f8053ddb105702ac603d5d73af54bdcfc
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461237
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
move the staement location of TCP request setting and remove
the duplicated code.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ia659756185547ff4f8aa26c5bc01f63defe6c113
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462589
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This priority is used to differentiate the sock priority on the TCP connections
between NVMe-oF TCP target and other TCP based applications.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I6ee294e647420b56d1d91a07c2e37bf34ce24e03
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461801
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
spdk_dma_*malloc() is about to be deprecated.
Change-Id: I5bcac50baca785255eb068086e67c07d120b042f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459432
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_dma_*malloc() is about to be deprecated.
Change-Id: Ic42db528bbae4b3ca2e91cb9ac46def99ecb5f28
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459431
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
No need to have it under lock. Additionally in case of failure
there was a lack of rdma_destroy_id(). This is addresed within this
change as well.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Idbb36d51ad4ef7ef81051463f56efc87ef00c966
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462054
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In case of failure during pd or map allocation freeing list of devices
was missing.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: If62f7b072f3894fd1a7e856c19b4ea51646dd20e
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462079
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In case of pd allocation by nvmf hooks there is a lack of null
check as oposed to pd allocation by ibv_alloc_pd.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Iead6e0332bdee3da4adb6e657af298215c4e2196
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461576
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This patch adds statistics for BDEV IO pending state in NVMf subsytem
which may help to detect lack of resources and configure pool size
correctly.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I6c60c27efe3efed194b2d2c46a707af7c2808fe9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/445290
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This patch adds number of admin and IO queue pairs per poll group in
NVMf statistics. It can be useful to troubleshoot load sharing issues.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I2a9c0fc99cf5d0729eb130d30540ae52b5207fc9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/445288
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This patch adds nvmf_get_stats RPC method and basic infrastructure to
report NVMf global and per poll group statistics in JSON format.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I13b83e28b75a02bc1dcb7b95cbce52ae10ff0f7b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452298
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Now that the resume path can correctly handle the case where a namespace
was removed and a new one added with the same nsid, this no longer needs
to be asynchronous.
Change-Id: I693045e66a7d4e75255b526d8f5ca5ef8695533e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459606
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Set DIF context of the corresponding request to PDU when
- processing in-capsule data of the command,
- processing data of C2H PDU, or
- processing data of H2C PDU.
Change-Id: I3a668a55be21dbe2ee6ecf26476290670bd7b4a8
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458929
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
When NVMe/TCP initiator transfers in-capsule data, NVMe/TCP has to
process it as in-capsule data. If DIF insert/strip is enabled,
in-capsule data size will be increased by NVMe/TCP target to insert
metadata. However size of in-capsule data buffer had not been
increased, and buffer overflow occurred when NVMe/TCP initiator
transfers in-capsule data to NVMe/TCP target with DIF insert/strip
being enabled.
This patch increases size of in-capsule data buffer size to store
metadata. 16 byte metadata per 512 byte data block is the current
maximum ratio of metadata per block.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I88b127efd7a945bde167a95df19a0b9175cb8cd0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461333
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
We updated readv_offset before generating DIF to avoid adding
the temporary variable _rc in the previous patch, but that caused
write error when inserting DIF.
Fix the bug in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id0788280a83cbea2554c851db77751432fc00cba
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461116
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When handling the capsule command header, call spdk_nvmf_request_get_dif_ctx
by passing the NVMf request and the reference to the DIF context, and set
the flag dif_insert_or_strip of the NVMf/TCP request to true.
spdk_nvmf_request_get_dif_ctx returns false immediately when the
corresponding NVMf controller disables DIF insert/strip.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I16f6b322f2692d5f9653d011a490e7929ec37365
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458928
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
When the NVMf controller's flag dif_insert_or_strip is enabled, DIF is
inserted for write I/O and stripped for read I/O, and the corresponding
NVMe-oF initiator should not be aware of the DIF setting of the
backend bdev.
Hence this patch hides the DIF setting of the backend bdev
when the flag dif_insert_or_strip is enabled.
Change-Id: I3c14880c2e94cba7f76b1bca78afb36bfe884e26
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456731
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The first idea was that the caller of spdk_nvmf_request_get_dif_ctx()
should check if the current transport enables DIF insert/strip before
calling spdk_nvmf_request_get_dif_ctx().
But NVMf controller knows if DIF/insert/strip is enabled now by the
previous patch. Hence spdk_nvmf_request_get_dif_ctx() checks if the NVMf
controller enables DIF insert/strip at its head.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I78253d356b694800c3a9a9608514df58e0c631a6
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461314
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Add a flag dif_insert_or_strip to struct spdk_nvmf_ctrlr that indicates
whether DIF insert/strip is done.
Copy the DIF insert/strip setting of the corresponding transport options
to the flag at NVMf controller creation.
The purpose of this patch is to make DIF insert/strip not per-transport
option but per-controller option because we may want to be able to
control DIF insert/strip per controller at some point. Besides this patch
will clean the implementation.
Besides align indent around the change.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I57f65960b430e55f4021ed514aacd85581ff9993
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This patch is used to do the following work:
1 It is optimized for NVMe/TCP transport. If the qpair's
socket has same NAPI_ID, then the qpair will be handled
by the same polling group.
2. We add a new connection scheduling strategy, named as
ConnectionScheduler in the configuration file. It will be
used to input different scheduler according to the customers'
input.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ifc9246eece0da69bdd39fd63bfdefff18be64132
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454550
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add the optimal poll group get function.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ia9e57c6924a6563d79269cf535814883e83698cd
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454549
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
By capturing this pointer onto the stack, we inform the compiler
that we don't expect it to change. That allows the compiler to
generate more efficient code.
Change-Id: I0f3ff9373662198e915269c4498e4902a2cdb808
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459754
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
This signals to the compiler and analysis programs that this
won't change during iteration, so it may produce better code.
Change-Id: I478c0c9445d4ddf8a69ab1b3deaf628b82a0eaea
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459753
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
This is a place holder and subsequent patches will use the option
dif_insert_or_strip and provide JSON RPCs to configure it.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7e3fbb1d49c47647a9a0a1a2149152801591b283
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456452
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add a helper function to get DIF context when the passed NVMf request
is for I/O queue, NVMe read, write, or compare command, and its NSID
is valid.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I796c20607c7b64a8be85da5131c5ea95ffd9f8e4
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458713
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Add a helper function to get necessary DIF information and set
them into the passed DIF context and return. This function will
be called only when the specific requirement is satisfied and
the caller will be added in the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic435886ca936a211f34278b813f547ffa43b9000
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458712
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
When DIF is inserted or stripped,
- in the TCP transport layer, we can use LBA based length throughout, but
- in the NVMf controller layer and BDEV layer, extended LBA based
length must be used, and NVMf controller gets the length from
tcp_req->req.length.
Hence by adding and using two variables, elba_length and orig_length
to struct spdk_nvmf_tcp_req, set the extended LBA length to
tcp_req->req.length before calling spdk_nvmf_request_exec(), and then
restore the original LBA based length to tcp_req->req.length after
calling spdk_nvmf_tcp_req_complete().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9309b8923c6386644c4fd8ef3ee83a19f5d21ce5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458926
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If tcp_req->dif_insert_or_strip, increase the length from LBA based
to extended LBA based by using its own DIF context.
Change-Id: Ie9f5cf757328dda795b43a7b6c70a72259865115
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458925
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The next patch will extend the length from LBA based to extended
LBA based and use it as buffer length to insert or strip DIF.
So cache sgl.unkeyed.length at the top of spdk_nvmf_tcp_req_parse_sgl
and use it throughout.
Besides, one unrelated change-the-line to improve the readability
is included.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2a1dc9379bb5671ec80b5b478504c9879a4f0fff
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458924
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Generate and insert DIF to each data block when reading more than a single
byte.
This update is very similar with the use case of spdk_dif_generate_stream
in iSCSI target.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I063919a32153ac0daf6d6eb1836c0d5995b65d33
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459092
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Load reservation information based on ptpl configuration file, and
restore the information to NS data structure.
Change-Id: I5f46d49a6d1e6e49aab93ca7cd654469a3a08659
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455912
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If DIF mode is local and C2H data is extended LBA payload, DIF should
be verified just before sending the payload.
Add a helper function nvmf_tcp_pdu_verify_dif and call it in
spdk_nvmf_tcp_send_c2h_data after completing nvme_tcp_pdu_set_data_buf.
When nvmf_tcp_pdu_verify_dif returns error, treat the error as fatal
transport error because the error is caused by the target itself.
Handle the fatal NVMe/TCP transport error by terminating the connection
as described in the NVMe specification.
On the other hand, data digest error is treated as a non-fatal transport
error because the error is caused outside the target. This is reasonable.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9680af2556c08f5888aeaf0a772097e4744182be
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458921
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
I used pahole to see whether the alignment of the structure
is reasonable. After reorgnization, we can saved 16 bytes and 1
cacheline according to the information by pahole.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I1347e7c582fe2b00707e2841690b87d53cc61e33
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460572
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Using naming rules consistent with other related libraries is helpful
to ensure the quality as verified by this patch series.
This patch changes a few parts to use iov and iovcnt for SGL operations.
Besides, name of an array points to the head of the array and is
constant. So copying name of array to an another pointer is not
necessary and can be removed.
Change-Id: I2324f28126b3088098c1c767cf6c060f22c175c3
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455629
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Previously we had used nvme_tcp_pdu_set_data() for incapsule data.
This patch changes handling incapsule data to use
nvme_tcp_pdu_set_data_buf() as same as H2C and C2H.
This unification is necessary to support DIF insert and strip
in NVMe/TCP target later.
Change-Id: I02cae8db94e51cf79a354dd64ad45f0e491ec08e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455920
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
NVMe/TCP target had assumed the size of each iovec was io_unit_size.
Using nvme_tcp_pdu_set_data_buf() instead removes the assumption
and supports any alignment transparently.
Hence this patch moves nvme_tcp_pdu_set_data_buf() to
include/spdk_internal/nvme_tcp.h and replaces the current code to use it.
Besides, this patch simplifies spdk_nvmf_tcp_calc_c2h_data_pdu_num()
because sum of iov_len of iovecs is equal to the variable length now.
We cannot separate code movement (lib/nvme/nvme_tcp.c to include/
spdk_internal/nvme_tcp.h) and code replacement (lib/nvmf/tcp.c)
because moved functions are static and compiler give warning if
they are not referenced in lib/nvmf/tcp.c.
The next patch will add UT code.
Change-Id: Iaece5639c6d9a41bd35ee4eb2b75220682dcecd1
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455625
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
And also add spdk_sock_group_get_ctx function
Change-Id: I2a2a58b0588ff7d99d3538ea0a633a3b8c7a234b
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454538
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Maximum size of NQN is already defined to be SPDK_NVMF_NQN_MAX_LEN,
and hence use fixed size string whose size is SPDK_NVMF_NQN_MAX_LEN
+ 1 for spdk_nvmf_vhost::nqn.
This change will reduce the potential malloc failure.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2b9c7cc21200b3e88b5485ebfdcd5040bc6e3589
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459742
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add file based reservation information definition, the data structure
can be used to store all the reservation information to a json
based configuration file, and enable this feature with REGISTER
command.
Change-Id: Ic93cfc5934a4ad96f11b96ec77bacb877edf6c10
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455909
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In our previous code, we will handle all the PDU until there is
no incoming data from the network if we can continue the loop.
However this is not quite fair when we handling multiple connections
in a polling group.
And this change is setting a maximal NVME/TCP PDU we can handle
for each conneciton, it can improve the performance. After some
tuing, 32 should be a good loop number. Our iSCSI target uses
16.
The following shows some performance data:
Configuration:
1 Command used in the initiator side:
./examples/nvme/perf/perf -r 'trtype:TCP adrfam:IPv4 traddr:192.168.4.11 trsvcid:4420'
-q 128 -o 4096 -w randrw -M 50 -t 10
2 target side, export 4 malloc bdev in a same subsystem
Result:
Before patch:
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MiB/s Average min max
TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 51554.20 201.38 2483.07 462.31 4158.45
TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 51533.00 201.30 2484.12 508.06 4464.07
TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 51630.20 201.68 2479.30 481.19 4120.83
TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 51700.70 201.96 2475.85 442.61 4018.67
========================================================
Total : 206418.10 806.32 2480.58 442.61 4464.07
After patch:
Starting thread on core 0
========================================================
Latency(us)
Device Information : IOPS MiB/s Average min max
TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 57445.30 224.40 2228.46 450.03 4231.23
TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 57529.50 224.72 2225.17 676.07 4251.76
TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 57524.80 224.71 2225.29 627.08 4193.28
TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 57476.50 224.52 2227.17 663.14 4205.12
========================================================
Total : 229976.10 898.34 2226.52 450.03 4251.76
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I86b7af1b669169eee2225de2d28c2cc313e7d905
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459572
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
By now (5.1 is released), the Linux kernel initiator supports the
success optimization and further, the version that doesn't support
it (5.0) was EOL-ed. As such, lets open it up @ spdk by default.
Doing so provides a notable performance improvement: running perf with
iodepth of 64, randread, two threads and block size of 512 bytes for 60s
("-q 64 -w randread -o 512 -c 0x5000 -t 60") over the VMA socket acceleration
library and null backing store, we got 730K IOPS with the success
optimization vs 550K without it.
IOPS MiB/s Average min max
549274.10 268.20 232.99 93.23 3256354.96
728117.57 355.53 175.76 85.93 14632.16
To allow for interop with older kernel initiators, we added
a config knob under which the success optimization can be
enabled or disabled.
Change-Id: Ia4c79f607f82c3563523ae3e07a67eac95b56dbb
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/457644
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
If users set the persist through power loss configuation file,
that means the Namespace has the capability to support ptpl
feature, here we added a ptpl_activated flag to indicate that
the users enable the feature or not. Users can use Set features
or Reservation Register commands to change the value.
Change-Id: Iae3fd44085c5be5bf9574e49efa567e8212dee20
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455906
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Vhost testing crashed from Nightly testing, because a member
access within null pointer of type 'struct ibv_send_wr'.
Change-Id: If8f34f23864883ea73516d2d1fe3b30137c04316
Signed-off-by: Hailiang Wang <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458913
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
SQSIZE parameter validation in Connect command was broken because QID
field in qpair was used before intialization.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I8a0b359937d661df3b9888e6084e7d0b4a9056ea
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455667
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
NVMe bdev module support separate metadata now but NVMf subsystem
cannot process bdev with separate metadata yet.
Hence reject any bdev with separate metadata to be attached
explicitly by this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I793c6c5f61deb766d7bf427ff67ccc57a48974cf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/457167
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
For reservation feature in NVMoF, we can't support the persist through
power loss feature, now we will add the configuration file parameter
with Namespace, after users set the configuration file parameter with
one NS, then the PTPL feature can be enabled.
Change-Id: Id72699093f7e68318b9529f7bacc5c9804f7f86b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455905
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Unsetting this flag will decrease the number of WRs retrieved during CQ polling and will decrease
the oeverall processing time. Since RDMA_WRITE operations are always paired with RDMA_SEND (response),
it is possible to track the number of outstanding WRs relying on the completed response WR.
Completed WRs of type RDMA_WR_TYPE_DATA are now always RDMA_READ operations.
The patch shows %2 better peformance for read operations on x86 machine. The performance was measured using perf with the following parameters:
-q 16 -o 4096 -w read -t 300 -c 2
with nvme null device, each measurement was done 4 times
avg IOPS (with patch): 865861.71
avg IOPS (master): 847958.77
avg latency (with patch): 18.46 [us]
avg latency (master): 18.85 [us]
Change-Id: Ifd3329fbd0e45dd5f27213b36b9444308660fc8b
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456469
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For the nvme device, I/Os are completed asynchronously. So we
need to check the outstanding I/Os before putting IO channel
when we hot remove the device. We should be sure that all the
I/Os have been completed when we change the sgroup->state to
PAUSED, so that we can update the subsystem.
Fix#615#755
Change-Id: I0f727a7bd0734fa9be1193e1f574892ab3e68b55
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452038
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
According to the TP 8000 spec in Page 26:
Maximum Number of Outstanding R2T (MAXR2T): Specifies the maximum
number of outstanding R2T PDUs for a command at any point in time
on the connection.
Note that by the spec, the target may only support single r2t
(which is the minimum possible), it doesn't have to use multiple r2ts
even if the initiator supports that. So remove the maxr2t and
pending_r2t variable in the tcp qpair structure.
In the original design, we think that maxr2t is the maximal active
r2t numbers for each connection. So if the initiator sends out maxr2t=16,
it means that all the commands of a qpair can use such number of R2T pdus.
So we need to wait for the available R2Ts for the request when the maxr2t
reaches the maximal value. But it is the wrong understanding of the spec.
In fact, each command has its own number of maximal r2t numbers, then we
do not need to use the wait method for R2T method anymore. So we remove
the state TCP_REQUEST_STATE_DATA_PENDING_FOR_R2T. Futhermore, we adjust
the related SPDK_TPOINT_ID definition.
In current patch, the target will support one active R2T for each
write NVMe command. Thus, we remove the function spdk_nvmf_tcp_handle_queued_r2t_req.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I7547b8facbc39139b4584637ccc51ba8b33ca285
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455763
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This matches the Linux kernel target. Users can
still decrease this default when creating the
transport (i.e. -p option for nvmf_create_transport
in rpc.py).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icad59350a2cd35cfc4ad76d06399345191680c05
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454820
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We could run into issues with this if we were using an arbitrarily large
amount of cores to run SPDK.
Change-Id: Ia7add027d7e6ef1ccb4a69ac328dbdf4f2751fd8
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452250
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will help us not iterate through the whole list of connections when
only some of them have pending recvs.
Change-Id: I681bc98befbdda4e77ef333b7a086c08b2708eb3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449266
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will help save MMIO overhead. Especially in the SRQ case.
Change-Id: I6fb70cf6de4763450f97961f41ccdce3acec2e63
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449265
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
By creating a list of qpairs, we can avoid looping over every connected
qpair to process sends each time we poll.
Change-Id: If24bbc363176f52fbfb756d56719edd885a21a11
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449264
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
By batching ibv sends each time we poll, we can reduce the number of
MMIO writes that we do.
Change-Id: Ia5a07b0037365abfa8732629c34d34a9ed49ac70
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449253
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It isn't obvious why hosts are being disconnected at the moment.
Change-Id: I5515ba40883ccb20921d0da013b27670212bf649
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453034
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There are cases where srq can be a detriment. Add a flag to allow users
to disable srq even if they have a piece of hardware that supports it.
Change-Id: Ia3be8e8c8e8463964e6ff1c02b07afbf4c3cc8f7
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452271
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The individual transports will adjust these sizes when
necessary. In fact, we have to remove this check, since
RDMA transport may adjust the io_unit_size based on the
max number of SGEs - and can adjust it to a value that
will fail this check if we reload the configuration.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2708c7f5aaa54a368ec932ec40dd6447f1a4fde0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452474
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This restriction helps reduce the amount of padding when
printing out the event trace, allowing it to fit in a
small number of columns.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifa31e5a6967c7b9bc7028069effb71533f80596f
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452736
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This was not used by any of the trace register descriptions.
Let's remove it rather keeping it around if we don't need it.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idda809e2911db5be555ff6aa13695484a14bf665
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452734
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
During connect call based on queue type (AQ or IOQ), SQ size should be
validated against max sq size for that particular queue type.
Change-Id: I977d7556e4d04e37004d16c87efffd3b467fa62c
Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452376
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We can use the rpoller->srq to check if a qpair is valid when processing
recv completions.
Change-Id: I6aa360adc48a3312ddcf79f10e2a65b502a7314f
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452247
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Mempools are based off of a ring structure which allocates its elements
as a power of two. It also only exposes n-1 elements to the user. So
when we create a mempool with 2^n elements in it, we have to allocate a
ring with 2^n+1 entries. By decreasing the number of elements in these
key mempools by 1, we can save a decent amount of memory.
Change-Id: I942c9dd4cf59096969bc2559fb46fd2084a07f09
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448875
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It used to be that we would get async events very infrequently. However,
with the introduction of SRQ, this number has gone up tremendously.
Change the way we report our these events so that we don't spam/confuse
people running the target.
Change-Id: I33070281fa854cbc17784d61bbbb870196ca8780
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452159
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If there are outstanding recvs for a qpair when it is destroyed, we need
to clear the qpair from it before reposting it. Otehrwise, we have a
potential heap-use-after-free of double free (depending on whether the
recv completion is in error state or not).
See github issues #730
Change-Id: Ic2009c761cbcc5e89174f62fbd0872d0489c67ca
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452122
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
At the RDMA level, we allow processing requests that should contain a
data transfer, but specify a length of zero to be passed up the stack
without a data buffer. See spdk_nvmf_rdma_request_get_xfer. In the case
of the reservation requests, we weren't checking whether req->data was
NULL before trying to copy into it causing us to segfault if we got a
malformed reservation request.
Found when using the fuzzer.
Change-Id: I320174ec72a8d298ab6ca44ef6a99691631f00ca
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451786
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It's not an error if the NVMe hard drive was formatted to 512 + 8 but
has no protection type, so we will also disable the protection for
NVMoF target.
Change-Id: I07e605cff9545f46c642f7ca783a4727a26abece
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451926
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We were setting this value in the target from our initiator, but it
turns out the rdma_conn_params struct is responsible for setting the
opposite side so we need to add it in the target side when accepting
connections.
Also, add a test to demonstrate target functionality when we overwhelm
the SRQ. It is useful to note that performance really tanks when you
start overwhelming the srq so it may be useful to use this test case to
check performance gains in edge cases over time.
Change-Id: Iac541bd9fc1d82eca9f21e7abc3f625663a6c460
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451678
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Make the use of spdk_uuid_compare() to be consistent in the file,
also change the SPDK_INFOLOG to SPDK_DEBUGLOG to avoid the
repeated log messages for RESERVATION CONFLICT response.
Change-Id: I72fefbd520cefcaf25182c3ca3d21e3d87d17e94
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450884
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Now Host can get an asynchronous event notification when
registrants were unregistered/preempted or reservation was
released from the associate namespace, Host can send
get log page to clear related log pages and reservation
report to get the full overview of current reservation
configuration.
Change-Id: Idc57c19812490c7536503308989871515e9f2361
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/439935
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In case of a DSM Deallocate (unmap) with multiple ranges, individual
bdev IOs are submitted for each range. If the bdev IO cannot be
allocated, the request is queued on io_wait_queue; however previously
submitted ranges may complete before memory is available for the next
range. In such a case, the completion callback will free unmap_ctx,
while the request is still queued for memory - causing a segfault
when the request is dequeued. To fix, introduce a new field tracking
the unmap ranges, and make sure the count is nonzero when the request
is queued for memory.
Signed-off-by: Yair Elharrar <yair@excelero.com>
Change-Id: Ifcac018f14af5ca408c7793ca9543c1e2d63b777
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447542
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We were incrementing over the end of the descriptor list and assigning
undefined values to the rsp opcode in SEND_WITH_INVAL case. We were only
hitting this error when mixing sgl and inline requests in the same
workload. We were just by chance hitting a four bit value that was set
to all 1s from the in capsule data from the last request.
Change-Id: Ied06356f3d22fa34a2cd869dfad6bdca8720791d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450873
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This was not being properly set in the multi-sgl path.
Also add a verification step to the fio configuration file to prevent
against future regressions.
Change-Id: I510b6acd92bc2fbc9b6fbec1d59945cc53584ad3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reservation notification log page can be returned via the
get log page command with correct page number, users can
get zeored page buffer if the controller didn't have any
reservation notification log.
Change-Id: I99f5e4b8917a6919eb68359628efa1bead4b21b5
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/439934
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
All the reservation commands are processed on subsystem's thread,
however the reservation notice log are controller related, and
the get log page command with reservation page will be processed
on controller's thread, so we use the same thread for generating
the log.
Change-Id: Ie000320d74242b979f6638d703523f063347ec29
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449852
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Existing code only update the subsystem's poll group reservation
information when unregistering the key, however, new registrant
and update the key actions also need to be updated.
Change-Id: Ib8db9eb457977757251403edb92eda073b846e59
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451274
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Rqpair qp and resources maybe not be created, if rqpair fail to
initialise. For example, in function new_qpair, the code run to
spdk_nvmf_qpair_disconnect, but rqpair is initialised in
poll_group_add.
Fix#557 segmentaion fault(core dump)
Change-Id: I1892e6d13e2d53dd5a7c4856d775f9b3b85da961
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450986
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Hailiang Wang <hailiangx.e.wang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If rsp->status.sc != SUCCESS and xfer == DATA_CONTROLLER_TO_HOST,
We would not send the data WR, so clean the num_outstanding_data_wr.
Fix#728
Change-Id: I32259788e495ed76f8f02a9d871bd56356d93dc4
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450726
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
A host can use the Asynchronous Event Command to be notified of
the presense of one or more avaiable reservation notification
log pages. A reservation notificaton log page should be created
whenever an unmasked reservation notification occurs.
Change-Id: I8b83e5319725286dd0a5efc1b22d8ac4673e31e1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/439931
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Nvmecli tool doesn't add parameter check when submitting
to NVMf target, so we add additional check in NVMf target
to prevent such cases.
Change-Id: Ieb2b3b3c22d71913f2743a0f9cdad4aba184c320
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Existing condition for updating subsystem poll group's reservation
information is wrong, when received the RELEASE command, the
reservation type may be changed to none, but it will not be
saved to the subsystem's poll group.
Change-Id: Idc177a0f03fb9611d6eda1e25a5b90caaa73d1be
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450727
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Should speed up operations, and allows us to remove the 16 byte link
object from the request structure.
Change-Id: Ie62df1f44d22580a7a7ae41c498295841d1e3064
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448080
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Purpose: To support the multiple SGL later.
Change-Id: I133a451100b736353cf98a6aaca879d290ff5b67
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448259
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function will be exteneded later for multiple SGL
support.
Change-Id: I1f6962ec03c72e335efaa311a12d3891312fcc53
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449968
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We were only using one enum from this whole struct, so there is no need
to store it. Plus the queries we use to update it are so infrequent and
only occur during connect and disconnect so I think we can save quite a
bit of space by removing this without compromising performance.
Change-Id: Icf29977a3c10cb289564fa2760a0059f07a0f8cb
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448072
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There was a lot of duplicated code here between states. I'm trying to
minimize the duplicated code without making it confusing.
Change-Id: I13183431e554c8a9f501b3385bbd7b59e2c83161
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448066
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Catch an edge case where a multi sgl request is longer than the allowed
transfer size.
Change-Id: I79779050fe951d16f1240e2c3d8cf5037e576ea2
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/440766
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is important to avoid thrash when we don't have enough buffers to
satisfy a request.
Change-Id: Id35fd492078b8e628c2118317f674f07e95d4dba
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449109
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The buffers are really specific to the request and not the wr or data
object. In the case of multiple wr requests, the maximum number of
buffers per req is equal to the number of SGEs in the NVMe-oF request
*2.
Change-Id: Ic59498bfed461d180adb2fb9a481ac5b11fa9252
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449108
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Enable parsing an nvmf request that contains an inline
nvme_sgl_last_segment_descriptor element. This is the next step
towards NVMe-oF SGL support in the NVMe-oF target.
Change-Id: Ia2f1f7054e0de8a9e2bfe4dabe6af4085e3f12c4
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/428745
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It's not standard to put a newline here - let's use a null
character instead.
Found while using nvme-cli - when creating a subsystem with
default serial number, the right justified callout text had
an extra newline in it.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8a81dafb4f6c30f7bf2dcebfa7a5b19cfe3ab5fc
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449645
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Changing i to iovcnt in all references to the req->iov structure will be
important when we start processing multi-sgl requests.
Change-Id: I90a9b6d872b94f846ae7d29a45dd2703eafa6175
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449201
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be used by the multi-sgl version of this function as well.
Change-Id: Iafeba4836a77482fa2a158f86f1c17fe7fdeb510
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449104
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The filter function can be used for IO commands, because all
the Admin commands related with reservations are not supported
in SPDK for now.
Change-Id: I44f0bf0017bafaee87d5f8ac03b0fd368f44c810
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436941
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This event is generated by NICs utilizing the SRQ feature when the last
RECV for that qpair is processed. I have confirmed this feature.
Change-Id: Ib6d6b6d02987f789b4d5dd3daf734e3351ee1974
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448063
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
With ASAN to run this cases, it will report issue about heap used after free
in spdk_nvmf_rdma_qpair_destroy. Resources have been released before,
change the order to in this tailq to release resources.
ERROR: AddressSanitizer: heap-use-after-free on address
0x6080000080e0 at pc 0x0000006e1e3f bp 0x7fd48b6c3df0 sp 0x7fd48b6c3de0
READ of size 8 at 0x6080000080e0 thread T3 (reactor_1)
0x6e1e3e in spdk_nvmf_rdma_qpair_destroy spdk/lib/nvmf/rdma.c:813
Change-Id: Ia1c12bca84955a2de60399e6b265c9b8901bb51e
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448534
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Now data structure spdk_nvmf_subsystem_pg_ns_info holds all the
reservation information from the associate namespace, so for the
IO processing routine we don't need to send a message to the
subsystem's thread to check the IO command is permited or not.
Change-Id: Ib6be6abf7bf5f24c230dff80c163a1eb963e20d0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448256
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Each subsystem's poll group will have a copy of namespace's
reservation information, for those NVMe commands which may
change the reservation state, the commnad itself should be
returned after updating each subsystem poll group's
reservation state. Then it's safe to check the reservation
state in each poll group's thread.
Change-Id: I64a5baedee9024bcac3957b29eb0330a20f21684
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446213
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Not doing so can cause us to hit asserts during the shutdown path. This
should fix an intermittent failure we are seeing on the test pool where
we hit the assert rdma_req->state != RDMA_REQUEST_STATE_FREE in
spdk_nvmf_rdma_request_process.
Note that this problem doesn't cause any data corruption when debug is
not enabled, it just causes us to probcess a subset of commands through
the state machine one extra time suring qpair shutdown.
Change-Id: Ibc36bfea87ec4089b8e2c7a915f48714fddb0b09
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447843
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will become important later on.
Change-Id: I94e5af03359e476afbc68664e43f44269ad5974c
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448074
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When we have a shared receive queue, the number of outstanding items
associated with a completion queue is deterministic, and limited by how
many RECVs we have total in the SRQ. So, we can set the total size of
the Completion queue at the beginning of time and never resize it.
Change-Id: I787e4c5bbd52ac8948a323d1301f926f887cd91c
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447492
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Consolidating error paths is common practice in SPDK so do that here to
make the function more uniform and save space.
Change-Id: I98c5d5f7feeb688f1d8b24f4d2d3461a43d00c1d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448191
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Array channels in the subsystem's poll group are indexed by
nsid - 1, so rename the previous num_channels to num_ms
makes more sense. Also embed the channels into a namespace
data structure here, and this can be reused in the following
patch.
Change-Id: If5d9aab4b1d5bcf7a3c22f29fa58d84752f0d4cc
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446211
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This unifies the clean up path between SRQ and normal
operation.
Change-Id: I396d7e3749579f27b5bb1e89b9d6761a77ba5beb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446979
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Depending on whether SRQ is enabled, resources may be allocated
to the rqpair or to the rpoller. Create a struct to hold these
pointers that can be used in both locations to avoid duplicated
code.
Change-Id: I2c8fc59009201d9e41721e6462a81732b529a9e0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446978
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Eugene Kochetov <evgeniik@mellanox.com>
This wasn't used anywhere.
Change-Id: I405af3c808be284d19218f3f04c1e90e33e31de8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446977
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The purpose is to use the single readv to read both
the payload the digest(if there is a possible one).
And this patch will be prepared to support the
multiple SGL in NVMe tcp transport later.
Change-Id: Ia30a5e0080b041a65461d2be13db4e0592a70305
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447670
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We were only using one pd per device anywas, and this is necessary for
shared receive queue support.
Change-Id: I86668d5b7256277fe50836863408af2215b5adf9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447385
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Both Mellanox and Soft-RoCE NICs work with this approach.
Change-Id: I7b05e54037761c4d5e58484e1c55934c47ac1ab9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446134
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Persist through power loss feature is not supported for now.
Change-Id: Id2a5088389dc28b9d28d88c04ff819d20ea11902
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436940
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For number of registered controllers field in Reservation
Status Data Structure, we caculate all the controllers
in the subsystem which Host Identifier are same with
existing registrants.
Change-Id: Ib4de22c7020dbd8294f448f23c0c5c8c142629dd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436939
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The possible issue could be following if you shutdown NVMe-oF target
with TCP transport as an example,
=================================================================
==61022==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 560 byte(s) in 1 object(s) allocated from:
#0 0x7ffff6efcfe0 in calloc (/lib64/libasan.so.3+0xc6fe0)
#1 0x4c6216 in spdk_nvmf_tcp_listen /home/ziyeyang/spdk/lib/nvmf/tcp.c:680
Indirect leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7ffff6efcfe0 in calloc (/lib64/libasan.so.3+0xc6fe0)
#1 0x4a77b8 in spdk_posix_sock_create /home/ziyeyang/spdk/lib/sock/posix/posix.c:291
After checking the issue, it seems that we did not call
spdk_nvmf_transport_stop_listen when removing the subsystem listener.
And this patch can solve this issue.
Change-Id: Ic75d99cb0c6a3ba1c47ac79a2d8e3887b0f6b012
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447020
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: yidong0635 <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The reservation holder may release the reservation on
a namespace, release notification feature is supported
in comming patches.
Change-Id: If5d3158e691fcc782f7cf0b67a326bf62edf0531
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436938
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Unregistering by a host may cause a reservation held by the host
to be released. If a host is the last remaining reservation holder
or is the only reservation holder, then the reservation is released
when the host unregisters. This may occur with Acquire/preempt
and Register/unregister commands.
Change-Id: If59fe2fdaa69c8ad70f364618d6c281494ad6245
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446821
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
A registrant can obtain a reservation on a namespace by executing
acquire command. Acquire command is associated with specific namespace.
For now only Acquire and Preempt reservation acquire action is
supported, Preempt And Abort will be supported in future.
Change-Id: Ifcbb6b414827393ffc266ceada5982b743716321
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436937
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reservations can be used by two or more hosts to coordinate
acccess to a shared namespace, host must register to a namespace
prior to establishing a reservation. Unregistering by a host
may cause a reservation release, this feature will be supported
after reservation acquire patch.
Change-Id: Id44aa1f82f30d9ecc5999a2a9a7c20b2af77774a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436936
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Borrow the ideas from iSCSI and optimize
the nvme_tcp_build_iovecs function.
Change-Id: I19b165b5f6dc34b4bf655157170dec5c2ce3e19a
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446836
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Not all RDMA drivers fail back the dummy recv and send operations that
we send to them when destroying a qpair. We still need to free the
resources from these qpairs to avoid eating up all of the system memory
after multiple connect and disconnect events. Since we won't be getting
any more completions, the best heuristic we can use is waiting a long
time and then freeing the resources.
qpair_fini is only called from the proper polling thread so we can safely
call process_pending to flush the qpair before closing it out.
Change-Id: I61e6931d7316d1e78bad26657bb671aa451e29f4
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/443057
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In the error path, we were first decrementing a variable and then
asserting that it must be >0. These operations should occur in the
opposite order.
Change-Id: I6cec544faf17bb75cbfca3d3a3c173dc5db14f99
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/446440
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: yidong0635 <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When the decision was made to uncouple the number of shared buffers from
the queue depth and allow the user to decide for themselves, the default
was also significantly lowered, which caused some issues when trying
torun performance tests (See https://github.com/spdk/spdk/issues/699).
While this is a user modifiable variable, it is still best to keep the
higher default value.
The original value was equivalent to max_queue_depth *
SPDK_NVMF_MAX_SGL_ENTRIES * 2 with the defaults for max_queue depth and
max_sgl_entries being 128 and 16 respectively. Hence 4096
fixes: 0b20f2e552
Change-Id: I809e97a10973093a2b485b85bca7160091166f70
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/446525
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
I think this simplifies the process a little bit.
Change-Id: Icc87a59c9f6fd965ef35531975b7036d85c4bc95
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445916
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We were only using one value from this array to tell us if the qpair was
idle or not. Remove this array and all of the functions that are no
longer needed after it is removed.
This series is aimed at reverting
fdec444aa8 which has been tied to
performance decreases on master.
Change-Id: Ia3627c1abd15baee8b16d07e436923d222e17ffe
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445336
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Since we no longer rely on the state queues for draining qpairs, we can
get rid of most of them. We cn keep just a few, and since we don't ever
remove arbitrary elements, we can use stailqs to perform those
operations. Operations on Stailqs carry about half the overhead as
operations on tailqs
Change-Id: I8f184e6269db853619a3581d387d97a795034798
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch expose backend's bdev's PI setting to the corresponding
NVMe-oF Initiator by Ideintify command, and removes the check if
block size is 512 multiple.
These change enables NVMe-oF Initiator to send extended LBA payload.
Change-Id: Ia7aa8332d36f056872a515b6da90c83112edb909
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/445056
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If the current recv_state of qpair is same with the state to be set,
we will print error message. And checked the current code,
we should add a check to avoid this.
Change-Id: I49334f637c48e565e785d1fe6d0f000e18b2048a
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445653
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Purpose: solve the coredump issue for the buffer
return later in spdk_nvmf_tcp_request_free_buffers.
If keep this statement, we cannot return the buffer
to the polling group.
Change-Id: Ib5c95ba54b37540950e654110fe6317cab507076
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/445435
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Error logs in nvmf_rdma_dump_request lead to report error about
address points to the zero page, add judgement to return.
this issue occurs in heavy load fio testing.
Change-Id: I50302be88b3af53f718e3800aa16df7c506ca4e8
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441110
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
From TP8000 spec 7.4.7,
"In response to a C2HTermReq PDU, the host shall terminate the connection.
If the host does not terminate the connection in an implementation specific
period that does not exceed 30 seconds, the controller may terminate the
connection on its own".
It means that the timeout is designed for: when the target is
sending out C2hTermReq, if the host does not terminate the connection,
the target should terminate the connection.
PS: For detecting the malicous connection without sending response
(such as no response of R2T PDU) which should be another patch.
Change-Id: I586dbb235d99aeab5d748a19b9128cd8b0cef183
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/440831
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The persistence feature can't support for now, but as the features
are mandatory for reservation, so add the two function here, and
we can enable it with future patches for power loss persist feature.
Change-Id: Ic358eda00058809bbfd6984b0861f8b6b5aabecd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438213
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When this structure was brought up to the generic layer, the tcp
transport was using max_io_size and the rdma transport was using
io_unit_size. In the interest of conserving memory, we should use
io_unit_size instead of max_io_size.
Change-Id: I2633306fcbfd8c3d557445959c745cb2d9a0999e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442778
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We should never be going over these limits in the respective transports,
but add asserts to check this during testing.
Change-Id: Ifcaa82ccf58546a38020b31df54ee5d1d9822b8b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442777
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This intermediate state is unused and meaningless. the qpair transitions
into this state right before calling a synchronous operation and then
transitions to active as soon as that operation completes successfully.
If the operation did not complete successfully, we were leaving qpairs
in this weird intermediate state when for all intents and purposes they
had reverted to an uninitialized state. Keeping qpairs in the
uninitialized state until they have been added to a poll group creates a
meaningful distinction between states that can be actionable from the
transport level.
Change-Id: I6de9bc424b393b6fff221aa2f4212aaa91488629
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443471
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Connections in the uninitialized state haven't been added to a poll
group yet, so submitting dummy requests to them will be pointless since
they will never be polled. We need to reject the connection and destroy
the qpair immediately.
Change-Id: Id5dd711882e1ae7c13ae32c06da2285186b00a1b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443470
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Since there are multiple events/conditions that can trigger a qpair
disconnection, we need to funnel them to a single point of entry. If
more than one of these events occurs, we can ignore all but the first
since once a disconnect starts, it can't be stopped.
Change-Id: I749c9087a25779fcd5e3fe6685583a610ad983d3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For devices that support fewer SGE elements than our default values, we
need to adjust the I/O unit size so that we don't ever try to submit
more SGLs than we are allowed to.
Change-Id: I316d88459380f28009cc8a3d9357e9c67b08e871
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This value was not being decremented when we got SEND completions for
write operations because we were using the recv send to indicate when we
had completed all writes associated with the request. I also erroneously
made the assumption that spdk_nvmf_rdma_request_parse_sgl would properly
reset this value to zero for all requests. However, for requests that
return SPDK_NVME_DATA_NONE rom spdk_nvmf_rdma_request_get_xfer, this
funxtion is skipped and the value is never reset. This can cause a
coherency issue on admin queues when we request multiple log files. When
the keep_alive request is resent, it can pick up an old rdma_req which
reports the wrong number of outstanding_wrs and it will permanently
increment the qpairs curr_send_depth.
This change decrements num_outstanding_data_wrs on writes, and also
resets that value when the request is freed to ensure that this problem
doesn't occur again.
Change-Id: I5866af97c946a0a58c30507499b43359fb6d0f64
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The typical rdma qpair disconnect function goes through the function
_nvmf_rdma_disconnect_retry. When this function was introduced, it was
discovered that we could receive a qpair disconnect event for a given
qpair before that qpair had been assigned to a poll group. In order to
ensure that the disconnect procedure completed properly, we waited on
the current thread in _nvmf_rdma_disconnect_retry for the qpair to be
assigned a poll group before we finally disconnected. see rdma.c:2250.
Since _nvmf_rdma_disconnect_retry was not necessarily called from the
poll group's thread, we relied upon the assumption that the group
variable would never be set back to NULL. See the comment on rdma.c:
2243.
However, in _spdk_nvmf_qpair_destroy we were setting the group back to
NULL. This operation can result in the following set of operations
across multiple threads that prevent a qpair from ever being fully
destroyed.
1. thread 1: receive a disconnect event - call nvmf_rdma_disconnect
2. thread 1: from nvmf_rdma_disconnect call
spdk_nvmf_rdma_qpair_inc_refcnt - setting rqpair->refcnt to 1.
3. thread 2: call spdk_nvmf_rdma_poller_poll.
4. thread 2: in spdk_nvmf_rdma_poller_poll reap a completion with an
error status which causes us to call spdk_nvmf_qpair_disconnect -
rdma:2846
5. thread 2: spdk_nvmf_qpair_disconnect calls _spdk_nvmf_qpair_destroy which sets
qpair->group = NULL
6. thread 1: from nvmf_rdma_disconnect we call
_nvmf_rdma_disconnect_retry which checks if qpair->group == NULL. If
that is the case, we assume that the qpair has not been assigned a group
yet and send ourself a message to call _nvmf_rdma_disconnect_retry again. see rdma.c:2253
7. thread 2: from _spdk_nvmf_qpair_destroy we call
spdk_nvmf_transport_qpair_fini which results in a call to
spdk_nvmf_rdma_close_qpair. which sends dummy send and recvs to the
qpair.
8. thread 2: we call poller_poll and get completions for both the send
and recv dummy requests. This results in a call to
spdk_nvmf_rdma_qpair_destroy.
9. thread 2: spdk_nvmf_rdma_qpair_destroy checks rqpair->refcnt and when
it sees that it does not = 0 (see step 2 above) it returns without
freeing the resources. see rdma.c:629
10. thread 1: we keep churning in _nvmf_rdma_disconnect_retry sending
ourselves messages because rqpair->group is going to be null. Thread 1
never reaches line 2257 where it sends a message to call
_nvmf_rdma_qpair_disconnect. _nvmf_rdma_qpair_disconnect is the function
that decreases the rqpair->refcnt and allows us to make forward progress
on destroying the qpair.
I encountered this issue while trying to disconnect from our target
using the kernel initiator with an x722 NIC. I think the timing on this
bug comes out with that specific configuration because come of the calls
in the disconnect path on thread 1 fail causing it to take longer giving
a chance to the second thread to delete the qpair.
There are really two issues at play here. We don't have a single point
of entry for disconnecting RDMA qpairs, and we rely on the qpair->group
variable never being set back to NULL. This patch addresses the second
issue, and the next patch in the series addresses the first.
Change-Id: I65395d0bbb67edfa7bad2ddc70906606c3d83781
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This doesn't fix any bug, but it makes more sense to leave the qpair
in the NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY state until it
receives at least one byte.
Change-Id: Ic5f34a733a80b58f65a1334fae7e07dbded2b3d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The management channel was used in the RDMA transport prior
to the introduction of poll groups and made its way over to
the TCP transport when it was written. Eliminate it in favor
of just using the poll group.
Change-Id: Icde631dd97a6a29190c4a4a6a10a0cb7c4f07a0e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442432
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
max_read_depth should be based on max_qp_init_read_atomic, or the
maximum number of read values that the initiator will accept as
outstanding.
The device attributes object contains values for both the initiator
(remote side) and the target (local side). All attributes with the name
init in them are meant to correspond to the initiator. The
qp_read_atomic value represents the number of reads and atomic
operations that can have this device as the target. qp_init_read_atomic
represents how many read operations the initiator has said that we can
have outstanding that have the initiator's rdma device as the target.
Since this number represents how many outstanding reads we will send to
the initiator at once, we should use the qp_init_read_atomic value.
Change-Id: Iacc044e8321080de8accd9128ac3777bbb948afc
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442409
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a holdover from before poll groups were introduced.
We just need a per-thread context for a set of connections,
so now that a poll group exists we can use that instead.
Change-Id: I1a91abf52dac6e77ea8505741519332548595c57
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442430
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The READ and ATOMIC in the comment above are capitalized, so
make this all caps too.
Change-Id: I49fae2ceb826b22953d9b26d42b95f17e2dac617
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442427
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
request.c didn't have much code, so let's collapse
it into ctrlr.c and make that the place where all
software emulator of the NVMe controller, including
request handling, is done.
Change-Id: Id7c98010cb222a414a5aa0b78bfb299a0ffc418f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440592
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously, all I/O commands were implemented by simply
passing them to the bdev layer. Now, some I/O commands will
be emulated. Prepare for that by moving the code for this
function to ctrlr.c, where the emulation will occur.
Change-Id: Id34e5549e5ce216d602fb347b4506fbd324eed4e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440591
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This was previously very unmap specific. Make at least the top level
DSM call more general purpose by eliminating the unmap_ctx.
Change-Id: I9c044263e9b7e4ce7613badc36b51d00b6957d3a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440590
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These are left over from the removal of virtual mode over a year ago.
Change-Id: Ia797c4570bf9090346ff22ab9c7d719a78d023d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440589
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This was only used by the target, and it didn't actually need it.
Change-Id: Ibcef410165efdc16077da24419580ed51b087d70
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442440
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This type was actually two entirely different types for
the initiator and the target, so just make it void.
Change-Id: I15512d9d4efd790dce0fa4323b7230de66144bc6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When a connection goes to close and has no I/O outstanding,
the current_recv_depth was being decremented beyond 0 and rolling over.
If the poll group then finds a successful receive completion on the next
poll (for a command that arrived prior to starting the disconnect but
hadn't been processed yet), it would trip the max queue depth check
added recently and start another disconnect process. If only one command
arrives in this window, everything actually works out ok.
However, if there are two receive completions sitting in the completion
queue after the disconnect process is started, the first one does the
double disconnect and the second one does another disconnect which ends
up dereferencing a null pointer.
Since there is always a special reserved slot for the dummy recv, don't
do decrements or increments of the current_recv_depth for the dummy
recv. This allows the code to still enforce the actual max_queue_depth
on recvs without underflowing or overflowing the counter.
Change-Id: I56c95b2424e956a3b007b25c50cbf47262245b8f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Since we have different requirements for submitting RDMA read and write
operations, we should track them separately so that we don't block
writes when the device does not have enough resources for read
operations.
Change-Id: I5d6424c0e26f2f5362866d1bb21eb46700c245da
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Before, the number of WRs and the number of RDMA requests were linked by
a constant multiple. This is no longer the case so we need to make sure
that we don't overshoot the limit of WRs for the qpair.
Change-Id: I0eac75e96c25d78d0656e4b22747f15902acdab7
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439573
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This gives us more realistic control over the number of requests we can
submit.
Change-Id: Ie717912685eaa56905c32d143c7887b636c1a9e9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441606
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
rw_depth was a misinterpretation of the spec. It is based on the value
of max_qp_rd_atom which only governs the number of read and atomic
operations. However, we were using rw_depth to block both read and write
operations which is an unnecessary restriction. write operations should
only be governed by the number of Work Requests posted to the send
queue. We currently guarantee that we will never overshoot the queue
depth for Work requests since they are embedded in the requests and
limited to a size of max_queue_depth.
Change-Id: Ib945ade4ef9a63420afce5af7e4852932345a460
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441165
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will be necessary later on when we need to throttle send and recv
requests in software.
Change-Id: Ifb25eaabd15e101fbfc2959a08a321f80857b280
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441604
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Both initiator and target are using the minium 10 seconds
timeout value, so set it in kas field when initializing
the controller.
Change-Id: Idda68bdfe27613ebaf706a0de497145d3f9ed766
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441995
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently, the code does not comply with the spec,
so remove such code for 19.01 and will add the code
which complies with the spec for 19.04
Change-Id: Icd3b2573fbc46dc2fa7a00c6672c23ea01ffe0ee
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/441985
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If there is socket read error, we should directly disconnect
the socket instead of set the tqpair into RECV_ERROR state.
When it is in ERROR_RECV state, it does not mean that
we should close the socket immediately.
Change-Id: I975906653c13eb3fa5195799c517015435176785
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/441830
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Assigned CQ size when creating CQ may run over due to
heavy workload with too many qpairs. Enlarge it dynamically
can prevent IBV_EVENT_CQ_ERR caused by CQ's runover.
This patch fixes issue #498:
https://github.com/spdk/spdk/issues/498
Change-Id: I6c2d7194d4147d812d49d4fe787fcba5c6bbede9
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440853
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This change was provided by GitHub user vikasbrcm to fix issue 562.
I am uploading his change to facilitate testing of the issues and
possibly get it merged before the 19.01 window closes.
Change-Id: I58fb1058f68c6c02006ceed6e577be627e6dbc09
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441611
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The controller shall treat a Keep Alive Timeout in the same manner
as connection loss. If the Keep Alive feature is in use and the
timer expires, then the controller shall:
1, stop processing commands and set the Controller Fatal Status
(CSTS,CFS) bit to '1';
2, terminate the NVMe Transport connection;
3, break the host to controller association;
A timer poller is added to each subsystem to monitor timeout event.
Change-Id: I001afab8a6764f30c39df37fa96384180d117486
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439330
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch will solve the following two cases:
1 Free the pdu resources. Add the checkout of c2h_pdu_data_cnt of the qpair.
2 Do not recyecle the req accoriding to the pdu in the send_queue, but directly
recylcing the reqs in TCP_REQUEST_STATE_TRANSFERRING_CONTROLLER_TO_HOST state.
Change-Id: I5856c3421019ec49d576d3dae4c62fefbb3925ca
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/440847
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Fix potential bug. In _spdk_nvmf_subsystem_add_ctrlr(), befor free(
ctrlr) we should free ctrlr->qpair_mask. Because we set qpair->ctrlr
= NULL, when destroy qpair the qpair_mask is not released. For the same
reason, req->qpair->ctlr = ctrlr is placed at the bottom of the function.
Change-Id: I38e268b532ff3ce87721c02f15ac4f674856d103
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440858
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This change is related to enabling multi-sgl element support in
the NVMe-oF target.
For single SGL use cases, there is a 1:1 relationship between
rdma_requests and ibv_wrs used to transfer the data associated with
the request. In the ingle SGL case that ibv_wr is embedded inside of
the spdk_nvmf_rdma_request structure as part of an rdma_request_data
structure.
However, with Multi-SGL element support, we require multiple
ibv_wrs per rdma_request. Insted of embedding these
structures inside of the rdma_request and bloating up that object, I
opted to leave the first one embedded in the object and create a pool
that requests can pull from in the Multi-SGL path.
By leaving the first request_data object embedded in the rdma_request
structure, we avoid adding the latency of requesting a mempool object
in the basic cases.
Change-Id: I7282242f1e34a32eb59b55f326a6c331d455625e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/428561
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Purpose: To avoid the buffer contention among different
polling groups if there are multiple core configurations
for NVMe-oF tcp transport.
Change-Id: I1c1b0126f3aad28f339ec8bf9355282e08cfa8db
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/440444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It was possible to leak pollers if we had multiple devices in the
transport. The new err_exit path fixes this.
Change-Id: Iafd5643c67fae741113f10afe761af1988cb6a9b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439419
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is implemented at a generic level.
Change-Id: Ibf8167e828f8da27cc26cd04e611c3f3c084319a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440418
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is used to dump the requests state if
the tqpair's resource is not freed.
Change-Id: Ic4780662558d73267d4f1ebabfc22780fafec4ec
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440846
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This is shared between all currently valid transports. Just move it up
to the generic structure. This will make implementing more shared
features on top of this a lot easier.
Change-Id: Ia896edcb7555903ba97adf862bc8d44228df2d36
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440416
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch series is geared at solving github issue 555.
Ultimately the goal of this series is to add a per-poll-group buffer
cache to prevent starvation.
Change-Id: I8ddaa47487665c2f9adce2109eb71b8fa71a7927
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439415
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is an attempt to clean up requests sititng in the
waiting_for_buffer state before destroying it for good.
Change-Id: I8ae047e4d7fd01f30419ae346e4da49355dc033d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440127
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This allows us to drain all of the pending requests from the qpairs
before we destroy them, preventing them from being picked up on
subsequent process_pending polls.
Change-Id: I149deff437b4c1764fabf542cdd25dd067a8713a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440428
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Due to qpair timeout handling refactoring,
we removed the qpair destroying related code.
And this patch is submitted to address this issue. With
this patch, we can detect sock close of the fd from
the initiator, and correctly free the qpair related resource
(e.g., pid) managed by nvmf layer.
Otherwise, the initatior thinks the qpair related source is
freed, however it is not freed in the target side.
Change-Id: Ia2de07bd849fa5d3bc0e0e0d4941464dfd16d266
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440242
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The purpose of this patch is to remove the duplicated code
used in spdk_nvmf_rdma_request_free
Change-Id: I3f74466a7ec788000eff9c2a75c9ea2cacaf5cc2
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/439942
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The nsid field can be used for per namespace basis
reservation notification.
Change-Id: Ia7212020ec893ea367afe79933e1629895fe41b8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439930
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Observed some issues related with AER in the testpool,
which states that the subsystem is not ready. So change
the check, which will be more accurate. We only did not
allow the subsystem in inactive state or deactivitating
state. For others, we can still queue the requests.
Change-Id: Ic041298dfc5f7d7bfab5f5e5314ade377273df32
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439797
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When the host connects the target and does the io related job,
if we use ctrlr + c, it will be crash. The issue
is that we found the rqpair->qpair.group is NULL.
Change-Id: Id36cfac2be9abc707bf75a2e1ddb3f414610b6f1
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/437232
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This operation is not attached to a send request so we need to put the
request into the completed state right away since there is no send
associated with it during the draining process.
Change-Id: I294f99950b00a584d8940bb4f93ac046c478d3b3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439437
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We found ibv state value may be unreasonable, so before we
use the state value we do some judgement. The unreasonable
state probably means hardware issue, so the process flow
become unpredicatable.
Fix GitHub issue #508.
Change-Id: I213f4d684b103cce7bc072aecd591e2c491e0596
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/436920
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If this happens, we have something going seriously wrong and we need as
much debug information as we can get.
Change-Id: I305512790461443316b9f231fa2afeb69593af1b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438097
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We do not want to present those subsystems which are not
ready.
Change-Id: I7f5c171fbac4c31d839421e37e93e62569c0e87a
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/437222
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The least needed data buffer number should only
be larger for completing one RDMA (read/write RDMA).
Change-Id: I44eb51db279fc055f687eb78b6a642dbb5cb23f3
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/437808
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Previously, we allocate the buffer size according
to the MaxQueueDepth info, however this is not exactly
a good way for customers to configure, we should provided
a shared buffer number configuration for the transport.
Change-Id: Ic6ff83076a65e77ec7376688ffb3737fd899057c
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/437450
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This makes the timeout check for each qpair in the group
efficient. If there are many qpairs in the group, we
can scale.
Change-Id: I75c29a92107dc32377a2ef7edb5ac92868f1c5df
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/435277
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reason: I checked the code in different transport,
the qpair is already freed, so we dot need to set
any state.
Change-Id: I3d78c259c3f79ea4426dc9408e5c3469bc171358
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/437493
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Remove the unnessary fields in spdk_nvmf_tcp_transport
Change-Id: I632608ba654b30f3511f5e1d925c6743c9100365
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/437271
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
At least in case of RDMA transport, poll_group_create (spdk_nvmf_rdma_poll_group_create)
can return error (NULL).
Change-Id: If1576b3515e7f9ede76af08bfa6b1c8399dcda09
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/436887
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Check for QP reference counter in RDMA QP destroy function was wrong
and QP resources were never released.
Change-Id: I6ab0ce39452e8263f89589d138c90f749516ebb1
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-on: https://review.gerrithub.io/436974
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
The ctrlr may be NULL, so we need to add a check here
to present segment fault.
Change-Id: I6c5361cc829af065082a95df0b8cc2f8d49a6002
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/436950
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For TCP/IP transport, we need to remove the socket
from the polling group since we do not want to keep the
tgroup info in the NVMe/TCP qpair, it should be general.
Change-Id: I4b064d8378f66ea5d91ac554fe628d9ccebd07f4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/434128
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since we already make the recv state handling in a correct
way, so we do not need this check any more.
Change-Id: Id71ab2e0ef60be302f8cf6ea776259d7312663ec
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/436896
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is a failsafe for finding and reporting data buffers that span
multiple Memory Regions. These errors should never be triggered, but
finding and reporting them will help any debugging.
Change-Id: I3c61e3cc510f5a36039fc1815ff0de45fce794d5
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/436054
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Discover commands previously blindly used the nvmf_req->data structure.
This only works if the entire command fits in a single contiguous
buffer. commit 1d9be84bfd changed the default buffer size such that
this would become a problem for as few as 8 subsystems.
Fixes github issue 525
This change may also help prevent data corruption as we were copying up
to nvmf_req->length data into the buffer. For requests with multiple
data buffers this can cause us to copy off the end of that buffer.
Change-Id: I788259da988b2458f57ee2795e1c5d3ced8803dd
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/435544
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The purpose of this patch is to fix the issue when there is no
data buffer allocated, the previous method is wrong to set the
recv pdu state.
The reason is that:
1 When there is no data buffer allocated, we still need to handle
the incoming pdu. It means that we should switch the pdu recv
state immedidately.
2 And when there is a buffer, we resume the req handling with the
allocated buffer, that time we should not switch the pdu receving
state of the tqpair.
Change-Id: I1cc2723acc7b0a17407c3a2e6273313a4e612916
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/436153
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The usage of this list is duplicated with
the state_queue[TCP_REQUEST_STATE_DATA_PENDING_FOR_R2T]
list of tqpair, so remove it.
Change-Id: I7a67a5c8049bb9492bf97e0d60e2040a29b0a7e4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/436274
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Fix the issue in both target and host sides.
Change-Id: I1bf31072b2164a3035b443fe6c5418a6a7829d81
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/436099
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously, this field is used to optimize the code.
When we receive the capsule cmd pdu, we need to allocate
the related buffer, if there is read or write request.
If the related buffer is not valid, then we cannot enter
the next pdu handling phase. So we use this field to mark.
After carefully checking the code, I think that we use
the tcp_req which is assoicated with the pdu, thus it is
efficient.
Change-Id: Ic1634d706dd40a706269bce199bf6031ea0462c0
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/435995
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>