This is a preparation to support shared IO CQ case, and we will
create/delete SQ/CQ separately, so define the queue state as the
first step.
Change-Id: Ie7b5807dc4aa5a2c117e15f61f3a9baa60135653
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10529
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add dtrace probes aroung qpair/controller/subsystem management
to help with debugging issue #2055.
Change-Id: I0b981bffadee3fe4172ad6916c059bf357959dde
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10237
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Chaining may be faster, but this is really an implementation detail of
the idxd driver. Push the decision on how to implement a vectored crc
down into the individual drivers and eliminate it from the generic
framework.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Iedbdc5a6dbd3f7d1674d0a83f6827588f4b6b2fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10291
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This uses a batch with the fence flag for now. There are several other
implementation options that will be explored in the future.
Change-Id: I4f344d671400508de05f80b026d42f775c5b9588
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10289
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Compare two scattered memory regions
Change-Id: I6ce5c9e7bc1ee1ef0e9173c00e86628d43a1e41f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10287
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Each portion of the discovery log has a header which
includes a 'genctr'. This number indicates the
current generation of the discovery log. If this
number changes during the process of fetching the
discovery log in multiple chunks, wait for the
current fetch to complete, but then start over.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5f8623593b7f935eecc37a98daf92e7d8c0dd566
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10813
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Return if outstanding_commands > 0. This reduces
indentation for the rest of the code in the
function and simplifies the diff for an upcoming
patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I49f09eff7c0908829819e6b797c922211c56e7db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It makes it easier to read the logs, as the state values are printed as
integers.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I70a9e8860401c18e9305a5fc5771df0bc564d337
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10800
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds support for using zero-copy operations to execute IO
requests in the TCP transport. Of course, they're only used if the
underlying bdev supports them. Additionally, only requests with no
in-capsule-data can be executed using this mechanism.
Added several new states to accommodate for the difference in a way
zero-copy is handled. Also, these flows very depending on the type of a
request (read or write). It stems from zero-copy semantics: to perform
a write we need to wait for zcopy_end completion, while for reads
zcopy_end can only be submitted once we send all of the requested data
to the host.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie02b494c24bc1acc98557cb4b02e867abf9064e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10796
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Zero-copy requests are kept on the outstanding queue for the whole
duration of the request - from the initial zcopy_start submission to the
completion of zcopy_end. This means, that there's a period in which a
request doesn't wait for a completion from the bdev layer, but is still
on the oustanding queue (after zcopy_start callback, before zcopy_end
submit). If a qpair gets disconnected while a request is in this state,
we need to manually force its completion, as otherwise it might hang
indefinitely (e.g. waiting for host data).
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I53731b8e363b725efa564ca3c7d89b46f5fb2a24
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10793
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
The zero-copy requests can also be queued when a subsystem is paused, so
we need to properly resume and submit them by using zcopy_start.
Since only requests that haven't received the zero-copy buffer (i.e.
before zcopy_start was called) can be queued, we don't need to bother
with checking zcopy_phase.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie629688f6961eb2ae05741df496720b91be4d80d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10792
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If there is still some inflight IO which prevents
vhost_session_stop_done(), stop_poller can try
within 4 seconds, and then call vhost_session_stop_done
with -ETIMEDOUT.
This can avoid endless blocking in ctrl pthread if there
is no response from vhost session or its backend bdev.
Then spdk vhost target can still serve all other vhost
devices and operations besides the error one.
Change-Id: I2fc78b4da926c936a2e42dc0e66ce1c60001330d
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10393
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It ensures that we decrement io_outstanding counter for requests for
which zcopy_start failed. Also, removed a note stating that such
requests are reverted to regular IO path, as this is not the case.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I8eaee88abfd94b73614b367fef9bb938c9962617
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10791
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since spdk_bdev_zcopy_end() cannot really fail (it only fails if we pass
a bad bdev_io), we can simplify the nvmf zcopy_end functions by making
them void and always expect asynchronous completion.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6e88ac28aba13acadea88489ac0dd20d1f52f999
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10790
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since this path now supports sending zero-copy, use it for zcopy_start.
Additionally, it makes it possible make zcopy_start void, as it reports all errors
asynchronously via request_complete(), and remove some of the duplicated
error checks.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I41f43ce1651432d9a7d74e3680d4a3f780128a1d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10789
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If a request gets resubmitted and is completed immediately (i.e. the
processing function returns SPDK_NVMF_REQUEST_EXEC_STATUS_COMPLETE), the
upper layer needs to be notified via spdk_nvmf_request_complete().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I236d6e68d1d7d83afdaa30d8bb07e2b133f43155
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10788
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Additionally, the NVMe completion status is now updated and the IOs are
queued if the bdev layer doesn't have enough IO descriptors. It makes
the zcopy operations behave similarly to the other IO operations.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I455ae781e32aa6e60d144d2c91f109bd8be46664
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10787
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It makes their names consistent with the bdev API.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I314051f0980b46959d6560aa25885f13b4c28f2a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10786
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The zcopy_start requests are now executed through
nvmf_ctrlr_process_io_cmd. It makes the zero-copy share checks with the
regular IO path.
Note, that zcopy_end doesn't utilize this path and is directly submitted
to the bdev layer, as it doesn't need to perform these checks (they were
already verified in the accompanying zcopy_start).
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic9ffecb15e7c351a9d60e731cc711d6500b845db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10785
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will allow the zero-copy requests to share more code with the
regular IO path.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Idb57447694903a56d984f95aa2f1a3bddc3e4e82
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10784
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It will make it possible to submit zero-copy requests through
spdk_nvmf_request_exec().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibc14fe77cd477b11ed55d1350a7486caaad81add
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10783
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The code should never reach these functions for requests using
zero-copy.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If9f30e05a43b340a982604d5b985242d63ce252b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10782
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It's more descriptive that way, as it's clear the function works on a
single request. Also, passing a request instead of zcopy_phase makes it
more convient to use.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If4d7b087511e128f3590ac7b3b5adcb8ace12003
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10781
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It makes it possible for the user to specify whether a transport should
try to use zero-copy to execute requests when possible.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I40a92b0d7a6707f4c9292795f380846acb227200
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10780
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These set_level/get_level functions and REGISTER(log),
have nothing to do with log_flag.c, why we put them there.
I think we might as well put them to log.c.
And "define MAX_TMPBUF 1024", repeated.
Change-Id: I5ade71b923d61446a5f81f0d2f26fdc4a3057f02
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10923
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We already use destructor functions in env_dpdk to do
some cleanup at process exit, so let's also add one
to call rte_eal_cleanup. This ensures all hugepage
files are freed before the process exits.
Fixes issue #2267.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8c93f503d77f35717b3d18a63ea49b31789dbc00
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10983
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Let user pass a name of tracepoint group. Currently the only way to
enable traces with '-e' option is to pass the tpoint mask, which is
cumbersome. This patch modifies our API to accept strings as parameters.
Example:
-e nvmf_tcp:2,thread
enables nvmf_tcp's second tracepoint and the whole thread tpoint group.
Modified spdk_trace_enable_tpoint_group() - it will be also used in
the changed form later in the series to accept tpoint mask when using
RPCs to activate/deactivate traces.
Change-Id: I6b02363cce3b44b0b578877bc2505f5a4e2fffdd
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10818
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Make trace_create_tpoint_group_mask() an external function.
This is going to be used in following patch.
Change-Id: I06cd1652bb30abddd49536bc76ec134a01121537
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10830
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
we don't remove the socket fd from socket group when
nvmf_tcp_poll_group_add() return error, and when
closing the socket there is an assertion.
This was found via llvm_nvme_fuzz via TCP transport.
Change-Id: Ib4ab6fc3fc5e2bc6a9545f6ce854bae8f1157fd5
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10849
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If blobs held in a blobstore are opened a lot, lookup
by RB_TREE will be much more efficient.
Change-Id: I7075b95c597a958e7bb10890f803191309532021
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10917
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
We recently improved qpair disconnect process and added assert
if we get a completion without any error when a qpair is disconnected.
However unexpectedly we saw this case very often when we ran the test
test/nvmf/host/multipath.sh for the real hardware in the test pool.
So we remove the assert and change the ERRLOG to INFOLOG.
Fixes one of the issues in #2300
Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Change-Id: Iedbf7e0afa5025da6a810043ba95348ba5b856b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10901
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We have very frequent failures when we run test/nvmf/host/multipath.sh
in the test pool.
Call stack showed nvmf_stop_listen_disconnect_qpairs() accessed
qpair->ctrlr even if qpair->ctrlr was NULL.
nvmf_stop_listen_disconnect_qpairs() did not check if qpair->ctrlr is
not NULL before accessing qpair->ctrlr->subsys.
When a qpair is added to a poll group, qpair->ctrlr is cleared to NULL.
The test code test/nvmf/host/multipath.sh executes multiple
reconnects for path error.
So a conflict might occur between adding a qpair to a poll group and
disconnecting a qpair in a poll group.
In this case, it may be acceptable even if we disconnect a qpair whose
qpair->ctrlr is NULL. It will be better than SIGSEGV.
Fixes one of the issues in #2300
Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Change-Id: I308fcb886dd410d01e3361c1850dec9a8eacbccf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10860
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If nvme ctrlr is resetting or initializing, free_io_qids
bitmap is already freed or not created yet. In that case
an attempt to create IO qpair leads to segmentation fault.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I6a97bf81d5a568db20d23b3f88cf01e994ba42e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10827
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
In current implementation RDMA qpair is destroyed right after
disconnect. That is not graceful qpair shutdown process since
there can be requests submitted to HW and we may receive
completions for already destroyed/freed qpair.
To avoid this, only disconnect qpair in ctrlr_disconnect_qpair
transport callback, all other resources will be released in
ctrlr_delete_io_qpair cb.
This patch is useful when nvme poll groups are used since in
that case we use shared CQ, if the disconnected qpair has WRs
submitted to HW then qpair's destruction will be deferred to
poll group.
When nvme poll groups are not used, this patch doesn't change
anything, in that case destruction flow is still ungraceful.
However since CQ is destroyed immediately after qpair,
we shouldn't receive any requests which point to released
resources. A correct solution for non-poll group case
requires async diconnect API which may lead to significant
rework.
There is a bug when Soft Roce is used - we may receive
a completion with "normal" status when qpair is already
disconnected and all nvme requests are aborted. Added
a workaround for it.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0680d9ef9aaa8737d7a6d1454cd70a384bb8efac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These definitions will be used in the next patch to check if
device is rxe
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Icc073344103991ff24fc3bb88a1ceb9867de6f6a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10727
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This patch aims to introduce a change in enabling
tracepoints inside SPDK. Currently every hit tracepoint will
be stored inside an internal buffer, what is inconvenient when
looking for certain information (eg. starting IO to record
some tracepoints, stopping the IO and having the tracepoint
buffer flooded with irrelevant information before copying the
contents connected with IO operations).
The tpoint mask option (-e) has been extended with ':' character.
User may now enter tpoint mask for individual trace points
inside chosen tpoint group.
Example: "-e 0x20:3f", where "0x20" stands for tpoint group,
':' is a separator and "3f" is the tpoint mask.
Change-Id: I2a700aa5a75a6abb409376e8f5c44d5501629877
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10431
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Some devices may support SRQ depth lower than defaulut
value 4096
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I58da0ac268a6d4c4a7e3b500ae37b8fad4810e17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10654
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Otherwise, this field is left unassigned and the host receives some
garbage cid.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If1e1fe8c7543bcedfbb897200696e05b71c57e0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10770
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When aborting a request in a NEED_BUFFER state, we set it's completion
status and remove it from the pending_buf_queue. Since it's no longer
on that queue and there's no completion it's waiting for, we need to
manually kick.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1272d441aec3b3090cd8c143a2112a8a6866fcf0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10769
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The tracepoint passes qpair pointer as an argument, while not specifying
it in its definitions, which makes the following assertion to fail:
trace.c:83: _spdk_trace_record: Assertion `0 && "Unexpected number of tracepoint arguments"' failed.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I315e9bf0465db7033ac0f1169536c459ac4e9250
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10761
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Currently if we remove a listener from a subsystem, we
disconnect *all* qpairs that have the same transport ID
as the listener being removed.
Fix that, since we should only disconnect qpairs from
controllers associated with the subsystem that had the
listener removed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6cf7422d14f23bf02ba6c4b034b172870694b3e6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10690
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There is situation that num_extent_pages is zero and original pointer is
also NULL, the realloc() could return a Not NULL pointer.
Related UT has been added and updated.
1) In the default allocation (num_clusters == 0), the extent_pages is not allocated as expected.
2) In the thin provisioning allocation (num_clusters != 0), the extent_pages will be allocated if extent_table is used.
More related information as below:
The crux of the problem is that according to POSIX:
realloc: "If ptr is NULL, then the call is equivalent to malloc(size)"
malloc: "If size is 0, then malloc returns either NULL or a unique pointer value that can later be successfully passed to free"
blobstore was relying on realloc(NULL, 0) always return a unique pointer value, and not NULL. This is not portable behavior.
Change-Id: Ibc28d9696f15a3c0e2aa6bb2371dc23576c28954
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10470
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This id can be used as the 'portid' for discovery
log entries. Previously we were putting the entry
index in the portid field which was incorrect.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f373585fe671ba7e69eb8e07f603f8e8ac1e270
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10589
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This API is a helper for getting the full discovery
log page from a discovery controller. It will read the
log page header to get the total number of entries,
allocate a buffer for all of the entries, and then
issue a series of get_log_page commands to read each
4KiB worth of entries.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I02666ef5adcb9fc8825a221655811ace708f97b8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10564
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nvme_ctrlr_construct_namespaces
We can just reallocate here to be more efficient.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I8cfc87da23aee6c05ff83aea2165683dddba1dbd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10688
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In the one place this was called, we can call nvme_ns_construct
instead. There's no harm in re-fetching the identify pages.
Change-Id: I91292ff9650bdc7edd5588a05837b671dcac1922
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10102
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
No code changes. Move these up so they can be used by some of the
regular command submit paths in future patches.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib8e54d47f7df35771b6c89d7c49d5182cae79e47
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10285
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_ioviter_next will walk through two iovecs and yield pointers
to common length segments. For example, given a source iovec (siov) with
4 1KiB elements and a destination iovec (diov) with 1 4KiB element, the
following will happen:
first spdk_ioviter_next:
src = siov[0].iov_base
dst = diov[0].iov_base
len = 1KiB
second spdk_ioviter_next:
src = siov[1].iov_base
dst = diov[0].iov_base + 1KiB
len = 1KiB
third spdk_ioviter_next:
src = siov[2].iov_base
dst = diov[0].iov_base + 2KiB
len = 1KiB
fourth spdk_ioviter_next:
src = siov[3].iov_base
dst = diov[0].iov_base + 3KiB
len = 1KiB
fifth spdk_ioviter_next:
len = 0
This is a useful utility for performing operations where both the source
and destination are scattered memory. As an example and a test vehicle,
spdk_iovcpy has been updated to use this internally.
Change-Id: I7e35e76d38e78d07ea1caf6282d0dfc02182aa83
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10284
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The host may have specified a hostnqn to use to connect to
a discovery ctrlr, so we can't just use the default ctrlr
opts to connect - we need to call the probe_cb (if there is
one) to get any options that the host may have specified.
Tested by using discovery_aer tool, creating a subsystem and
listener, and then adding a host on the target side that matches
the hostnqn specified to the discovery_aer tool.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07266e984e0094d3a768e6a0d5ea3a3bd71e32ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10547
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In NVMF Revision spec 1.1a, discovery log should be updated
when removing hostnqn of subsystem.
Update unit test to check the discovery log when removing
hostnqn and destroying subsystem.
Signed-off-by: Peng Lian <peng.lian@smartx.com>
Change-Id: I51c597a2493295a677a7aa68e4f13a887f7e1140
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10668
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Macro PCI_CFG_SPACE_EXP_SIZE isn't defined in kernel 4.9.x, so
we use a fixed value here instead.
Fix issue #2282.
Change-Id: Ife1444e919e4c1dc9328437002c15befe2f393e7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10691
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This change implements mechanism to allow user to
define multiple tpoint masks separeted with a comma
(e.g. 0x400, 0x8).
This is going to be used in the next patch to implement
enabling of individual tracepoints inside a tracepoint group.
Change-Id: I963f89684aa62b6e1dde57e22ddf835aa2c89f05
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10536
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
it is possible that some specific transport doesn't support
NVMF_MAX_ASYNC_EVENTS (although it is a recommended value by spec)
with that change it is possible to reduce aerl on transport specific
layer so it can be advertised correctly during identify controller
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ife6465b5324fb39f9b343c6f42b860e9dd1164b2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10422
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Not every transport requires accept poller - transport specific
layer can have its own policy and way of handling new connection.
APIs to notify generic layer are already in place
- spdk_nvmf_poll_group_add
- spdk_nvmf_tgt_new_qpair
Having accept poller removed should simplify interrupt mode impl
in transport specific layer.
Fixes issue #1876
Change-Id: Ia6cac0c2da67a298e88956734c50fb6e6b7521f1
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7268
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
On aarch64 platforms, doorbells update from guest VM may not be seen
on SPDK target side. This is because there is memory type mismatch
situation here. That is on guest VM side, the doorbells are treated as
device memory while on SPDK target side, it is treated as normal
memory. And this situation cause problem on ARM platform.
Refer to "https://developer.arm.com/documentation/102376/0100/
Memory-aliasing-and-mismatched-memory-types". Only using spdk_mb()
cannot fix this. Use "dc civac" to invalidate cache may solve this.
Profiling data did not show big performance degradataion.
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: I9a18718f8c4307b3007b18c32ab02e6796548958
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10222
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This RPC lists all PCI devices attached to an SPDK application. Each
device is identified by a BDF and contains a buffer with a copy of its
config space.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I852f421fde105d975458f8e63b8da4f92ed2c69b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10652
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
This function serializes a buffer as a hex string.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I09ab93bc626f6f6543b7c1ef033bcf807050862a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10651
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
These APIs are not safe, since they do not hold the
pci device lock across calls, which can cause problems
if a device is inserted or removed while handles
returned by these APIs are being used.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I01a80f26d0a0ca4cdfc7181359932b38da8dd43a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10659
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This is a safer alternative to spdk_pci_get_first/next_device,
since those APIs do not hold the lock between calls.
Future patches will remove those APIs, and change callers to
use this new API instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I71c7e8c1feb9112da8be32a8056b30e105e30463
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10655
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
If a subsystem has no listeners, then there is no need
to update the discovery log when adding a host, or setting
a subsystem to allow all hosts.
This eliminates some unnecessary discovery log update
notifications, especially when setting 'allow any hosts'
on a subsystem immediately after it is created (and before
it has any listeners).
Update unit test to check the adding a host to a
subsystem without listeners does not rev the genctr.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I63dab5df564269e574bb925890088f52063aa378
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10546
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The discovery log isn't updated when a subsystem is created
or deleted, it's only updated when a listener for a
subsystem is added or removed.
So remove the nvmf_update_discovery_log() in the subsystem
create and delete paths. They just generate extra AER
completions that potentially cause the host to do unneeded
work.
Note that if a subsystem is deleted with active listeners,
the subsystem delete path will remove each of the listeners
before deleting the subsystem itself. So the discovery log
will still get updated when those listeners are removed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id01bbfa3b24d3e1279a614a2fd60be41387a03b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10545
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The NVMe bdev module enables asynchronous IO QP creation by default, after
calling `spdk_nvme_ctrlr_alloc_io_qpair` and `spdk_nvme_ctrlr_connect_io_qpair`,
the queue pair is in connecting state at the beginning, then users may call
`spdk_nvme_ctrlr_free_io_qpair` immediately, and the common layer will
change queue state to NVME_QPAIR_DISCONNECTING and NVME_QPAIR_DESTROYING,
so in function `nvme_pcie_ctrlr_delete_io_qpair` the workaround to wait
for create cq/sq callbacks will not be called, instead of using the common
layer queue state here, we should use the internal `pcie_state`.
Fix#2245.
Change-Id: I801caf26563464b135035bf7fa2f63def13de9f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10445
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Batching will be made available for DSA specifically through the new
idxd_perf tool.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic51d9ad3692074805b1ffa705cea8be35737c778
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9846
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The NVMe bdev module will support two features, delayed reconnect and
delete after multiple failures of reconnect to improve error recovery.
The recently added two APIs, spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async(), were not good enough.
spdk_nvme_ctrlr_reset_ctx was not necessary. It had only a pointer to ctrlr.
Using a pointer to ctrlr directly saves us from undesirable malloc error
processing.
Separate spdk_nvme_ctrlr_reset_async() into spdk_nvme_ctrlr_disconnect()
and spdk_nvme_ctrlr_reconnect_async(). spdk_nvme_ctrlr_disconnect()
disconnects ctrlr including disconnecting adminq.
spdk_nvme_ctrlr_reconnect_async() moves the ctrlr state to INIT.
Then rename spdk_nvme_ctrlr_reset_poll_async() by
spdk_nvme_ctrlr_reconnect_poll_async().
Finally deprecate spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async().
The following patches will change the NVMe bdev module to use these new APIs.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id1d6858dcdc5fc2e9db0a6ebf3f79cab4f9bbcb7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10091
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Future patches will remove some of the more complex conditions
between different configure flags. As a result duplicate entries
might be present in DPDK_LIB_LIST.
Just for tidiness of the DPDK linker args, the DPDK_LIB_LIST_SORTED
is added. Using sort function removes duplicate entries in the list.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I318fd0cebbd30a80d281175b7d48bb3249abb841
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
This library was never used, remove it.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5d0255f4b9ddbe98b349b4253f87e5332fe7057f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10526
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
SPDK supports only maintained LTS versions of DPDK.
For SPDK 22.01 this means DPDK 20.11 and 21.11.
This patch removes paths for earlier versions of DPDK.
There is no need to check if library is present for the following:
- rte_telemetry was added as rte_eal dependency in DPDK 20.05
- rte_kvargs was added as rte_eal dependency in DPDK 18.08
- rte_pmd_aesni_mb, rte_pmd_isal, rte_pmd_qat were removed in DPDK 20.11
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I30c4cdb0fe0634db50bc34d7d6c232806ff49960
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10525
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
At this time raid5 bdev does not depend on rte_hash in any way.
Meanwhile NVMe-oF Fibre Channel transport does.
This patch reflects that in the mk file for env_dpdk.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I2ba3e016337866f80fc7a6043cef87bf33cf2373
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10523
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The nvmf library will use INTEL VID/SSVID/IEEE values by default,
each transport can overwrite them if needed.
Change-Id: I9dad521c4d080b6f0cc1aaeb4b5d5f6863c6846d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10095
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Also we don't treat exceptions when getting INTEL log pages
as a fatal error, the initialization will still contine.
Change-Id: Ic2fd2be510fde2679c1546482934d0a180266936
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10341
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When NVMe passthru command (IO or admin) fails on submission (e.g. it
is not supported), set DNR bit in completion status field. There is no
sense in retrying the command in this case.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I55960c128bd9fc31f6defef0b9832259a71684b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8578
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If NVMe admin passthru command is not supported by underlying bdev,
set status code in NVMe completion to INVALID_OPCODE.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I29c4e1f8263b76b27c199cfd2d9b2474432ec70b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10517
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The originally detected problem is that SPDK NVMf target fails command
with invalid opcode with status code INTERNAL_DEVICE_ERROR instead of
INVALID_OPCODE. All unknown commands on IO queue are passed to
underlying block device layer as NVME_IO type. It is not checked if
this type of commands is supported and, when command fails,
INTERNAL_DEVICE_ERROR is set as status code. If command fails on
submission, status code is set to INVALID_OPCODE which is more
relevant.
This patch adds check if command type is supported to
bdev_nvme_*_passthru functions. If not supported, it is failed with
ENOTSUP.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I4d7f7639da17dd3b1dc3eee7eb1b4a4f876117a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8567
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
SPDK shouldn't use `free` to free the memory allocated by rte_zmalloc_socket.
Otherwises, the vhost-blk/scsi will continuously crash.
In this patch, SPDK don't free the dpdk allocated memory,
DPDK will free it finally. Add a flag to indice the resubmit handle.
Change-Id: I85fd84b7d27a091830006a0f84d541c48290cbb3
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10383
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Move the CONFIGURE_AER state before SET_KEEP_ALIVE to
make sure that we run the CONFIGURE_AER state for
discovery controllers.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia4e24f6507c43e3fece06b9161ff8e0b8fa0e97d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We will actually run the CONFIGURE_AER state for
discovery controllers in a future patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib114beb886ab4b9214e4525479eb5ec7e038e5d9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10331
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
in case of failure groups shall be destroyed
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I8933f4128a7a3361bbb55d6a9c08a540521e5bda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10435
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Doorbell offset starts from 0x1000 is defined by the NVMe
specification, so rename it to remove `VFIO_USER` prefix.
Change-Id: Ie34b12b3d2618f9b0ad0cf7ccbb103ad2c900f47
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10364
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Calculate supported maximum number of queue pairs based on
BAR0 size, this value isn't allowed to change at runtime, also
define BAR4/5 based on number of MSIX vectors.
Since the maximum number of queues is a large value(512), so we
still define a default value when starting, users still can
overwrite this value with a number no greater than 512.
Change-Id: I1b4b6bdf2ff9d129c8bdd493ffdf0a51f8772d51
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10334
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
libvfio-user will save a copy inside the library.
Change-Id: If7bb052b03fb92e46abe50fa945b812d149ef01d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10363
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This reverts commit d9561c444f.
This patch is incorrectly iterating the CPU mask assuming it is
contiguous. However, rather than fix it, let's just let the kernel
scheduler place the thread where it thinks is best. It's going to prefer
idle cores anyway. So reverting is the simplest way forward.
Change-Id: I7b66cce7bfb6ddb108aa7576f508aa3b02b79138
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10475
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These can produce a lot of output, which doesn't really give any
additional information.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I572cd203d61c717ce6400f67ef27ec1d7bb54c0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10414
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
adding transport to tgt should be the last step
also there is an issue before change i.e. if calloc failed then
transport remains on the list
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Iaf29cfc7b0f535d40160c6fdf9ef6a7e6bfb127c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10429
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For error case, just set ctrlr->num_aers to 0, and
then the loop won't execute at all. This avoids an
extra call to nvme_ctrlr_set_state() and simplifies
the code a bit.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iff7bbf6e03d18b5f553b9e8527b4c803db583917
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10330
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Discovery services using the SPDK nvme driver may
use long-lasting connections that detect AER completions
to determine when there are changes in the discovery
log. This means that we still need to send keep alives
on discovery controller admin queues. So move the
SET_KEEP_ALIVE_TIMEOUT state immediately after
IDENTIFY, and run the SET_KEEP_ALIVE_TIMEOUT state
even for discovery controllers.
Note, we need the IDENTIFY's KAS value to properly
set the keep alive timeout, so we have to keep the
IDENTIFY state before SET_KEEP_ALIVE_TIMEOUT.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c6403c28fb72d42629c5f9009a89c4bfd44d162
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10329
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Keep alive is valid for discovery controllers, so don't overwrite
the value requested with zero in nvme_fabric_ctrlr_scan().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7dcda6ebf4ab1c8a9085e4e3a02b814d8e586a97
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10328
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Next patch in series will look up the struct spdk_sock_group_impl
from spdk_sock_group in spdk_sock_get_optimal_sock_group().
Since this is third place it will be used, make this function common.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I43d6472016782e78709c1d52aa74abf594e5bfe6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10347
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When no optimal poll group exists for a qpair,
assignment for round robin happens in spdk_nvmf_tgt_new_qpair().
RDMA transport implments the logic for this assignment in
nvmf_rdma_get_optimal_poll_group().
TCP relied on the spdk_nvmf_tgt_new_qpair() instead.
This resulted in race condition when looking up and assigning
optimal poll groups - see #2113.
To remedy that, TCP now follows the same pattern as RDMA.
Next patch will improve the sock map lookup to fix the #2113.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I672d22ac15d06309edf87ece5d30f8e8d1095fbb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10270
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There is a bad memory corruption where the code for CUSE attempts to write
one byte (with value 0x1) after the memory is freed.
Context:
When the CUSE device is unregistered, the poller thread is signaled with
fuse_session_exit(), which writes the value 1 to fuse_session::exited.
The poller thread then detects with fuse_session_exited() that it must exit
the routing and finally destroys its own fuse session with
cuse_lowlevel_teardown() before it exits.
However, FUSE may also call fuse_session_exit() for its internal purposes.
I'm not sure exactly under what conditions that happens, but I added
some trace messages and I could clearly see that the CUSE thread exits
before it was requested to exit in cuse_nvme_ns_stop().
If the poller thread early-exits, it would destroy its own FUSE session
(and free the memory) before fuse_session_exit() gets executed, causing
the memory to be corrupted with a single byte of value 0x1.
Reproducer:
The bug can be reproduced by resetting the FUSE session to NULL after it
is destroyed. This will cuse_nvme_ns_stop() to crash with a segmentation
fault in fuse_session_exit() because it tries to access a NULL pointer.
static void *
cuse_thread(void *arg)
{
[...]
free(buf.mem);
fuse_session_reset(cuse_device->session);
+ cuse_device->session = NULL;
pthread_exit(NULL);
}
This fix:
The fix I suggest is to destroy the FUSE session with
cuse_lowlevel_teardown() after the thread is joined.
Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
Change-Id: I47202891a358f139506845110b012f840974b6fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9931
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_nvme_ctrlr_free_io_qpair can be called when
qpair is already disconnected. In that case qpair's
state is changed to NVME_QPAIR_DESTROYING and
transport's ctrlr_delete_io_qpair callback is
called. RDMA and TCP transports call
nvme_transport_ctrlr_disconnect_qpair in
the callback and since qpair's state is
not DISCONNECTED or DISCONNECTING, qpair
is disconnected for the second time.
If spdk_nvme_ctrlr_free_io_qpair is called
when qpair is in ENABLED state than nothing
changes, qpair will be disconnected before destroy.
PCIE/vfio_user don't implement transport disconnect
callback, so they are not affected.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I23e11856ecafb51669acf4a3118be049c11eecda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10326
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I58794b7b946eeb8ff82512905af0a296e3b534aa
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9817
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For DSM command, the NVMe drive may take a long time to finish it,
if we set a small timeout value for DSM command, the bdev/nvme module
will try to reset the IO queue pair when timeout happens,
in `spdk_nvme_ctrlr_free_io_qpair`, we will abort the outstanding
IO requests first, then in the `nvme_pcie_ctrlr_delete_io_qpair`,
we will poll the CQ for any requests that have been completed by
the NVMe controller, if there are NVMe completions in the CQ,
we will finish them again, thus double completions happened.
Here we rename `nvme_qpair_abort_reqs` to `nvme_qpair_abort_all_queued_reqs`,
so the common layer will just abort queued request, and let each
transport to abort outstanding requests case by case.
Fix#2233.
Change-Id: Icae6214239160c615418cb514fc51cfe77b59211
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10233
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add a parameter which determines the owner of the
map - target or initiator. It allows to set different
access flags when creating Memory Regions
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0016847fe116e193d0954db1c8e65066b4ff82bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10283
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The unit test test_nvme_cuse_stop() manually creates 2 cuse devices
and executes nvme_cuse_stop(). Problem is that the Fuse session is
never initialized for those 2 cuse devices, causing cuse_nvme_ns_stop()
to access 'ns_device->session', which is a NULL pointer.
This bug is detected by ASAN as follows:
==77298==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000180 (pc 0x7fdac6d7d40e bp 0x000000000000 sp 0x7fff74768320 T0)
==77298==The signal is caused by a READ memory access.
==77298==Hint: address points to the zero page.
0 0x7fdac6d7d40e in fuse_session_destroy (/usr/lib64/libfuse3.so.3+0x1640e)
1 0x40dc7a in cuse_nvme_ns_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:851
2 0x40df59 in cuse_nvme_ctrlr_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:923
3 0x40f103 in nvme_cuse_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:1094
4 0x415803 in test_nvme_cuse_stop /home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut.c:393
5 0x7fdac724c1a6 (/usr/lib64/libcunit.so.1+0x41a6)
6 0x7fdac724c528 (/usr/lib64/libcunit.so.1+0x4528)
7 0x7fdac724d456 in CU_run_all_tests (/usr/lib64/libcunit.so.1+0x5456)
8 0x415a4e in main /home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut.c:415
9 0x7fdac62351e1 in __libc_start_main (/usr/lib64/libc.so.6+0x281e1)
10 0x403ddd in _start (/home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut+0x403ddd)
The fix is to call fuse_session_destroy() only if the fuse session is != NULL.
Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
Change-Id: I41881243227d83e8d1e6b90e72c1b6d62ccd98d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10225
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We allocate from the head, so it's better to free to
the head too for better cache utilization.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c32244f446bd7a1df12eefc81245b3ef7e24070
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10193
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We allocate tasks from the head, so it's better to
free them to the head too for better cache utilization.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I67c23e3d89cda16f94b1770eada5465015ddb6ff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10192
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The NOTICELOGs really clutter the output during
application start - it's better to make these DEBUGLOGs
instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ae37d5d057d7b972017befbc0834de414b9710b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10191
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This limitation doesn't really take effect currently,
since the typical number of slots per channel isn't
bigger than MAX_COMPLETIONS_PER_POLL. But there's
no reason for this limit anymore - we should always
poll as many completions as we find.
It's better to remove this now, in case we have
configs in the future with higher number of slots
per channel.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5d90d41e5142622b79d9765fbc62da1516e2b8be
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10189
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_bdev_io_get_nvme_status() does not follow NVMe abort command
about return values.
NVMe abort command sets completion status to SUCCESS both for success and
failure cases and differentiates only the bit 0 of cdw0.
lib/nvmf do not use spdk_bdev_io_get_nvme_status() but checks only
success or failure at completion.
So there is no issue now but let spdk_bdev_io_get_nvme_status()
follow NVMe abort command. In future, the user of spdk_bdev_abort()
may use spdk_bdev_io_get_nvme_status().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic4a08056bd8a1aee4c400f72ef5de7c68e23990b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9977
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
The pointer to struct spdk_nvmf_ctrlr is used to save mandatory
controller registers to the migration region.
Also rename some ctrlr/qpiar to vu_ctrlr/vu_qpiar.
Change-Id: Ifcb862bf4543a9df3c62d3b0a57b7f93228ccaba
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7629
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
1. use the transport lock to protect transport endpoints list.
2. don't use mixed errno and -1 as the return value, use -1 for all error cases.
Change-Id: I657fa06a6d82ee8dbeefaa3397df2285ba574b80
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9579
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When destroying controller, we will disconnect each connected qpair,
and in the spdk_nvmf_qpair_disconnect() call, qpair_fini() will also
try to hold the same lock, so existing vfio-user implementation assume
that qpair_fini() will not be called in the same context. Patch
https://review.spdk.io/gerrit/c/spdk/spdk/+/8963 remind me that
vfio-user has this issue. While here, we add one more thread poll
to avoid such issue.
Change-Id: I83b82ddcce3eb54c724291223e794dcb53a08059
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9998
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For live migration support in vfio-user transport, we need to pause
the subsystem when starting migration in source VM, then after
migration, the subsystem is in paused state, when exiting the
application, we will call spdk_nvmf_subsystem_stop() at last,
and existing code will assert this case.
Change-Id: If5214c45973b27f6092c4a6d71ede336e54d89e8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9407
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
The recent changes merged multiple Data-OUT PDUs within the same
sequence into a single subtask up to 64KB.
However, they were not enough.
For a large write operation, the hardware iSCSI HBA host sent an immediate
data whose size was not block size multiples and then more solicit
data through R2T exchanges.
One example for a 64KB write operation was as follows:
host sent SCSI Write with 5792 bytes and F = 1
target replied a R2T
host sent Data-OUT with 15880 bytes
host sent Data-OUT with 11536 bytes
host sent Data-OUT with 2848 bytes
host sent Data-OUT with 11536 bytes
host sent Data-OUT with 5744 bytes
host sent Data-OUT with 12200 bytes and F = 1
The hardware iSCSI HBA host can decide the size of the unsolicited data
but the SPDK iSCSI target can require the host to send the solicited data
whose size is block size multiples.
Hence we merge immediate data to the following R2T data if the immediate
data is not more than 64KB and more R2T data come.
Add another test case to check if the fix works for the above example.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4906b4e1a8b61e08862f4ccc27a6caf165126530
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9708
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This clean up will make the following patches easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1ad288ec16aec69a168e0f3019b68e11132b65c9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9707
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Issue: spdk_top tracked pollers by the poller name string and the
thread_id they are running on. This shows incorrect stats when
multiple pollers exist on the same thread with the same name.
Solution: Added a unique poller id for each poller on a thread and
to allow spdk_top to track pollers by thread_id and poller_id.
Signed-off-by: Michael Piszczek <mpiszczek@ddn.com>
Change-Id: I1879e2afc9a929d1df9e8e35510f0092c5443bdc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5868
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Wait until the namespace is attached, where it does this operation
again. As of this commit it doesn't really matter because it is just
filling in some values in a structure and if it does it twice it's not a
problem. But later when we only allocate active namespaces, we do not
want to allocate the namespace twice.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ie28653b178975d1ca80bf71ca6b5095224f1c5d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function destructs a single namespace and removes it from the
controller.
Change-Id: I4b7b3576beda85c9ddad4e0f2db6d1964fa72b82
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10024
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This was sometimes used as the maximum array index and sometimes as the
maximum count. Make it consistent everywhere and give it a better name.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I518efd99a7d36584624490b0b3497bb6e81ce9ac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10101
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
list
The list should always be null terminated, but add an additional layer
of buffer overrun protection.
Change-Id: Iee31057fdca5ec4a6177615dff5171e5cb07984e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10027
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This quirk was already applied to the 0x0A53 SSD, but is
likely needed on 0x0A54 as well.
Possible fix for issue #2231.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I36a48ab92d1698a411472f714b5108413bbc3c56
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10162
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In the bdev-zone API, there are a few functions that takes a zone_id:
spdk_bdev_get_zone_info(), spdk_bdev_zone_management(), and the
spdk_bdev_zone_append() functions.
The way a zoned application is usually written is that it starts off
by getting the zone report for all zones (zone_id will be sent in as 0),
and then the application will keep the whole zone report in memory.
Therefore, an application usually have access to the zone_id/zslba for
all zones. However, there are cases, e.g. when getting an error on write,
where the completion callback will only have the lba of the write that
failed.
Add a helper function that can be used to get the zone_id/slba for a
given lba. Having this helper in bdev-zone will avoid SPDK applications
needing to provide their own implementation for this.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I978335f87f7d49bc33aed81afcaa6d9f0af8a1e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10180
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
DPDK vhost will call `new_device` when the VRINGs are
queue paired(virtio-net) or all the VRINGs are started.
However, for virtio-blk/scsi, SeaBIOS will only use one
VRING queue, DPDK added a workaround patch to add
`pre_msg_handle` and `post_msg_handle` callbacks to let
devices other than virtio-net to process such scenarios.
In SPDK, we will start the device when there is one valid
VRING, so there is a case that SPDK and DPDK have different
state for one device. For a virtio-scsi device, SeaBIOS will
only start the request queue, and in the BIOS stage, SPDK will
start the device but DPDK doesn't think so. If users killed
SPDK vhost target at the moment, in `session_shutdown`, SPDK
will expect DPDK to call `destroy_device` to do the cleanup,
but DPDK won't do that as it thinks the device isn't started.
Here in `session_shutdown`, SPDK will do this first, it's OK
that DPDK will call another `destroy_device` for devices that
have the same state both in SPDK and DPDK.
Fix issue #2228.
Change-Id: Ib76dd54c8fa302ffe6da9b13498312b7d344bbfe
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10143
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
_stop_session() is called while holding the global vhost lock,
and in the caller we do release the vhost lock, so even for the
error return from device backend, we don't need to release it
in _stop_session().
Change-Id: I08fef64f900bb42ee68bf02b4c4f1406e903a8a6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10142
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
`struct spdk_vhost_dev vdev` in `struct spdk_vhost_scsi_dev` can be
unregistered in `vhost_scsi_dev_remove`, so we can't use it
anymore in other places after `vhost_dev_unregister`.
Ideally `state->remove_cb` should not take the `vdev` as
the input parameter either, but I don't find it's used
anywhere, so leave it unchanged.
==29555==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000006df0
READ of size 2 at 0x602000006df0 thread T0 (reactor_0)
#0 0x7f3c246c0f0a (/lib64/libasan.so.5+0x9cf0a)
#1 0x7f3c246c3c15 in vsnprintf (/lib64/libasan.so.5+0x9fc15)
#2 0xa55cfa in spdk_vlog /spdk/lib/log/log.c:158
#3 0xa5596f in spdk_log /spdk/lib/log/log.c:110
#4 0x842e43 in remove_scsi_tgt /spdk/lib/vhost/vhost_scsi.c:208
#5 0x851508 in vhost_scsi_dev_remove_tgt_cpl_cb /spdk/lib/vhost/vhost_scsi.c:1149
#6 0x8383f1 in foreach_session_finish_cb /spdk/lib/vhost/vhost.c:1144
#7 0x9d3223 in msg_queue_run_batch /spdk/lib/thread/thread.c:703
#8 0x9d73fe in thread_poll /spdk/lib/thread/thread.c:919
#9 0x9d7c3b in spdk_thread_poll /spdk/lib/thread/thread.c:979
#10 0x8812fe in _reactor_run /spdk/lib/event/reactor.c:920
#11 0x881bf1 in reactor_run /spdk/lib/event/reactor.c:958
#12 0x88292b in spdk_reactors_start /spdk/lib/event/reactor.c:1060
#13 0x873ff9 in spdk_app_start /spdk/lib/event/app.c:585
#14 0x408044 in main /spdk/app/vhost/vhost.c:105
#15 0x7f3c23691f42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
#16 0x407add in _start (/spdk/build/bin/vhost+0x407add)
0x602000006df0 is located 0 bytes inside of 8-byte region [0x602000006df0,0x602000006df8)
freed by thread T0 (reactor_0) here:
#0 0x7f3c2473191f in __interceptor_free (/lib64/libasan.so.5+0x10d91f)
#1 0x8369f2 in vhost_dev_unregister /spdk/lib/vhost/vhost.c:1024
#2 0x84f32d in vhost_scsi_dev_remove /spdk/lib/vhost/vhost_scsi.c:913
#3 0x83cdb7 in spdk_vhost_dev_remove /spdk/lib/vhost/vhost.c:1494
#4 0x83ed66 in vhost_fini /spdk/lib/vhost/vhost.c:1644
#5 0x9d3223 in msg_queue_run_batch /spdk/lib/thread/thread.c:703
#6 0x9d73fe in thread_poll /spdk/lib/thread/thread.c:919
#7 0x9d7c3b in spdk_thread_poll /spdk/lib/thread/thread.c:979
#8 0x8812fe in _reactor_run /spdk/lib/event/reactor.c:920
#9 0x881bf1 in reactor_run /spdk/lib/event/reactor.c:958
#10 0x88292b in spdk_reactors_start /spdk/lib/event/reactor.c:1060
#11 0x873ff9 in spdk_app_start /spdk/lib/event/app.c:585
#12 0x408044 in main /spdk/app/vhost/vhost.c:105
#13 0x7f3c23691f42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
Change-Id: I511c4316a838cd92961d57c9193d384acd49d760
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10141
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously task->current_data_offset was updated by add_transfer_task().
However, the following patches will merge unsolicited data and solicited
data into a single subtask. It will be possible that add_transfer_task()
is called but subtask is not submitted. As a preparation, extract
updating task->current_data_offset into iscsi_pdu_payload_op_scsi_write().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5262bb883fa2a081be1f087181de98d4c3c24d69
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9706
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When data segment size is 64KB and data digest is enabled, if
data segment and data digest are split into different two packets,
- pdu->mobj[0] became full first when reading data semgment,
- pdu->mobj[1] was allocated but unused and data digest was read.
In this case, two SCSI write tasks were submitted by mistake and
the second SCSI write task had no data.
Fix the bug in this patch.
When iscsi_pdu_payload_read() is called and pdu->mobj[0] is full,
allocate pdu->mobj[1] only if any of data segment remains to read.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9a0c36c05f90092c3c2122a7eb91e10976830b40
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9965
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We had not considered a case that incoming data to the second data
buffer was split into multiple TCP packets when merging incoming data
up to 64KB.
We do not change the unit test because we already have data check
and it is very hard to include partial read into the data check.
However, it is very usual that incoming data is split into multple
TCP packets. The feature to merge incoming data up to 64KB will be
actually enabled in the following patches. So we rely on the I/O test
to verify this fix.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I50d702d6c118bc16f0767845136e14414ccdf813
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9736
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The internal device list isn't used anywhere, and will cause ASAN
error because we didn't remove the entry from the device list when
destructing controller.
Change-Id: Ie97bf10ca44ff773a8bc5f0476611b3844ef901a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10109
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There is no need to sum SPDK_CEIL_DIV(length, io_unit_size) and
req->iovcnt as the later is always zero (assignment in spdk_nvmf_request_get_buffers).
Checking SPDK_CEIL_DIV(length, io_unit_size) is enough.
Signed-off-by: Maciej Szulik <maciej.szulik@intel.com>
Change-Id: I0fea1d86706d83e4dd9083e6f0ce09e4e4b033a8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10089
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The specification says "host specifies an offset (i.e., LPOL and LPOU)
that is greater than the size of the log page requested, then the
controller shall abort the command with a status of Invalid Field
in Command."
Offset is used (if needed) to retrieve specific records of
Discovery Log Page, so we don't check it for Discovery Log Page.
Change-Id: I76ce929600b9f2ca9b69397d25f339d55729e6d3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10093
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This can be used for multipath validation.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iba62c5e90b22a9a85078fbc783d3a7273c029cde
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10137
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
CAP.CQR (Contiguous Queues Required) is always 1, so we should
return invalid field when PC bit is 1 and return Invalid Interrupt
Vector if Interrupt Vector is too big.
Also fix the issue that just creat/delete a CQ.
Change-Id: I7cc64a5946a1ed3161448fca8b433d08e5fee715
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10080
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
PCI event module currently requires use of SO_RCVBUFFORCE socket option
which is restricted to CAP_NET_ADMIN. Retry with SO_RCVBUF for non-root
(unprivileged) processes where this capability is not available.
Return -ENOSPC if receive buffer is not of sufficient size.
Fixes issue #2224
Signed-off-by: Tom Nabarro <tom.nabarro@intel.com>
Change-Id: I0bed1b1eac0c7e8601d3d172d8027380ec8be391
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10126
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We already provides the API `spdk_nvme_ctrlr_is_fabrics` to return
input controller is fabrics controller or not, but it needs a controller
data structure as the input, so here we add another API to do the same
thing and it takes the transport type as the input, with this change,
both nvme and nvmf library can use the API.
Also we should treat UINT8_MAX(255) as valid fabrics transport type.
Change-Id: Ib62e7d3eca3da1ddb1a4cc55b0b62e274522f1ce
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10059
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This makes it more clear why reading a JSON
configuration file failed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9dae907c839d48068044b09f34b89a2de51cd811
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10057
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
ftl_dev_dump_bands accumulates a total in a local
variable, but the final value never gets used.
So just remove the variable completely.
Found with clang-13.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7a92f6bfa4ae56fc4d8189c887bf0f6d4a05d759
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10055
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
As per the nvme specs,
If OPTPERF is set to ‘1’ indicates that the fields
NPWG, NPWA, NPDG, NPDA, and NOWS are defined for this namespace and
should be used by the host for I/O optimization
Setting NPWA, NPDG, NPDA same as NPWG and NOWS same as MDTS
Fixes#2197
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: Ic769a21b6821fa731eeae83e7d30c380e8092e37
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10007
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
User may configure opts.max_qpairs_per_ctrlr, so use
opts.max_qpairs_per_ctrlr instead of using fixed default
NVMF_VFIO_USER_DEFAULT_MAX_QPAIRS_PER_CTRLR.
Also do not allow users to configure max_qpairs_per_ctrl >
NVMF_VFIO_USER_DEFAULT_MAX_QPAIRS_PER_CTRLR.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: Id9f1f796559f3bb8b2d3f5031606050470681b99
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9994
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We generally shouldn't do ERRLOGs based on bad
inputs from the host, so change some of these to
DEBUGLOGs instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id42a8e51964e0238cd2841be8ac1d4ccaa1d150d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10037
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Details of the changes here:
918fd2f146.
They mainly target the aesni_mb driver which was moved to ipsec_mb and
bump the minimal supported version of the ipsec to v1.0.
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: Ica3b4fd66684751939159511845eb6ac6f7d5205
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10003
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
When doing controller reset and shutdown, we may change the
CSTS.RDY and CSTS.SHN even there are pending IOs in the IO
queues, so here we add a timer in the reset and shutdown
callback, it will change the status when there are no
connected IO queues.
Fix#2199.
Change-Id: I3a54d30b257973661b269ad5e37637490f9390f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9908
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
SPDK nvmf target reports all listeners on all subsystems
in discovery pages, kernel target reports only subsystems
listening on a port where discovery command is received.
NVMEoF specification allows to specify any addresses/
transport types. Ch 5: The set of Discovery Log entries should
include all applicable addresses on the same fabric as the
Discovery Service and may include addresses on other fabrics.
To align SPDK and kernel targets behaviour, add filtering
rules to allow flexible configuration of what should be
listed in discovery log page entries.
Fixes#2082
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie981edebb29206793d3310940034dcbb22c52441
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9185
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
The qpair's state member is only 3 bits of a uint8_t,
and the in_completion_context bit is another bit in that
same uint8_t.
We know that the qpair's state is only ever updated by
one thread, but it is possible that the state could
be modified by one thread, while another thread
is modifying in_completion_context.
in_completion_context is only modified by the thread
that is polling the qpair (or the qpair's poll group).
But with async mode, another thread that has a qpair
on the same PCIe controller could poll its adminq and
reap the SQ completion for the qpair that's owned by
the other thread.
So do *not* set the generic qpair state to CONNECTED
from the SQ completion callback. Instead just set
the pcie_state to READY, and let the thread that owns
the qpair detect the qpair is READY and set the state
to CONNECTED itself.
Fixes issue #2157.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9efc0c954504f1841e1c3890ae78211ad0d1990e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9975
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This reverts commit 3b1f13ef29.
It seems like this particular commit is causing failures on the
CI side related to the following issue:
https://github.com/spdk/spdk/issues/2214
Reverting for now to make the CI stable.
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: I7f32c612b8549477de0b9079d114cec4ec9bda59
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9986
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is a race condition between controller destruction and
subsystem state change, e.g. admin qpair may already be freed
when a namespace is added or removed. As result in function
poll_group_update_subsystem we may get heap-use-after-free error
Another problem is that some qpair's live time may exceed controller's
life time. To avoid it, start controller destruction process when the last
qpair finished the disconnect process (previously controller started
the descruction process before the last qpair starts to disconnect
and it could lead to raise conditions)
Fixes#2055
Change-Id: I11612979eb914a5fdcc7b9e3c812bf1e450b6120
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8963
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Matt Dumm <matt.dumm@hpe.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Some of our tpoints have a name exceeding 24 characters.
Althought this is not problem for SPDK, it might cause
confusion, because error messages are printed.
Tpoints regstered inside fc.c had their _REQ_ part removed,
since it was used in all of them.
Fixes#2208
Change-Id: I598eb9c1d252d8ca6c83f82e564a6b53037936f4
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9963
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
DPDK will report error when detaching the PCI device in secondary
process, because SPDK will return -1 in `pci_device_fini`, so
here we will reset the `attached` flag before that.
Also return the errno instead of -1.
Change-Id: I3efa4d97ceab504215faeb9d3d80a694bdd6014c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7944
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
When intr mode is enabled, it will be common that
poller is unregistered during interrupt processing.
Since poller unregister is a delayed operation,
mark it in spdk_thread object, and reap unregistered
pollers out of poller execution.
Fixes#2143
Change-Id: Ieb61fc7685f85af5c15e833dd1dd56f8c97a3b12
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5770
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch makes use of the changes from previous patch and to
show connections between bdev events and tcp events.
Change-Id: If28c256d74a9a5d581ee4d8292a08fc061fee968
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9622
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Currently we do not have any way to connect traces from different
modules in SPDK. This change modifies our trace library
and app/trace to handle adding relations between trace points
and a trace object.
Additionally this patch adds classes and fields to structs
inside trace.py to prepare it for future patches implementing
printing relation information.
Change-Id: Ia09d01244d923957d589fd37e6d4c98f9f7bbd07
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9620
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
There could be an outstanding for_each_reactor operation
in progress when the application shuts down. If we don't
wait for that operation to complete, we'll get a memory
leak for its context.
So when stopping the reactors, before setting the
state to EXITING, do one final for_each_reactor to
ensure any outstanding for_each_reactor operations are
flushed. Once this final for_each_reactor operation
is complete, we can set the state to EXITING safely.
Also use a global mutex and flag to ensure no more
for_each_reactor operations can start while the final
for_each_reactor is in progress. If a new operation
is requested, we'll just ignore since the reactors are
shutting down anyways.
Fixes issue #2206.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6d86eb3b0c7855e98481ab4164af82e76c009618
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9928
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
nvmf_ctrlr_cc_shn_done() and nvmf_ctrlr_cc_reset_done() are
almost same, so consolidate them together by adding a flag.
Change-Id: Ib7714d31a40f9d8d344ec2630c083c5d76dac8a1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9907
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Currently either HW Engine Channel or SW Engine Channel will be used.
In the case that HW Engine Channel is used while does not support related
operations like IOAT for CRC, it will shift back to the SW Engine's handle.
So that this is an issue that it still refers to the HW Engine Channel
while needs SW Eninge Channel to handle.
This patch introduces the SW Eninge Channel and always initializes there
in case that HW Engine does not support some operations.
Related UT also added to simulate the case the IOAT does not support CRC
and then SW Eninge needs to properly handle it.
Change-Id: I4ecdcd09ab669a616b37c567b45b1e6499800ec9
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
In some cases a single virtually contriguos memory
buffer can be translated to several chunks of memory.
To make such translation possible, update structure
spdk_memory_domain_translation_result to use a pointer
to iovec.
Add a single iov structure or cases where translation
is always 1:1, it will make easier translation callback
implementation. For RDMA transport translation of address
is always 1:1, so treat iovcnt other than 1 as an
error.
Change-Id: I65605575d43a490490eba72c1eb19f3a09d55ec6
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9779
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Instead of a union with domain type specific
parameters, store an opaque pointer to user
context. Depending on the memory domain type,
this context can be cast to a specific struct,
e.g. to spdk_memory_domain_rdma_ctx for RDMA
memory domains.
This change provides more flexibility to
applications to create and manage custom
memory domains
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Change-Id: Ib0a8297de80773d86edc9849beb4cbc693ef5414
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Push operation complements existing pull
operation and allows to implement read data
flow using memory domains.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Change-Id: I0a3ddcb88c433dff7a9c761a99838658c72c43fd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9701
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It's too strict to fail the controller when there are no free requests.
Change-Id: I0a66ff2d294a2fd9326506ea50af4213aaaf8e92
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9848
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
We already support Set Features with Host Behavior, so here also
add the support in Get Features.
Change-Id: I27d973e81fd6be89cc67ad559439334fc1087c9e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9818
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
nvmf_subsystem_get_qpairs RPC handler may cause the program
could not exit normally (e.g. ctrl c). Reason is that,
spdk_get_io_channel() will be called during the getting
qpairs stream, which will add refcount value for each
existing channel. When end target, channel cannot be
destroyed since refcount be added additionally and
its value could not be subtracted to 0. As a result,
the program will hang in the process of exiting.
So here we don't need to allocate a new channel, just use
the exist one.
Signed-off-by: zhaoshushu.zss <zhaoshushu.zss@alibaba-inc.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Change-Id: I08b4678edaa9404b8e8af125ebae572b31edf77e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9881
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
- remove metadata updater
- handle 'zero' flag in mempool allocator
- adapt ocf_mngt_cache_start() to new OCF API
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Change-Id: I34afd856cc1306ffe305f71a445e7474c9b0a2d9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9129
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
OCF requires separate mempool for each request size from rage 4k-128k. Mempools
are meant for both IO and OCF's internal requests.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Change-Id: I3b37d287bbd6f963a24007c8485002f414fb8a7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7774
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Adds context to currently existing traces inside bdev.c file.
These are going to be used later to match with traces from
nvmf layer to enable io tracing.
Change-Id: I599a60412f39144cdd306315a184b2000a61286e
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9497
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
This is to help with binding trace objects together and
for the convenience (all trace definitions are in one place
instad of being scattered accross multiple files).
Change-Id: Ib15bc9c2eeee9c4d0816bcee509ab69f3f558e19
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9574
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Enable tracing of tcp qpairs in lib/nvmf/tcp.c.
Change-Id: I692e74a972dcddd0bff193f1703470926e28b4db
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9288
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
This change aims to help with tracepoint handling in the future.
Instead of printing each trace directly with given values, we
will use this function to pass status change value.
Change-Id: Icc7f2863703899f818f0a2d5f49b69aa4e26a62c
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9690
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
The new name suits better to the following "data push"
operation
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ic3249f65de203f375477f8e87b0749b9502d165c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9878
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For GET/SET FEATURES command, some feature IDs will have data buffer
while some don't have, so here we will return the buffer length base
on the feature ID. The length is defined by the specification.
Change-Id: Ie9798585fc74544b77998aeebc2a20614c1c25f0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9689
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It is a bit confusing for the variable names to not match
the parameter names for spdk_bdev_io_get_nvme_fused_status().
These changes should make it a bit simpler to understand.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I833bf2a75cfec7b86da2d8a234f800de3510c2b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9867
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Add the paired spdk_json_write_named_uint8|16 function
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ib7ee9ae4dbe9a4e9cfa28750f0b9a0af597d260c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9788
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previous patches (5363eb3c) tried to work around the
32-bit unmap and write_zeroes LBA counts by breaking
up larger operations into smaller chunks of max size
UINT32_MAX lba chunks.
But some SSDs may just ignore unmap operations that
are not aligned to full physical block boundaries -
and a UINT32_MAX lba unmap on a 512B logical /
4KiB physical SSD would not be aligned. If the SSD
decided to ignore the unmap/deallocate (which it is
allowed to do according to NVMe spec), we could end
up with not unmapping *any* blocks. Probably SSDs
should always be trying hard to unmap as many
blocks as possible, but let's not try to depend on
that in blobstore.
So one option would be to break them into chunks
close to UINT32_MAX which are still aligned to
4KiB boundaries. But the better fix is to just
change the unmap and write_zeroes APIs to take
64-bit arguments, and then we can avoid the
chunking altogether.
Fixes issue #2190.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I23998e493a764d466927c3520c7a8c7f943000a6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9737
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
seems it is not required, string is used instead enum
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ic67af6cfa6f8e22a6593f904cd5b5c1b51cae6ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9623
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Previously we decided which LUN format is used by the macro constant
SPDK_SCSI_DEV_MAX_LUN. However, as long as we read SAM, the LUN format
can be decided per LUN ID.
Linux host SCSI driver supports 256 LUNs per SCSI device at the
maximum. So we cannot test this fix on any actual system but we
apply this fix for the potential future cases.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifa6a3b66431f5e1eade326348dd99b8b9653408b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9664
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Most SCSI hosts, Linux, Windows, VMware, supports 256 LUNs per
device now, and it is not easy to test even if any other non-free
OS or driver supports more than 256 LUNs.
Hence increase the macro constant SPDK_SCSI_DEV_MAX_LUN from 64 to
256. Then we do not need to expose it publicly now. So move it to
lib/scsi/scsi_internal.h.
Update the CHANGELOG together.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iacde46c3854f326eebfb8befb47d41fce383b027
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9631
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change a fixed size array to a linked list to manage LUNs per SCSI
device.
Keep the linked list sorted by LUN ID because this is necessary to
efficiently find the lowest free LUN ID or check the specified LUN is free.
To avoid traversing the linked list twice, change scsi_dev_find_free_lun()
to return the LUN which comes just before where we want to insert an new LUN.
Additionally, previously spdk_scsi_dev_add_lun_ext() had not checked if
the specified LUN ID was duplicated. Fix the bug in this patch.
Add unit test cases for the function scsi_dev_find_free_lun().
These changes will enable the following patches to increase
SPDK_SCSI_DEV_MAX_LUN from 64 to 256 without consuming additional memory.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7f6f070ddc680127cf86ae255055da2d1d29e4ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9630
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is the same effort as lib/iscsi.
By using spdk_scsi_dev_get_first_lun() and spdk_scsi_dev_get_next_lun(),
remove the dependency on SPDK_SCSI_DEV_MAX_LUN from lib/vhost.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib79d2c3f8e7a5027d3d2c03f8f886c588b86009b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9611
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Use two new public APIs spdk_scsi_dev_get_first_lun() and
spdk_scsi_dev_get_next_lun() to manage iSCSI LUNs by linked list and
to traverse all LUNs of the SCSI device.
By these changes, we can remove the dependency on the macro constant
SPDK_SCSI_DEV_MAX_LUN from lib/iscsi.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I3237e7f887dd761d6812ed4313f87b6e7884ea29
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9610
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
By this change, we will not need to traverse LUN list or tree in the
callback to hot remove.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibe72fba824553d0189b9120884aa2113599a568d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9627
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is an effort to remove the dependency on the macro constant
SPDK_SCSI_DEV_MAX_LUN from lib/iscsi.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ieeff4136b16cc6bfa92614248f150bc9dfe3dc74
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9609
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Add two public APIs spdk_scsi_dev_get_first_lun() and
spdk_scsi_dev_get_next_lun() to remove the dependency on the macro
constant SPDK_SCSI_DEV_MAX_LUN from lib/iscsi and lib/vhost.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6546697f823fe9f4fa34e1161f5c7fa912dd2d59
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9608
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The spec treats the sizes (MQES or qsize from create/delete
IO queue command) as a 0-based value of uint16_t, but vfio-user
treats them as 1-based value, so we need to use uint32_t to
make sure the value can't overflow. The same for NLB(number of
logical blocks).
Change-Id: I7654b7e12234525c0fce78a713dd50097e9b3d58
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9632
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The function is used to check the IO SQ or CQ exists or not,
so return bool type is better and also rename it.
Change-Id: I1af2a25255b012dcf1e1952b828617ededf9b897
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9680
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Andreas Economides <andreas.economides@nutanix.com>
Previously vfio-user will hit an error when creating a deleted IO SQ.
For lookup_io_q() function, actually we should not use the queue's
address to check the queue is exist or not, as when there is a memory
removal, we will also set the queue's address to NULL and reset it again,
so here we use the queue state to indicate the queue is exist or not.
Fix issue #2174.
Change-Id: I56bdd63947c2b30a598c427e64ff2dafe1cc1d2e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9605
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Prior a regular round robin could result in strange performance
if an idxd device from another socket was used.
Signed-off-by: peluse <peluse@localhost.localdomain>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Id863c79067beabe73ef89d92b3fb3c436821b97a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9367
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
For debugging purposes, take a name for identifying fds added to a group.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: If1654e56e19f7fa964446ef1b9e71debf74979d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9731
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Object's statistics (its index and start timestamp) are now tracked and
updated in the trace entry object.
This is the final piece of code that had to be copied from the trace
app. The following patch will remove that code from the application and
replace it with functions from the trace_parse library.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1acaefb5a516bbc59c0549b2f66a5c52368bcd2c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9435
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If5f1b1bea0d30419be46e704ddd7c2f9556b4627
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9498
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tracepoint arguments are now reconstructed when retrieving next trace
entry (possible from multiple buffers). Similarly to previous patches,
the code is directly copied from the spdk_trace app.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie7594d381660ef31c286b5e124d2a6df3835111e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9434
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Detaching a controller with `no_shn_notification` flag set will follow
the regular detach path making it asynchronous too.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0b9c6c23626b4cc1cfaedb3268024776a07b9195
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9003
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The controller detach had asynchronous API (with async/poll), but the
register operations were synchronous, so they would block on fabrics
controllers. In this patch, they're changed to their non-blocking
counterparts, making the detach fully asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I74df12ab40a54f1d675639672e03755c89768bef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8726
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Previously we only use an assertion to address this sceanrio.
Fix issue #2173.
Change-Id: I7a6e715977218d2a3a08c48a9935880f3fe4ec63
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9604
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Andreas Economides <andreas.economides@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
According to the specification, we should also post an AER
error event for this error case.
Fix#2171.
Change-Id: Ifb2343453ea5e36ce244938a939537ee6ed1c4e1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9584
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
POSIX defines PRId64/PRIu64/PRIx64 for printing 64-bit values in a
portable way. Replace a reference to %lu to remove the assumption
about the size of a long.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: I622fd43e7acf2cb93d3ba4ba9e9367e6dd064a74
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9663
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Added a definition of a parsed trace entry and a function allowing for
iterating over these objects. The difference between a parsed and a
regular trace entry is that it includes more information gathered while
processing the trace file (e.g. lcore, object statistics) and provides a
contigous buffer for trace arguments.
For now, only lcore and the pointer to the actual trace entry are
filled. Tracepoint arguments and object statistics will be added in
subsequent patches.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4d5e30a7abb4860a5ba9db46f64ceae8bd14646f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9433
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The trace file is now parsed and the entries are put in a map, sorted by
their timestamps. The code is directly copied from the spdk_trace app,
with very little modifications.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I7929497ffd3079b6974f5423c82a6128db2cee98
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9431
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It gives user access to things like the tsc rate and tracepoint
definitions.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib50126b331faa4508174c7cb707643a3d8db6a01
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9430
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Copied code responsible for mapping/unmapping the trace file. The
only modifications were related with tying it to the spdk_trace_parser
object.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia575101532c612b185bd971c69157623b52b9e81
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9429
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This library will provide functions that parse traces recorded by an
SPDK application. This includes merging traces from multiple cores,
sorting them by their timestamp and constructing trace entries spanning
across multiple buffers. All of these tasks are currently implemented
in the spdk_trace app, so most of its code will be moved here (this is
the reason for using C++).
The motivation for extracting this code to a library is to be able to
use it from places other than the spdk_trace app, specifically the
`scripts/bpf/trace.py` script.
The main reason for creating a separate library instead of extending
libtrace is to avoid pulling in all of its dependencies. ISA-L is the
most problematic, as we only build it as a static library, which makes
it impossible to use with dlopen (making it unusable in scripts).
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If101ca3425d7404abd51b0da2031358d0be44766
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9428
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We still don't support get log page with error
information LID.
Change-Id: I92db361dc956ea3ed4f6e7bdfdca763d0fea6886
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9583
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Similar is already done for json-rpc bdev_get_bdevs, it might be
useful for the upper layer which has no interest in all but only
in one specified.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ie1af1cb4778edd265914bbfdc2777f66c6c76572
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9362
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ifa8e80e0a25af7757181f480ab0405ec902a61ff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9596
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
IO commands with invalid OPCs are not freeing the
associated request object after handling the response.
This would eventually result in requests on the qpair
becoming exhausted which ends up failing the controller.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7c1c46265a38b31181cd5d9a98c528816ab482d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9601
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When data digest is enabled for a nvme tcp qpair, we can use accel_fw
to calculate the data crc32c. Then if there are multiple
c2h pdus are coming, we can use both CPU resource directly
and accel_fw framework to caculate the checksum. Then the datao value compare
will not match since we will not update "datao" in the pdu coming order.
For example, if we receive 4 pdus, named as A, B, C, D.
offset data_len (in bytes)
A: 0 8192
B: 8192 4096
C: 12288 8192
D: 20480 4096
For receving the pdu, we hope that we can continue exeution even if
we use the offloading engine in accel_fw. Then in this situation,
if Pdu(C) is offloaded by accel_fw. Then our logic will continue receving
PDU(D). And according to the logic in our code, this time we leverage CPU
to calculate crc32c (Because we only have one active pdu to receive data).
Then we find the expected data offset is still 12288. Because "datao" in tcp_req will
only be updated after calling nvme_tcp_c2h_data_payload_handle function. So
while we enter nvme_tcp_c2h_data_hdr_handle function, we will find the
expected datao value is not as expected compared with the data offset value
contained in Pdu(D).
So the solution is that we create a new variable "expected_datao"
in tcp_req to do the comparation because we want to comply with the tp8000 spec
and do the offset check.
We still need use "datao" to count whether we receive the whole data or not.
So we cannot reuse "datao" variable in an early way. Otherwise, we will
release tcp_req structure early and cause another bug.
PS: This bug was not found early because previously the sw path in accel_fw
directly calculated the crc32c and called the user callback. Now we use a list and the
poller to handle, then it triggers this issue. Definitely, it will be much easier to
trigger this issue if we use real hardware engine.
Fixes#2098
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I10f5938a6342028d08d90820b2c14e4260134d77
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9612
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This commit fixes a race condition when calling free_ctrlr(),
nvmf_vfio_user_close_qpair->free_qp will set controller `ctrlr->qp[qid] = NULL`
finally, when calling free_ctrlr() we also need to check `ctrlr->qp[qid]`
is NULL or not, when there are multiple IO queues, we need a lock to protect
`ctrlr->qp[qid]`. However, the call to free_qp() in free_ctrlr() is valid
only when killing SPDK target, for all other cases, e.g: VM disconnected,
the queue pairs are already freed, so here we can process these different
cases separately, and avoid extra lock.
Change-Id: I7ab71f08bf4d737843b2af42e27b1571be0b45e9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9351
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Ideally, SPDK should make sure no pending I/Os in this queue
pair are using the removed memory region. Currently we just
stop the submission path and leave a TODO comment here until
we have an asynchronous way to do this.
Also use the `<=` for the boundary check.
Change-Id: I63a2189022978811dc21f92f2599f28a5191ecd7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9352
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch revert commit 8f7d9ec. In function vtophys_iommu_init(), we
can use `dev->drvier` in RTE_DEV_FOREACH() loop to count number of devices
probed by device driver using vfio APIs, or we will count all the PCI devices
that bind to vfio-pci driver, only the probed device's IOMMU group is added
to vfio container.
The original implementation is correct to count `g_vfio.device_ref` in
vtophys_pci_device_added(), we don't need to count it in
vtophys_iommu_device_event() callback.
Fix issue #2086.
Change-Id: Ib1502a67960a49e9a2f93823cc8ceab2e8303134
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9236
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Abort any queued admin requests once admin queue gets enabled. A request
can get queued if a controller is being reset and it gets submitted
while admin qpair is being reconnected. If these requests aren't
aborted, the init process will stall, as requests don't get resubmitted
while controller is resetting and subsequent admin commands required for
the initialization would be queued too.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If456a297d2d434b3cc741816cbfb13b01d37e963
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Allow to return more than one memory domain.
This change aligns bdev and nvme API and provides
more flexibility for custom transports.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ica9b12ad8463c361be6cb62ee2c0513eec0b486d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9546
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Enable dump of transport stats in functional test.
Update unit tests to support the new statistics
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I815aeea7d07bd33a915f19537d60611ba7101361
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8885
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This function may return an error and we should handle it
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I996ceb6e300bd9527385aded9068b2841e94a20d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9278
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
nvme_transport_poll_group_remove() clears qpair->poll_group. Hence
we should not use it after that.
Fixes#2170
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I57ceee8c66684e2d02b51b8a0f3d66aacbcb9915
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9560
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
lookup_io_q() should return NULL when qid == 0 (admin queue). This
ensures that handle_del_io_q() won't delete the admin queue (which is
prohibited by the spec) and fixes#2172.
Also fixes a few related off-by-one errors.
Signed-off-by: Andreas Economides <andreas.economides@nutanix.com>
Change-Id: I7ab063f25bba45b755d84c9ddde82072cf01f5e8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9593
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We can avoid extra iteration to null queue pairs by using a list.
Change-Id: Idebd5a396959372004c33f8b21c555032af67aae
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9350
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Existing vfio-user assumes that the IO SQ/CQ are paired, so we
return an error when creating SQ with shared CQID.
Leave a TODO comment, we may support this in future.
Change-Id: I297ce53eb9a119469145e2453177d03252b629c8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9173
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Some spaces needed to separate some words from each
other.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8a3f5b0a848ca9cc8771c11545652c275c194e89
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9547
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Used to be in the lib directory but an upcoming patch needs
access to it so move it to a more appropriate location.
The changes in the .h file were needed to address compile
error when in the /include dir (didn't get errors elsewhere)
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I44d11fc620f213b13683d62dca899e2792ca6ed5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9450
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Renamed nvme_qpair_abort_reqs() to nvme_qpair_abort_reqs_with_cbarg() to
highlight the fact that it only aborts requests with specified cb_arg
and to distinguish it from _nvme_qpair_abort_reqs() which aborts all
requests immediately.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I32fec5ab0501b1beb8605689d73ec42a6424fba5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9323
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When a controller is reset, it goes through the whole init process,
including establishing keep-alive, so there's no point in sending it
during that time. Additionally, it can stall the initialization if it
gets queued when adminq is connecting.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib9fee42f6097972e8ba336958f81e1b6b1a5f006
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9310
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Additionally, this patch removes reading the CC and CSTS registers from
`nvme_ctrlr_process_init()`, as it's no longer needed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If4f9e57dbf249fbce87e90018cff389f59906e38
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8621
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When the controller initialization is failed due to not being able to
read the CSTS register, an ERRLOG is now printed instead of DEBUGLOG.
It should make it easier to debug initialization failures.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I42602bd08e263faf2e87fe4a68abcc82ddec11ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9556
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The CC register is now re-read again when disabling the controller as
preparation for subsequent patches, in which the synchronous CC register
read will be removed from nvme_ctrlr_process_init().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibfc8ed85bab188c3938451fbdfb771b969157807
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8619
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The host driver should do a wmb() before it updates the SQ tail doorbell
to ensure that any writes to the SQ are guaranteed to be visible if a
doorbell update is visible (a store-release) - and indeed, this is what
the Linux NVMe driver does.
Therefore, we require a rmb() after we read the tail doorbell in order
to synchronise properly with the host driver (we need a load-acquire),
and guarantee that the updates to the SQ are visisble to us.
Signed-off-by: Andreas Economides <andreas.economides@nutanix.com>
Change-Id: I57eb1e1f0dfb1091a8f10f40f8ee0e2604d9268c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9514
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The CSTS reads in DISABLE_WAIT_FOR_READY_(0|1) states are now done
asynchronously.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4ca8ad286e259e8fcfbf484223288554280347fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Checking if the controller is enabled (CC.EN == 1) is now done without
blocking.
Additionally, a copy of the controller configuration register (CC) value
is now stored in spdk_nvme_ctrlr.process_init_cc. It'll be updated in
subsequent patches whenever the register is written / read. This will
make it possible to make several function non-blocking without having
send asynchronous register reads.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I8323cf0c31a5ea282840aab6cf8ca241ce8667be
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8617
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is is the first patch in a series changing all register accesses in
the NVMe controller initialization path to be asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic4df9890992eafb402cf3372fe2ff3ac3c503932
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8615
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This removes some code that was duplicated in the
CHECK_EN and DISABLE_WAIT_FOR_READY_1 states.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie5d175540f71c692f7784c7ff22a48f34b9b7082
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8614
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It will allow the async callbacks to retain the existing timeout while
changing controller's state.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4210f2cf7d4171444c338b8926334b985129a6c7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8613
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It will make it easier to set this timeout once the register accesses
are performed asynchronously.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9d61555a17380ddd57567db8dd912ef6d739fc01
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8612
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It will make it easier to support asynchronous register set/get
functions.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9915609ff940596ae4d67388238cc685dfa426fa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8608
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch introduces asynchronous versions of the ctrlr_(get|set)_reg
functions. Not all transports need to define them - for those that it
doesn't make sense (e.g. PCIe), the transport layer will call the
synchronous API and queue the callback to be executed during the next
process_completions call.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I2e78e72b5eba58340885381cb279f3c28e7995ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8607
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
libvfio-user will assert if the command didn't have data
buffers.
Fix issue #2124.
Change-Id: I43e9562c614bc65e54dbf650447369317c10f7b5
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9264
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <levon@movementarian.org>
Currently, there is no way to prevent spdk_app_start() from calling
app_setup_signal_handlers() and setting SPDK's signal handlers.
We'd like to use our own set of signal handlers, therefore this
patch adds a flag to the spdk_app_opts struct that disables this
behaviour.
Signed-off-by: Andreas Economides <andreas.economides@nutanix.com>
Change-Id: I61d7cd66527d819fd5f687d5cc8a03be4fe10a6a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9380
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously the nvme_ctrlr_state_string() function
was only defined for DEBUG builds since it was only
used in DEBUGLOGs. So now that we're also using
it for an ERRLOG we need to define this function
in release builds as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic14b251fbd7da1519ad0fd4d8a183d6b6ca17984
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9468
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This type of errors is not fatal and can be observed when
qpairs are diconnected. The same approach is used in target
side.
Change-Id: Ic3c7b1731c0cbd2e98d776f0f0c5d82464b3d556
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9416
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When removing a listener, there's a small window when the listener is
already freed, while the controllers that were created on that listener
are still active. It happens, because the listener is removed before
disconnecting its qpairs and in turn destroying the controllers.
The ctrlr->listener pointer is now cleared when a listener is freed and
its value is checked for NULL before each use.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iace65a46eebc5972cc4ef0f1c1822ffc5d3e559e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9237
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
An error might occur after succesful transport creation
when the new transport is added to nvmf poll groups, e.g.
in nvmf_transport_poll_group_create. In that case
transport is not detroyed and poll groups are not fully
functional. To correct this behaviour, destroy transport if
spdk_nvmf_tgt_add_transport fails. Also update nvmf_tgt
initialization step to check that all poll groups were
created.
Change-Id: I116e6944729d846c1755c2844c77825f65db8c12
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9255
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This is needed for reporting additional information in JSON RPCs
Change-Id: I45da2ea78cd5415f7536130b6793ca29e929b86e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9338
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When poll_group is used, several qpairs share the same CQ
and it is possible to receive a completion with error (e.g.
IBV_WC_WR_FLUSH_ERR) for already disconnected qpair
That happens due to qpair is destroyed while there are
submitted but not completed send/receive Work Requests
To avoid such situation, we should not detroy ibv
qpair until we reap completions for all submitted
send/receive work requests. That requires some
rework in rdma transport and will be implemented
later
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Idb6213d45c2a7954b9ab280f5eb5e021be00505f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9056
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Poll group holds lists of qpairs in different states and
when we got rdma completion with error, we iterate these
lists to find a qpair which qp_num matches. qp_num
is stored inside of ibv_qp which belongs to spdk_rdma_qp
structure. When nvme_rdma_qpair is disconnected, pointer
to spdk_rdma_qp is cleaned but qpair may still exist in
poll group list and when we start searhing for qpair by
qp_num we may dereference NULL pointer.
This patch adds a check that pointer to spdk_rdma_qp
is valid before dereferencing it. To minimize boilerplate code,
wrap all check in macro. Add unit test to verify this fix.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I1925f93efb633fd5c176323d3bbd3641a1a632a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9050
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When a subsystem is being deleted, we disconnect all qpairs
and when the last qpair for some controller is disconnected,
we start controller desctruction process. This process requires
to send a message to subsystem's thread to remove the controller
from the list in the subsystem and after that send a message to
controller's thread to release resources.
The problem is that the subsystem also destroys all attached
controllers. This order is unpredictable and we may get
heap-use-after-free or double free.
To fix this problem we can rely on the fact that the subsystem
can only be destroyed in incative state, that means that all
qpairs linked to the subsystem are already disconnected and
all controllers are already destroyed or in the process of
destruction.
spdk_nvmf_subsystem_destroy API is now can be asyncrhonous,
it accepts a callback with cb argument.
Change-Id: Ic72d69200bc8302dae2f8cd8ca44bc640c6a8116
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6660
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
John Kariuki tested this patch on a system with
several Intel P5800X Optane SSDs, to determine the
performance impact of adding these two
spdk_trace_records() in the main NVMe I/O path.
The pathological case (512B random reads on a single
Xeon core) decreased from 13.10M to 12.88M, or 1.7%.
Normal workloads (4KB+) would incur a smaller penalty
since the I/O rate would be much lower - maybe even
unnoticeable..
This is a really valuable tracepoint to have enabled
by default, so I think this small amount of degradation
is acceptable.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie2543cadf3541eb74398d31ac0f495522ab49ec0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9303
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The issue is that:
We call cb_fn firstly, then add the related idxd_ops into the
chan->ops_list. Then if in user's callback, we call spdk_idxd_submit_*
function gain, then we will allocate new idxd_opts first. So if there is
recursive spdk_idxd_submit_* in the call back function, then
we will exhaust all the ops_pool resources. And the function will
report -EBUSY issue in _idxd_prep_command function.
And this patch fixes this issue by adding back idxd_ops before calling
user's call_cb function.
Change-Id: I32239fb29be875bc452803a48dc24ff5fa533016
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9401
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add more information to users which implementation cannot
be set for idxd.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ib26f117dc19fa1f28cc1d548549caba0b696a915
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9399
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Similar is already done for json-rpc bdev_get_bdevs, it might be
useful for the upper layer which has no interest in all but only
in one specified.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I25f7d20ff953e612d7982207d57607f1c3bbaa90
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9361
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This new API signals that the ctrlr will soon be
reset. This allows the transport to skip unnecessary
steps in following calls to the driver prior to the
reset - for example, skipping PCIe DELETE_SQ/CQ
commands when freeing an IO qpair.
Note that if we are deleting a qpair after
prepare_for_reset was called, and the qpair is
still waiting for a CREATE_IO_CQ or CREATE_IO_SQ,
we cannot poll for those commands to complete,
but we also cannot free the qpair immediately.
So set a flag for this case to defer the
destruction until the outstanding CREATE_IO_CQ or
CREATE_IO_SQ callback is invoked (typically as an
aborted command when the reset happens).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I34c6276ae71e7d61ad4a3720f1a985b1ee96bd8b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9249
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Rename endpoint->fd to ->devmem_fd to better reflect its purpose.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ic73cb6684228d4ecb3d58c1865c80a3d7d07c223
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9402
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This patch moves schedueler and governor related API from
the internal event.h to public scheduler.h.
With this it is possible to create subsystem responsible
for handling the schedulers.
Three schedulers and a governor were moved to scheduler modules
from event framework.
This will allow next patch to add JSON RPC configuration
to the whole subsystem.
Along with easier addition of other schedulers.
Removed debug logs from gscheduler, as they serve little purpose.
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I98ca1ea4fb281beb71941656444267842a8875b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6995
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Make sure to cancel the scheduling during:
1) gather_metrics error (already present)
2) setting the scheduler to NULL (new)
3) shutdown of the application (new)
In all of the above scheduler cannot proceed to
the balance() function.
Prior to this patch lack of setting
g_scheduling_in_progress to false would block
_start_subsystem_fini() from proceeding.
Resuling in application never shutting down.
thread_infos are allocated during gather_metrics,
and freed on threads_reschedule.
The added cleanup is used during 1) and 2).
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5ab77c85aaa9f94bd4e1a1660b55ab2f2c47bbdc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9326
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We report other errors in spdk_interrupt_register(), so add one if there is an
issue with adding the given efd to the fd group.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Idcd810d2b3f644bab1514063b399b9767797130f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9388
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
transport specific options are already introduced however dump was missed
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I2fe533898d82534cc6cb8a3e5de4e2b2c72ae00a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9355
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Expose the already existing nvme_ctrlr_get_cc as
spdk_nvme_ctrlr_get_regs_cc, similar to spdk_nvme_ctrlr_get_regs_csts and
spdk_nvme_strlr_get_regs_cap etc.
Signed-off-by: Tomasz Bielecki <tomasz.bielecki@wdc.com>
Change-Id: Ibfcf6fbe64dee3719f381184fb728ab6e4d52526
Signed-off-by: Tomasz Bielecki <tomasz.bielecki@wdc.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9220
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Refine ANA state from per subsystem listener to per subsystem listener
per ANA group.
Add an array of ANA state per ANA group to subsystem listener. The array is
indexed by ANA group ID - 1.
Then in I/O paths, we get ANA state by
ctrlr->listener->ana_state[ns->anagrpid - 1].
The NVMe specification indicates the existence of NVM subsystem specific
ANA state when FFFFFFFFh is specified as NSID for the Get Features
and the Set Features commands. For these, we return the optimized state.
Update the nvmf_subsystem_get_listeners RPC to return all ANA states
of the underlying ANA groups. The nvmf_subsystem_get_listeners RPC is
not matured and not used in the test code yet. Hence compatibility is
not high priority.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia2d4d5361ac01236f595c22765fd35e4c5fdee0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9064
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is the first patch in the patch series to control ANA states not only
as a unit of subsystem listener but also as a unit of ANA group and create
user preferred mapping between namespaces and ANA groups within a single
subsystem.
This patch adds anagrpid to both spdk_nvmf_ns and spdk_nvmf_ns_opts, and adds
ana_group array to spdk_nvmf_subsystem to count number of namespaces per ANA
group within a single subsystem. The size of the ana_group array is equal
with the size of the namespaces.
For each subsystem, allocate ana_group array regardless of the value of
ana_reporting of the subsystem.
For each namespace, at its creation, initialize anagrpid explicitly to be equal
with nsid by default and increments the corresponding entry of the ana_group
array of the subsystem regardless of teh value of the ana_reporting of thee
subsystem.
Hence the contents of the created ANA log page is not changed even if the
algorithm to crete ANA log page is changed.
Additionally this patch adds a unit test case that one ANA group
has multiple namespaces.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I78539db4e7248c2953c6927ff8128cb5a7e34b96
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9102
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The next patch will add anagrpid to spdk_nvmf_ns_opts. This patch
is for the upcoming change and futhre potential changesto ensure the
ABI compatibility.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibe1024a27ea4ab823d40d775470cde78b9eb8d2c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9112
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Previously we used assert for this check.
Fix#2141.
Change-Id: Ieaf8e34721a5b65e6d7790d56587239e2e6d75f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9383
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If a response is returned prior to _nvmf_request_complete being called then the cid in the response is
not set correctly and the PDU state is not reset which causes a hang and the PDU state machine is
expecting more data but none will be sent. There are two cases where this can occur:
1) If the request is bi-directional
2) If nvmf_tcp_req_parse_sgl returns an error (e.g max_io_size exceeded)
Signed-off-by: matthewb <matthew.burbridge@hpe.com>
Change-Id: Icc3ed02a4499a12d8920e6433a746b72022a72fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9327
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If NN is very, very large, this allocates too much memory. For now, just
use a list.
Change-Id: I904977673d8fb6c86f03c94ba798c6cc07f4a4d8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9301
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This can be done immediately after receiving the controller identify
data for now.
Change-Id: I527a44c4d1f4d3ad2eeb8fc77e07086c2358cac3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9300
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Interrupts were all being registered with the name "fn". Directly pass in the
name instead to fix this.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I38e59d784298a61ca2130090ede500101389685c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9346
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Rui Chang <rui.chang@arm.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The root cause is that we do not the copy the contents
to the dst address correctly.
So provide a _sw_accel_copyv function to address this
issue.
Fixes#2096
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I25d0e60f51abd41ed77b2b23f08d609f4f052de0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9193
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Previously the accel_perf tool would look at whether it had HW
or SW commands to know whether to execute the callback right away
or schdule to avoid blowing the stack (SW calls are sync).
Moved this to the SW module (part of the accel engine) so the
caller doesn't have to worry about. Allowed for a few simplifcations
in the tool as well.
Also, instead of using send_msg to call the completion we add it to
a list in the sw module that a new poller uses to perform the
completions as this is more efficient the sending a message to the
same thread.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ifc6c5b8635f51e3fa1a825c8573378b3752d7d91
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9171
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Small but noticable on perf top. We were using a list to track
valid batches and checking it on every submission. Instead just
put a valid channel ptr in the batch struct and clear it when the
batch is freed. There was no other reason for the list.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia26ceee578be4a82f4fd8abb2358ff18c56271a4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9149
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There is no need to track if governor was enabled during
initialization of the scheduler dynamic.
Instead _spdk_governor_get() can be used to determine that.
While here set the default g_governor to NULL.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ibdfde3d2714dd0b6d629f19f0d246b4fc3c9d214
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch cleans up the header file, structures and
parameters of governor API. While documenting the
functionality.
- made governor name const
- renamed _spdk_governor_list_add() to _spdk_governor_register()
This is preparation to making this API public.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie394109c839dead0e7ade946f95be8105b00e674
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8843
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch cleans up the header file, structures and
parameters of scheduler API. While documenting the
functionality.
- made scheduler name const
- removed typedefs for schedueler callbacks
- balance() now accepts uint32_t for array size instead of an int
- removed unused _spdk_lw_thread_set_core()
- renamed _spdk_scheduler_period_set() to _spdk_scheduler_set_period()
- renamed _spdk_scheduler_period_get() to _spdk_scheduler_get_period()
- renamed _spdk_scheduler_list_add() to _spdk_scheduler_register()
This is preparation to making this API public.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ia7b6b6a5eafb052ac275db6c04113a8ad442383f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8842
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Registering multiple governors would fail due to them having
the same name. Only saved by the fact that right now,
there is only one governor registered in this fashion.
Fix it by adding name of the governor structure passed
to the function name.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ic7a206da2c8f5dc1e72e41629bccf989c030f182
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8792
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
By default g_scheduler is now NULL. It can be set
either by event framework or RPC.
To keep RPC consistent, the 'static' scheduler was kept.
All access to the g_scheduler is done via _spdk_scheduler_get().
Access to its members is done to balance, get name for RPC and
through _spdk_scheduler_set(). All of them happen on scheduling
reactor and don't pose race condition when changing scheduler.
There is no need to delay init/deinit of scheduler via g_new_scheduler.
This variable was removed and all that happens in _spdk_scheduler_set().
To unset and deinitialize current scheduler,
_spdk_scheduler_set(NULL) has to be called.
This results in moving scheduler deinitalization to that
call too.
Every spdk_scheduler callback is now mandatory.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1fed766989de9bcaeb7e82b490c1275f3572d77d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
By default g_governor is now NULL. It can be set
either by event framework or schedulers directly.
Dynamic_scheduler and gscheduler specifically want
to use the dpdk_governor, so their initialization
now sets it explicitly.
To unset and deinitialize current governor,
_spdk_governor_set(NULL) has to be called.
This results in moving governor deinitalization to that
call too.
The "default" governor has been removed.
Every spdk_governor callback is now mandatory.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ibf76bd28bfbb159416026996fa217bb3325a3d31
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8810
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
There is no explicit need for the spdk governors initialization
to occur on per core basis.
This implementation detail for dpdk_governor is now hidden
in the init/deinit calls. There is no recourse when failing
deinit for a certain core, so ignore the return code.
Changed return type for deinit in governor and scheduler to void.
While here modified the callbacks for scheduler to no
longer require passing currently selected governor as an argument.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I7f0b7a09aa7f5d12ae47fca25186faeedac31a95
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8791
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Add the new device ID for VMD devices so VMD devices
can be unbound and used with the SPDK setup script.
Bus numbering for VMD devices is different on IceLake platforms,
and only half of the bus numbers are available. Add a function to
set the starting bus number and the max bus number by reading the
new BUS_RESTRICT_CAP and BUS_RESTRICTIONS VMD registers.
Signed-off-by: Sydney Vanda <sydney.m.vanda@intel.com>
Change-Id: I8905d4bcba84c74e3dadfb27262e668c4281b0c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8331
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When deleting an IO qpair, make sure that it's connection process is
finished (i.e. create CQ/SQ commands are completed) before freeing it.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I487dcef390d73ff4a7264ff97d965c9030916840
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9279
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This allows for creating admin qpair in an asynchronous mode and I/O
qpairs based on what the user specified in spdk_nvme_io_qpair_opts.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1801ea76d5a08b1bdc558eb0f18648f59454b654
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9077
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Split the connection process across two states, which allows the
transport to connect the admin queue asynchronously.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie84477331df0abf5ffdfc2a0ff5d5ada760c9e73
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9076
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This allows for submitting IO requests before the CONNECT command is
sent and not stopping the connect process due to the CONNECT being
queued. It could happen once IO qpair connection is asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I04e45b1c2f49f9da3412c843ea899341c56a0420
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8624
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This will allow us to call `connect_qpair_poll()` from
`process_completions()`. Otherwise, `nvme_fabric_qpair_connect_poll()`
which calls `process_completions()` might create a loop, which can cause
issues with `nvme_completion_poll_status` used for tracking the CONNECT
command by the fabrics code.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibd67bc3e011b238db264394e005b3268624cceef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8623
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The fabric connect command is now sent without. It will make it
possible to make `nvme_tcp_ctrlr_connect_qpair()` non-blocking too by
moving the polling to process_completions (this will be done in
subsequent patches). Additionally, two extra states,
`NVME_TCP_QPAIR_STATE_FABRIC_CONNECT_SEND` and
`NVME_TCP_QPAIR_STATE_FABRIC_CONNECT_POLL`, were added to keep track of
the state of the connect command. These states are only used by the
initiator code, as the target doesn't need them.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I25c16501e28bb3fbfde416b7c9214f42eb126358
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8605
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These functions will allow for sending the connect fabric command
asynchronously.
Additionally, this patch changes the return code for
`nvme_fabric_qpair_connect()` when a timeout occurs from -EIO to
-ECANCELED. It gives better description of the error as well as make it
more consistent with `nvme_wait_for_completion*` APIs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I95806626d3573ebe4b1568157fd57013c4b909a7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8604
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Since the connect will be completed asynchronously, we
need to keep the pointer around so we can access (and
free it!) later when the command completes.
Also change the code to poll on the status using the
new nvme_wait_for_completion_poll(), as prep for upcoming
patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I28add8f967fd000afed1e50e491a16ea9da16c22
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8603
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This gives us a context that we can poll on when using the
nvme_wait_for_completion framework for async operations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5b23f9096eed5ce95848d1995c1ed0b4c74edc92
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8602
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Modified the async_list to be per-process instead of
on the controller object. This allows an NVMe multi-
process setup to have Asynchronous Events Reported
to each process that may interested in them. In the
previous case, where the async event list was on the
controller object, AER (Async Event Requests) would
not be reported to all the processes.
Fixes: #1874
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: I3e885c0cf5a0fd471d243bc7d96a8b7ffe65d14b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8744
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reservation commands aren't supported for vfio-user transport,
we should return error in case VM send such commands intentionally.
Also don't use one err variable for multiple functions return values.
Change-Id: Ie0d51ce3c85a1b8ef80c678a3ec9fb6523bab462
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9125
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Only one active socket connection is supported in libvfio-user,
RESERVATION should not be supported in this case.
Change-Id: I36a746f479da767d857425e84dfed5f2bef30b17
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9124
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Update submodule DPDK to version 21.08 and modify
CHANGELOG.
Added bus auxiliary dependencies to dpdkbuild/Makefile
and lib/env_dpdk/env.mk. This dependency was introduced
in DPDK 21.08.
Change-Id: I72d9fde456583dc129f4c7fced4f10875bbc38e2
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9211
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Cleaning up the API, implementations were removed in 7dbe0e7c.
Signed-off-by: Tomasz Bielecki <tomasz.bielecki@wdc.com>
Change-Id: Iae25bbb301da123f9e784678d4c4f0b8d39f262e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9221
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There may be multiple C2H data pdus recevied.
So we should use the following steps:
1 Use the SPDK_NVME_TCP_C2H_DATA_FLAGS_LAST_PDU
to check whether it is a last pdu or not.
Then we will not cleanup tcp_req, i.e., tcp_req->datao
will not be cleaned.
Then use the SPDK_NVME_TCP_C2H_DATA_FLAGS_SUCCESS
to check whether the controller will use resp pdu
or not.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9dccf2579aadd18f31361444e25bd4b3b76f06c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9192
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This was accidentally merged as part of c3a5848.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I430e7035e9c2baf0b8d0592cda76265957c3791e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9251
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Callback for bdev modules is called 'module_fini',
meanwhile after its execution bdev modules were to call
'spdk_bdev_module_finish_done()'.
This function carries incorrect name, so it was deprecated
and replaced with 'spdk_bdev_module_fini_done()'.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I9a12dff746ea8b4b1570a3794470f7b24e29003e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9148
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
fini_start() is called for each bdev module before
iterating over all unclaimed bdevs to unregister them.
This allows bdev modules to behave differently during
each such unregister. Ex. unloading lvol store when
all lvol bdevs on it are unregistered.
Another use of this callback is to unclaim all bdevs
that can be at that point. Especially ones that will
not receive callback due to no bdev registered.
Ex. offline raid bdev, when some underlying bdevs are missing.
fini_start() being synchronous does not help in cases
where to release claim on the bdev, an asynchronous operation
is required. Ex. lvol store with no bdevs present, requires
async lvs unload to be called.
This patch adds async_fini_start flag for the bdev modules,
to be used when async fini_start is required. When done,
bdev module has to call spdk_bdev_module_finish_start_done().
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I63438b325d4cc53fd236bf9ff143abf6bdd81c49
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9094
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is exension to the existing nvme_wait_for_completion* interface
that allows the user to poll for request's completion without blocking.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6c5b7203883f8e2fa28ceb039ce63aa50631f571
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8601
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is needed to enable using the nvme_completion_poll
machinery in an asynchronous mode.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I8009ab81bcc02cba685f560be3d6540aab7ef3f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8600
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be done in stages. This patch adds the
nvme_tcp_ctrlr_connect_qpair_poll function and and makes the icreq step
asynchronous. Later patches will expand it and make the
nvme_fabric_qpair_connect part asynchronous as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ief06f783049723131cc2469b15ad8300d51b6f32
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8599
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We could not restore the setting of ana_reporting because it was not
included in the JSON config dump.
Add the parameter ana_reporting into JSON config dump by adding and
using a new helper function nvmf_subsystem_get_ana_reporting().
Besides, previously the JSON RPC nvmf_subsystem_get_listeners had
ana_state regardless of the value of ana_reporting. We make it
conditional in this patch. The JSON RPC nvmf_subsystem_get_listeners
had not been used in the test code in the repository. Hence this
change will be acceptable.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia4e04600c969c254e0a816d3eb34983ee951091e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9111
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The pcie layer can't always detect bad addresses
in the request at submission time - for example,
the transport may not have any trackers available
and the request gets queued at the generic
nvme level.
So this means that we might detect vtophys failures
during submission time, or in a process_completions
context - the latter happening when we complete
one request which triggers submitting a new request.
Currently if the vtophys failure happens during
submission context, we return -EFAULT to the
caller *and* call the completion callback. Nowhere
else in the driver do we do both - the intention
has always been that you get one or the other.
So make all of this consistent by tagging the
tracker and the qpair with a flag if we hit a vtophys
error in the submission path. Return 0 to the caller,
who will then later get a completion callback for the
bad request when the qpair is next processed for
completions.
I considered a separate TAILQ to hold these 'bad'
trackers, but that would have required duplicating
quite a bit of the tracker completion code for this
one case. The flag on the pqpair is already in the
hot cacheline, so it's cheap to check it. We will
only interate the outstanding_tr list when that flag
is set, so this should have zero impact to performance.
Fixes issue #2085.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I60b135fb32d899188e51545b69feb1b27758fd7f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9234
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These functions accept extendable structure with IO request options.
The options structure contains a memory domain that can be used to
translate or fetch data, metadata pointer and end-to-end data
protection parameters
Change-Id: I65bfba279904e77539348520c3dfac7aadbe80d9
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6270
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Add a global list of memory domains with reference counter.
Memory domains are used by NVME RDMA qpairs.
Also refactor ibv_resize_cq in nvme_rdma_ut.c to stub
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie58b7e99fcb2c57c967f5dee0417e74845d9e2d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8127
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Memory domain is used to describe memory which belongs to
another address space (e.g. GPU memory or host memory)
Memory domain can be configured with callbacks to translate
data to another memory domain and to fetch data to
local buffers.
Memory domains will be used in extended
bdev/nvme API added in the following patches.
Change-Id: I0dcc7108a4fbf416a11575aa5cf5d7ec501b3d8b
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8126
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
This avoids an infite loop when user sends and then immediately aborts
all requests while the transport cannot submit them (returning -EAGAIN).
It might happen in two cases: 1) if the transport doesn't have any free
requests or 2) if the qpair is still in the NVME_QPAIR_CONNECTING state.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia1d0b7c93f387524be0ad39597ec49851e1d3af7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8711
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Needed to make this code path asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6b18d4e9fc1a29b30c78f7f087b80913d00f9726
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8598
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
If a qpair is part of a poll group and it's not configured in the async
mode, it should be using poll group's process_completions variant.
Additionally, connecting qpairs to the poll group was moved up, so that
qpairs are already on the connected qpairs queue when waiting for the
connection to complete.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I08f75bd61a566d1ab60029b6202d9337df75733f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9074
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Replaced poll cycle count with a timeout when destroying a qpair that is
part of a poll group. Tracking the time instead of a poll count is more
stable, as the number of poll cycles can vary based on the application's
behavior when destroying a qpair.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I7445bc1b411f2905aab7bf3dc7b2d3344712e1eb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9200
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Number of active namespaces may change. So, on ANA log page update we
should check if buffer has to be resized.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I1720317ea7f59e5afef73d5c4bd1cd69a7dd6520
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8583
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Log page reading function 'spdk_nvme_ctrlr_cmd_get_log_page' does read
into intermediate buffer and then copy to user provided buffer. So,
there is no need for user buffer to allow DMA.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I7337afa99c3ae666cc43ea2a48317de875334cfc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9177
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This PMD is availabe for BlueField2 DPU.
It requires libmlx5, so configure file is
updated to check if this library exists
Change-Id: Ic0cfbfdf24af393381667435009fb6afc49d9181
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8780
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In the next patch, the qpair is polled from a poll group and needs a
disconnect callback, which should also fail the qpair, so it makes sense
to have a separate function doing that.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ied76431520962b25220027be829a4609afb6bbda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
async_mode option is currently supported in PCIe transport layer
to create io qpair asynchronously. User polls the io_qpair for
completions, after create cq and sq completes in order, pqpair
is set to READY state. I/O submitted before the qpair is ready
is queued internally. Currently other transports only support
synchronous io qpair creation.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib2f9043872bd5602274e2508cf1fe9ff4211cabb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8911
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
spdk_reactor_set_interrupt_mode() writes/reads from fds created during
reactor_interrupt_init(). Since spdk_fd_group_create() depends
on eventfd, this will not work for systems that do not have it.
reactor_interrupt_init() handled lack of support for eventfd correctly,
while spdk_reactor_set_interrupt_mode() did not check for it.
Reported-by: Nick Connolly <nick.connolly@mayadata.io>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5181d436636c55cca3a06b1947e944502a9204ce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9131
Reviewed-by: Nick Connolly <nick.connolly@mayadata.io>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
PRP1 value 0 is a valid Guest physical address and it may use two vectors
for one page, so we need to set `iovcnt` as the IO commands. Also
fix the calculation for Get Log Page command.
Change-Id: Ib036bcf1ca9f6644dfd8a7406fea37269cea5c73
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9122
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function is called in the IO processing context, also
rename vfio-user controller data structure with "vu_" prefix.
Change-Id: Icb8fd18772f64c1e633f4a8b8eac26d0bafeed16
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9121
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function is called in the IO processing context, also rename
internal queue pair variable with "vu_" prefix.
Change-Id: Iac6be1fddf9ee28e0485ff0c3a707de4e98be96a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9120
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The submission queue doorbell will be cleared to 0 at the end
of this function.
Change-Id: I0146f3dc208b8ca1d7fec4b2dae1d7395f304444
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8992
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
There is only one place to call it now, so just remove it.
Change-Id: Ib16d640c80dbe79df46c965c5ee8bcd092e57b0d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9013
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We should maintain phase bit in vfio-user target, it's not safe
to use Guest completion queue's phase bit.
Change-Id: I94856853784802f7695a0484b783d8de4042f8a6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8988
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
It's wrong to call enable_admin_queue() in the memory hotplug context,
as the function will change doorbells and clear queue pair memory.
We can remap the ADMIN queue pair's memory just same as IO queue pairs.
Change-Id: I090ea40ce0f0613c15319bfb18771a154ca788c7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8987
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
The map_q function is used to map Guest's physical memory address
to Host virtual address for SQ/CQ. This function can be used both
for initializing SQ/CQ and remap SQ/CQ, when used for remap SQ/CQ,
we don't need to unmap the related memory region.
Change-Id: Ia42e01a68482e5678dbaf8d0d4e8c04c1a94789d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9012
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Was set at a pretty high number during early development.
Instead of a #define, lets use the same math we use to
determine the size of the operation and descriptor pools as
that's the max number of batches that will be needed.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ide5311b4b6c931010413d0f82baed0e05d5fcde7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9099
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When address translation failed we were not putting the op back
into the pool. In one place also changed the return error from
translation from hardcoded value to the rc returned by the call
to be consistent with the rest of the code.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I6d0c9c0f469a213721c6a5ce37c4e9dbdcef8183
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9097
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These are fixed, no reason to tranlate them with every IO. This
patch is just for non-batched IO. Batched IO will come later.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ifb7bdf147658a23714d94bb7f88d7fedd1749153
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9073
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
DPDK requires each command line parameter to be
passed as a separate string in the args array. So
we need to tokenize opts.env_context to handle the
case where a user passes multiple arguments in the
env_context string.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifc72a7c25b31f1296d3aba3a3f7664ad8edf128f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9132
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When run spdk_top against vhost, the poller name is displayed as
"bvdev->bdev ? vdev_worker : no_bdev_vdev_worker" which should be "vdev_worker".
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: Ie97c69bd8cdd3fdafc5fe4dd80e1fde85a8e3238
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9130
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The current num_md_clusters doesn't include the the part before
md_start. So the bs_recover will get more num_free_clusters than it
should be. This patch can fix it.
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: I911926beb69aca677da508ba71f292496c917e7f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9034
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The generic transport layer still does a busy wait, but at least
the logic in the PCIe transport now creates the queue pair
asynchronously.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Change-Id: I9669ccb81a90ee0a36d3f5512bc49c503923b293
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8910
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In prep for upcoming patch
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Iebc5bf5f6715c85cc09b404128338cd6f08c421e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9072
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We can instead use a combination of the op code and the batch
element in the op structure to determine if the op that is
completing is part of a batch or not so we know whether to
return it to the free list or not.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I8e27c7b991f5a20e82394a25a52ece560cab5543
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9071
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
No real material impact but easier to read and cleans up error
handling when submitting a batch. Before we put the operation on
the list of operations to poll when it was prepared, a few LOC
before it was submitted to HW. Now we do it in the hw submission
function. For batch elements, they are added just before the
batch operation itself (as was before).
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Icfeaeeb650d5b63f81e13a8ffd27e349c7118089
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9069
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The bit arrays were used for dynamic flow control in a previous
implementation. They are no longer needed as flow control is
now static and managed solely in the idxd plug in module. Use
simple lists of descriptors and completion records instead.
This is a simpler implementation and will allow for some future
clean up of structures as well.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I8c4cce12e88ac5416e3fe29a416487759214ec9f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8922
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Actually we should not re-enable the ADMIN queue during
the memory region callback, similar with the IO queues,
we should just do the remap, but to avoid changing other
values such as doorbells.
This is a preparation patch just to rename it, and will fix
the issue in coming patchs.
Change-Id: Ic19654e944e4c58ce0fac9582c6bb187110fe571
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8986
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
This field is required for Multiple SQs share one Completion queue.
Windows requires SQID field even not for the shared Completion
queue case, so we need to fill it.
Fix issue #2009
Change-Id: Ifb2f20196baeaf9b982154f0480f360bc97a8c2d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8971
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently, queue creation is delegated to nvmf_vfio_user_accept() for both I/O
and admin queues, by adding it to a ->new_qps list. But there's no good reason
for this indirection to exist, and it would make interrupt support for the
accept poller harder to implement. Instead, directly call
spdk_nvmf_tgt_new_qpair().
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I182bfac9c621eda2f746201db7022ab6fce1bf30
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9053
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The comment no longer reflected the code.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I5903e9483f01dd9b7a61c8f725eba4c38e1b8459
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9008
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Only the CID is required when posting a completion response to the
completion queue.
Change-Id: I6a009386994d19a2fa05a0f5ed8c3237263644f3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8970
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Multiple IO Submission Queue can share one Completion Queue, and
we use field 'cqid' to save it in Submission Queue, so when posting
completion response, we need to get the Submission Queue's CQID first,
then post the completion queue based on CQID.
Also rename vfio-user internal variables with 'vu_' prefix in this
function.
Change-Id: Ib73b91a86740d105b5fb5c73127484ebfb6b55ef
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8969
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
No actual logic change for this patch.
Change-Id: I486e889499ee2e4bdf05d1f633ded1712333fa88
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8968
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This issue is introduced by the refactoring, i.e.,
changed from dst to crc_dst. And this code
part is missed.
This patch can fix this issue.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iad03d08707167c957193f6101eb9166a30e5cdd3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8935
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Currently, the poller that calls vfu_run_ctx() always returns SPDK_POLLER_BUSY.
Update libvfio-user and adjust the API usage so that it can accurately
report SPDK_POLLER_IDLE when needed.
Additionally, renaming the poller to better reflect its meaning: it's not just
for mmio handlers, but libvfio-user handling in general.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I5e598241ac0a692f03ee36242ff977c4e6f14987
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8568
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Return the number of events handled as expected by the poller.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I70c4d32bf091b2c1a293eaa41f00869a3a7303f2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8563
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The transport poller is supposed to return the number of events handled to the
generic nvmf code; correct the vfio-user implementation so it does that.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I804e1e8c75701c0b22ea6afd350b455c39908511
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8562
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Windows will always sends a Set Feature Interrupt Coalescing even
SPDK reports we can't support it in Get Feature command. Here
we return Feature Not Changeable instead of Invalid Field which
is more meaningful.
Change-Id: Ie08086c3eba1e2d790a7ae4976653b6f9085028c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8923
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
.ctrlr_connect_qpair
Previously this was assumed to be a synchronous process so the generic
layer transport code updated the state after .ctrlr_connect_qpair
returned. In preparation for making this support asynchronous mode,
shift that responsibility down into the individual transports.
While none of the transports actually do this asynchronously, insert a
busy wait in nvme_transport_ctrlr_connect_qpair to wait for the qpair to
exit from the CONNECTING state. None of the upper layer code can
actually correct handle a transport doing this asynchronously, so the
busy wait will cover that.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3c1a5c115264ffcb87e549765d891d796e0c81fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8909
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If there is hardware issues, we do not need to assign
the result. Because we will report the error status to the uplayer.
Change-Id: I647ddd609a1d5d0d52cc4fee59699b9992da4fa4
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8864
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Those functions are exported publicly, so better to
add some assert functions to detect some null pointer
errors.
We do not use if/else check, because it is too heavy.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I16814efb84a5a41876657f0caf5f0a6d0c2db8f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8863
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This assert is used to make sure that there is no
active batch (spdk_accel_batch) task is used.
If there are active batches found, it means that
we did not handle this case in a good manner.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I7ea6247d2d5a40cf4f3f31cd8b1240fed4648d62
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8857
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
After compiling SPDK with `--enable-ubsan` option, ocf tests fail with the
following error:
src/ocf/mngt/ocf_mngt_common.c:170:2: runtime error: member access within
misaligned address 0x200003800188 for type 'struct ocf_cache', which requires
64 byte alignment
The mentioned line of code is `list_for_each_entry()` macro used for iterating
lists. Forcing `struct list` alignment removes the issue.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Change-Id: I803dd962ff873679f42568e6f42fec7fed278f67
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8516
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The assertion should verify that a clone has been found. Without the
dereference, it makes no sense, as that pointer is dereferenced earlier.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I67fa17b33df6d507822a17ffc221a6d360985646
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8919
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Using `bs_allocate_and_copy_cluster()` instead of a zero-length write
makes it possible to inflate/decouple snapshots, as the writes would
fail with -EPERM, because the snapshots are marked as read-only.
Additionally, zero-length non-vector requests are now completed
immediately. It makes it consistent with the vector path (which already
does that) and allows us to use the zero-length reads as a context for
cluster copy.
Fixes#2028.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib7fdee352972ecf808833aa179820d85cfab7eed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8918
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The data buffer isn't available at the beginning.
Change-Id: Ieeb1a297ff52dfdc6cd999d04862a0cd96483650
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8932
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There were a few references to "SPDK thread context", which are no longer
relevant in the current codebase. Additionally clean up another XXX to be
clearer as to the context, and fix two minor typos.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I2aeda8aaa0edc973fdb67feccc32d792c5f7292e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8548
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Split the NVMe controller reset into pre-init and reinit stages so
that the latter begins with a call to nvme_ctrlr_process_init(),
returning -EAGAIN if the controller is not yet ready so that a poller
can call it again later.
Signed-off-by: Jonathan Teh <jonathan.teh@mayadata.io>
Change-Id: Ia182b04e438241b139109be93f3ed858fac7f3d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8486
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
BUG FIX: call nbd_bdev_hot_remove will stuck if
it is called when nbd has in-flight IOs.
nbd_bdev_hot_remove is asynchronous. It will
guarantee the stop of this nbd.
nbd hot remove test will be added later
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I0a0dfab31fafd3d61212ade53c74ad05dbbff268
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8039
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This parameter is still part of API spdk_sock_impl_opts
structure but it is not used. Keep it to support ABI
compatibility since it is located in the middle of the
structure and removing it may break socket opts initialization
or parsing.
Change-Id: Ib641ad7d965d68bc9ebb65dba531408d88cf6fa1
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8914
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Implemented nvmf code to allow transports to use ZCOPY. Note ZCOPY
has to be enabled within the individual transport layer
Signed-off-by: matthewb <matthew.burbridge@hpe.com>
Change-Id: I273b3d4ab44d882c916ac39e821505e1f4211ded
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6817
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This batch_op field is not necessary because we can
use the comp_ctx->desc->opcode to judge whether it is related
a batched task or not.
Change-Id: Id329221ccf272c4c3bb8c1b5ec08433029a9a1f8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8865
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Implement an async variant of spdk_nvme_ctrlr_reset(). This initial
implementation only allocates a context and returns it to the caller,
relying on the caller to poll the context to execute the existing
spdk_nvme_ctrlr_reset() implementation.
Wire up spdk_nvme_ctrlr_reset() to use this async variant to verify
that NVMe controller reset still works.
Signed-off-by: Jonathan Teh <jonathan.teh@mayadata.io>
Change-Id: I75d4b75dbf5897db452ee65286aef5a4eb839fca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8330
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When the QP is set to INACTIVE state, we always unmap the QP's
address to NULL, we can just check the address is valid or
not before posting completion response, so there is no need
to do the special process for the aborted AERs.
Change-Id: I1dc9893e04810e9a15e1eeb4d9405b775eab38d7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8803
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
We should use the diff bits to decide the action to CC.
Change-Id: I08cf6f65711f86328825dcd9e204ed79766c59fe
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8802
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Similar with Create IO SQ command, we should also defer the completion
of Delete IO CQ command until the IO QP is disconnected finally. However,
since the NVMf library will disconnect/free the queue pair finally, we
can't use the queue pair data structure to save the context, so define
a delete_cq context for Delete IO CQ command.
Change-Id: I005ad86c2af59540323205e9e928a2d573d5c448
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8796
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The NVMf library doesn't process Create IO SQ command, so for
this command we will use a fabric connection command instead,
however, the fabric connect command is called asynchronously,
so we need to defer the completion for Create IO SQ command after
fabric connect command is completed.
Fix issue #2043.
Change-Id: Id8c71ca69fb750a3ffb0fb26b54fa8b6d87e8ff4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8786
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We can set endpoint's controller pointer to NULL before free_ctrlr, as
controller is a session in vfio-user, while endpoint is related with
Unix Domain socket.
Change-Id: If6d26c7522804029f8c2425bc478df9e2d53d32a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7966
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
The VM may already delete all queue pairs and just leave the
socket when killing VM, so we can check number of connected
queue pairs here, if no connected queue pairs, free the
controller immediately.
It's an optimization so that we don't need to loop all
queue pairs below.
Change-Id: Ia1be868ce2c74ce6953b1f44a81b51d605e642f0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7625
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
In rte_power all that enabling/disabling turbo does is allows
for additional entry in frequency array for particular core.
Instead of exposing this API through spdk governor,
just make sure that dpdk_governor enables turbo by default.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I994b326a57c01889bccea26635753c56637259d2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8789
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Those calls went unused, in favor or much more useful
up/down/min/max variants.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I432896196a1a6edfc6799c8658df49567f73d457
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8788
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The spdk_governor_capabilities added lots of capabilities
which went unused, suposedly to mark which callbacks
a governor had implemented.
This made little sense, since capabilities are per core and
not implmenting this APIs made little sense.
With this patch spdk_governor_capabilities is brought in
line with rte_power_core_capabilities.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I85296fce2999cb41957162b63ee13d86a0be919f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8787
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Remove _spdk_scheduler_disable() to avoid confusion as there is
no spdk_scheduler_enable function. Since spdk_scheduler_disable
sets scheduler period to 0, use spdk_scheduler_period_set(0) instead.
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I4f1390a635f80e8b92775aa4be2e37f5b95467f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7448
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
The size of the core_info->threads will always be equal
to reactor thread_count, there is no need to count it
separately.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Icfa84606bd29d7766738eb2053362a20d78c23af
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8733
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Replaced multiple functions calls to _reactors_scheduler_update_core_mode(),
with a for loop.
Since changing reactor to interrupt mode is rare operation, most of the
time we ended up with unnecessarily long callstack.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1c7858653be9e2256943c1da5a27001be41682b6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8714
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
There is only one g_scheduling_reactor (main core), the is_scheduling
flag for it is used to block starting new gather_metrics before
previous one is finished.
Meanwhile is_scheduling flag on other reactors was used to block
destroying lw_threads while scheduling happens. It was only needed
because scheduler interacted with the same lw_thread pointers as
each reactor. Previous patch removed this dependency, instead
spdk_thread ids is used. If an spdk_thread is destroyed,
while scheduling _threads_reschedule_thread() handles it.
It is no longer required to block destruction of lw_threads
based on this flag.
Instead of using the main core reactor flag, a g_scheduling_in_progress
is introduced.
Removed _spdk_get_scheduling_reactor() and instead shared the value
of g_scheduling_in_progress between reactor.c and app.c.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ica57326a552477add522174cc3e96b3bab918350
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8732
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Removing dependency on schedulers to directly modify
lw_thread field structures will help making schedulers
truly plugable.
Instead of using lw_thread, new structure is created
that holds copy of stats and refer to the thread by
spdk_thread id.
As an added benefit of not changing lw_thread directly,
we won't run into issue of balancing function changing it
while other reactor removes and frees it.
In the future an API will be added for scheduler to call
in order to move the thread directly. Rather than for
event framework to rely on modified core_info/thread_info
structure.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I8f85bb8dc080fd13b78b07ee9ef8e8be7051659b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8411
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Limit of 50% to mark thread as active or idle
didn't allow for multiple active threads to be placed
on single core.
Lowering the limit to 20% will allow that and force
more threads to be actively balanced.
Removing the limit was considered, but that would
cause too much thread moves when a thread with load
in single digits increased briefly. Either
due to actually doing any operation or placement
of other threads on the same core.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5f8c9ff15461feb71a2d82853cfe048e412ba921
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8289
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Fixes#1933
When decoupling parent the updated parent_id was
not persisted to the blob if it was a snapshot.
Due to having md_ro set to true, blob_set_xattr()
failed.
Later on the incorrect parent_id could cause troubles
like in the github issue, when deleting that snapshot.
This patch adds return code check for blob_set_xattr
and forces md_ro to false during blob md sync.
Since some of code paths are shared between decouple,
inflate and clone operations, the final callback for them
is doing revert of the original md_ro.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If017455f72e4d809fe533d9f986e5ae6bb8e2035
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8420
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
After this patch, nbd will no longer receive any requests if
NBD_CMD_DISC is received. But it will handle the requests
already received.
Previously we called spdk_bdev_abort() for NBD_CMD_DISC and
it will reply to the rest requests in the channel of this bdev.
But there should be no reply to NBD_CMD_DISC. Hence we silently
discards requests after NBD_CMD_DISC.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I551dea1887cb2d108ed5e0d621309f62cfaabbb9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8038
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
The specification says:
"A host may replace its reservation key without regard to its registration
status or current reservation key value by setting the Ignore Existing Key
(IEKEY) bit to '1' in the Reservation Register command."
So for this case we treat it as a new registrant, also add UT to cover
the added cases.
Change-Id: I5990f15da36706063a35565d110ed4c6eb30a3f3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Since we are using NVMf fabric library to emulate a PCIe based SSD via
vfio-user target, so there maybe some commands that are related with
PCIe SSD only, such as set/get features with interrupt coalescing
and Interrupt Mask Set/Interrupt Mask Clear registers. Even the
NVMf library doesn't support that, it is not a fatal error to Host
NVMe driver, so here we use the info log instead of error log for
this case so that to avoid noise logs.
Fix#2036.
Change-Id: I8283bcde5779080835d6ab827dbd852b3816176f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8766
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
The NVMf library will not implement interrupt coalescing and ignore them, but we can
report this via get_features.
Some OS may check the result from get_features so that it will not send set_features
for interrupt coalescing.
Change-Id: I7466bcbc0ea5b3b067751cdf1979b2e0681c0043
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8765
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Use the macros for red black tree provided by Free BSD to speed up
io_channel lookup.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: Icfd87a8a2f60c082a17b8c501a03faba83edb762
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7895
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
In current implementation, io_channel list will be accessed by
spdk_for_each_channel() and spdk_get_io_channel(). We will try to
accelerate spdk_get_io_channel() in the following change "thread: speed
up io_channel lookup by using rbtree" by changing io_channel from list
into RB tree.
To make it cleaner, we prefer to use ch->dev as the key for the
io_channel RB tree instead of ch->dev->io_device. This patch makes
spdk_for_each_channel() use the i->dev to find the expected io_channel.
And the io_device in structure spdk_io_channel_iter is not needed in
spdk_for_each_channel_continue() but we keep it for the compatibility of
spdk_io_channel_iter_get_io_device().
After this patch, spdk_for_each_channel() has to access both io_device
list and io_channel list, and spdk_for_each_channel_continue() still has
to access only io_channel list.
Both io_device list and io_channel list will become RB tree. Hence
performance degradation will be negligible. spdk_for_each_channel() is
not so performance critical than spdk_get_io_channel().
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: Idd486b0aa1b63b57ede90527dcd1631cbb008a1a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8749
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Use the macros for red black tree provided by Free BSD to speed up
io_device lookup.
This change was reverted once but is re-submitted because the critical
issue was fixed by the preceding patches.
In addition to the fix, add unit tests to verify the fix explicitly.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I97ed77f6e5ceacdf2593c9751b55a7d0b92c0b35
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8525
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously we used a counter of our own to make sure all batch
elements plus the batch itself were done before we freed the batch.
This was due to some observations early on that the batch desc
could complete before the individual elements and a lack of clarity
as to whether this was due to the simulator or the fact that
we poll on completions and could therefore "see" completions in
a different order at that time (we were using bit arrays to poll).
Now we use an ordered (in time) list to poll locations so if we
instead put the elements on the list first and then the batch desc
itself we are assured to always "see" them in order provided the
underlying device meets spec which there's no reason to assume it
does not.
This simplifies things a bit at the same time and still assures
that we call list calbacks in order and then the batch callback
without "special" handling.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I4d9e3997786f2116ce6515682b8117799c645f51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8397
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Left over from when the field was a void *
Signed-off-by: paul Luse <paul.e.luse@intel.com>
Change-Id: If9dfe2878f6afd6137d6d8efec90e310baf417f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8280
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
For clarity, this element was added when crc+copy API was
added so might as well have all the CRC related functions use
it instead of `dst` to avoid confusion.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic43adbd0df51c1a349847701ef318f452306d0b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8229
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
We've always used `dst` as the destination for CRC result, with
the recent addition of a copy_CRC API `dst` was needed for the
copy destination and `crc_dst` was used for the CRC. This
patch just makes all the CRC functions use `crc_dst` to avoid
confusion. The accel_task struct also has a `crc_dst1 field,
that will be used consistently in the next patch.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia84c4a9e7940c6ebd31410c12272bd22b0c6dd29
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8228
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Support in accel_perf is coming up in a later patch.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I63a1d3b9b1a3254fdca78e27c473b9b3468c93c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8202
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Allows for better performance by not hitting the same portal
address with every submission.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I1ec8eae6f3acec9e98161029cd5406ec08603aa6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8190
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
We find a few files to get the size of a member of a struct. How to
do it is a little complex. So add a macro to do it will be helpful
to read the current code and develop new features.
lib/dif had used member_size() internally but Linux use sizeof_member()
as the macro. Besides, SPDK have used upper case letters for similar
macros, SPDK_CONTAINEROF() and SPDK_COUNTOF(). Hence spdk_member_size()
may be good but propose SPDK_SIZEOF_MEMBER() as the macro.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2179c845a3b75fb71aa039075cc4dfd30617b898
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8738
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
In this case, user could specify the core number like:
-m [0,1,10] besides the core mask like -m 0xF
Change-Id: I48621c5a84e5436deae07101591d0ef85b1e129e
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8746
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Now nbd stop will not be processed if this nbd is not fully started.
However, it will remember the stop command and do it asychronously
until nbd is fully started.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: Iea5ba143332c7d3fd85f816726788f05e7ae3c8d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8037
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
If there are completed asynchronous events that have not been notified
to the user, free them during controller shutdown to avoid memory leaks.
It can happen if an event completes before user has a chance to execute
`spdk_nvme_ctrlr_process_admin_completions()`.
Fixes#2032.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie608bf9100342f8dfd709e070326f67335d27fed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8740
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
NVMe bdev module manages ANA log page itself now. So NVMe driver
should disable managing ANA log page.
Add a new option disable_read_ana_log_page to struct spdk_nvme_ctrlr_opts.
Then NVMe bdev module enables it when calling spdk_nvme_connect_async().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id5249efe90a4d50763c3a7eaa1eb9572f60fbc8c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This fix is as same as for NVMe bdev module.
If a ANA log page has two or more ANA group descriptors, the second
or later of ANA group descriptors will not be 8-bytes aligned.
Then runtime error would occur as follows:
runtime error: member access within misaligned address 0x612000000074
for type 'const struct spdk_nvme_ana_group_descriptor', which requires
8 byte alignment
nvmf_get_ana_log_page() in lib/nvmf/ctrlr.c creates a ANA log page
data and processes 8 bytes alignment correctly because we got the
same runtime error before. However, lib/nvme had been missed at that
time.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idaa610544dc5cb659c387fcd38a2b4b97cbd06e5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8398
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The next patch will add an new controller option, disable_read_ana_log_page.
Initializing ns->ana_state to optimized before reading ANA log page
will simplify the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I34e56a2b454e4c02e1899f972e0ad675f2ebe2a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8312
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This patch is used to add the support for users to configure
use kernel or userspace idxd library.
Change-Id: Ie159b897bc9595894ad8f333168efaea6c2a3d78
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7332
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch is used to add the kernel idxd support.
Without this patch, we can use userspace idxd driver
under accel_engine library (module/accel/idxd/accel_engine).
With this patch, we can also kernel idxd driver under the
accel_engine library.
Our approach is implementing a wrapper library to use IDXD
device by leveraging the kernel DSA driver in SPDK idxd library
(lib/idxd).
Then users can leverage the RPC later to configure how to
use the DSA device by user space driver or kernel driver.
In this patch, our approach is to use the idxd-config library
to export the WQs (Working Queues) exported by the kernel.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I3a25a4fe0327bd626bf6883dfbe54437d3209e51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7331
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
There is no need to map the PRP/SGL list RW since this memory is never written
to. In fact, SeaBIOS might submit a request where the PRP list resides on
read-only memory, so attempting to map it RW can break things.
Change-Id: I7e4e90b1fa7e33e81b8d5cd8dcb9568c038938ec
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7288
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Nvmf/vfio-user uses this API to map NVMe command sent from
VM from Guest Physical Address to Host Virtual Address, so
now we moved this API from the nvme library to nvmf/vfio-user
as an internal API.
UT code will be added back in coming patch.
Change-Id: I54817fc9811ccd9ddd97b3aa6762a2fce4bbdda6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Just remove this function pointer and add a new one,i.e.,
dump_sw_error.
Because this function pointer is only used to
read a sw err info. We can hide it in the detailed
idxd implemenation.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I42fe2220dae85df307b5af64e37acfd7f748915b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8707
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Latest nvme-cli (>= 1.13) fails to issue commands towards SPDK's cuse
ctrl device, e.g.:
$ nvme get-feature /dev/spdk/nvme0 -f 1 -s 1 -l 100
nvme_cuse.c: 654:cuse_ctrlr_ioctl: *ERROR*: Unsupported IOCTL 0x4E40.
get-namespace-id: Invalid argument
The reason is because nvme-cli now also sends NVME_IOCTL_ID to the
target device to determine if it's indeed a controller or a ns. In
case kernel returns ENOTTY then nvme-cli considers the device to be
a controller. Since cuse_ctrlr_ioctl() returns EINVAL in such a case
the nvme-cli fails.
To avoid this simply replace EINVAL with ENOTTY for the ioctls that
may be not supported by ctrl or ns device.
nvme-cli commit in question:
fa2b91da74
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: I29003864bc2a5c1a8906d6d01beba3d6f4e31b0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8531
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Now that we've deprecated the RPCs for a release, we can remove the whole
library.
Change-Id: I0f1a357fcfb3404efac39aa021928841c2f22ff1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Bit 1 in the CMIC of the Identify Controller Data Structure specifies
if the NVM subsystem may have multiple controllers or not.
However, multi_host indicated a particular use case such that the NVM
subsystem is used by multiple hosts.
multi_ctrlr will be more appropriate.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0246096a5cc44721aeff3ff6f96473a2abe11964
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8719
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For cases where cpumask for a thread was not set,
all bits were turned on for whole length of cpuset structure.
This resulted in JSON RPC reponses with way too long cpumask
for what is useful.
Now the response is limited to the applications core mask,
as that makes sense so long as number of cores cannot change.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib5cf271d3b219ba679f1abe498516796693a87dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8288
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The round-robin logic is no longer necessary to spread
the threads around the cores. Starting from core other
than first is even counter-productive to bunching up
threads.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5fcee2bacc2d0b4af26336caf381ed954814d731
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8085
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Before this patch _find_optimal_core() returned
1) any core that could fit the thread
2) if current core was over the limit, the least busy core
3) current core if no better candidate was found
Combined with _get_next_target_core() round-robining
the first core to consider, resulted in threads being
unnecessarily spread over the cores.
This patch only places threads on lower lcore id,
or when current core is over limit then any core that can fit it.
Next patch will remove round-robin logic to always start with
lowest lcore id.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I54e373d3ca02a5633607d22978305baa1142f8bd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8112
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Before this patch the idle time of a core was increased
by the amount of busy time of thread that was moved out.
No assumption was made as to how the remaining threads,
would behave during next scheduling period.
This approach is fine, as over multiple scheduling periods
we'd arrive at a point where threads could do no more work
or all cores would be busy.
Yet this requires multiple scheduling periods to sort out
the threads.
Later in the series core_load will be used to determine,
when to start moving threads out of the core. So changing
this assumption will allow for faster responses to thread load,
at cost of sometimes spreading threads too much briefly.
With this patch, we are assuming that threads remaining
on the core will do proportionally the same amount of work
during next scheduling period.
See an example illustrating the change:
Before moving Thread1
Thread1 Busy 80 Idle 20 Load 80%
Thread2 Busy 60 Idle 40 Load 60%
Core Busy 140 Idle 60 Load 70%
After moving Thread1 out (original code)
Core Busy 140-80=60 Idle 60+80=140 Load 30%
After moving Thread1 out (this patch)
Core Busy 140-80=60 Idle 60-20=40 Load 60%
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1f347983449b2fde476dab360c4df689965ca3ea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8279
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
In cases when all cores are already doing too much work
to fit a thread, active threads should still be balanced
over all cores.
When current core is overloaded, place the thread
on another that is less busy.
The core limit is set to 95% to catch only ones that are
fully busy.
Decreasing that value would make spreading out the threads
move aggressive.
Changed thread load in one of the unit tests to reflect the
95% limit.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3b3bc5f7fbd22725441fa811d61446950000cc46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8113
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We had not held mutex while removing bdev name or alias from bdev
name tree for most cases. Fix these in this patch.
spdk_bdev_unregister() already holds g_bdev_mgr.mutex when removing
name, and so we do not need to change it.
spdk_bdev_close() had not held g_bdev_mgr.mutex. What we want to lock
is only when removing name from name tree, that is, calling
bdev_name_del() in bdev_unregister_unsafe(). However, we need to
keep hierarchical lock ordering. Hence get and free g_bdev_mgr.mutex
outside of bdev->internal.mutex.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4e2c8604e27c8603725efa9bc0bee2013eccb2ac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8527
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We had not held mutex when adding bdev name to global bdev name tree
in bdev_name_add(). Fix these in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I33813638f11da85263ec0c8849e566d247a45d43
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8524
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
If the specified name already exists in the global bdev name tree,
RB_INSERT() returns a pointer to it. Hence we do not have to call
bdev_get_by_name() when using bdev_name_add().
Hence update bdev_name_add() to return -EEXIST if RB_INSERT() returns
a non-NULL pointer, and then remove the bdev_get_by_name() calls.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2d4554ef7e5286270417def64b638b803eecfca2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8573
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The complier complains:
/usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’
output between 4 and 19 bytes into a destination of size 7
71 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
So we change the array size from 7 to 20, so it is enough to put 19 bytes
in.
Fixes #issue 2014
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I97dfbf9707d0e275382324fa7352b7a212b2aeb5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8694
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
In the nightly test, the compiler complains:
trace.c: In function ‘_spdk_trace_record’:
00:07:12.523 trace.c:144:53: error: ‘argval’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
00:07:12.523 memcpy(&buffer->data[offset], (uint8_t *)argval + argoff,
00:07:12.523 ^
00:07:12.523 trace.c:145:36: error: ‘arglen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
00:07:12.523 spdk_min(curlen, arglen - argoff));
And this patch is provided to fix such issue.
Fixes #issue 2034
Change-Id: I4c78d63bdc6a7d166990ae1d18a6abf183efdee1
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8709
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Chengqiang Meng <chengqiangx.meng@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
If NGUID is not specified with nvmf_subsystem_add_ns json-rpc request
then it is possible to expose the same NGUID as bdev nvme module
attached.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ie0ed7189e55a5abd6bc0904fc356d26f62b50549
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8628
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch adds the ability to chain multiple trace entries together to
extend the size of the argument buffer. This means that a tracepoint is
no longer limited to the size of a single entry, so it can have any
number of arguments, and their size is also not constrained to a single
entry.
Some limitations are still there: a tracepoint can have up to 5
arguments and strings are limited to 255 bytes. These constraints stem
from the definitions of tracepoint structures, which could be easily
modified to extend the limits if needed.
To record a tracepoint requiring larger buffer, aside from reserving
`spdk_trace_entry` structure, a series of `spdk_trace_entry_buffer`
structures are allocated too. Each of them acts as a buffer for the
arguments. To allow trace tools to treat the buffer structures
similarly to regular entries, they also have the `tpoint_id` and `tsc`
fields. The id is always assigned to `SPDK_TRACE_MAX_TPOINT_ID` to make
sure that a buffer is never mistaken for an entry, while the value of
`tsc` is always shared with the initial entry. This also provides a way
for the trace tools to verify if an entry is part of a chained buffer.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I51ceea6b6e57df95d4b8bd797f04edbc4936c180
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8405
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
It makes the code more readable. Additionally, to avoid partial updates
to an entry, the check for the number of arguments was moved before it's
filled in.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9ba01b1bcdc29267571badaebd4a9b34ffd7f728
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8404
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
It allows us to get rid of the `next_circual_entry` variable and will
make it easier to retrieve multiple trace entries, which will be needed
in subsequent patches.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4666c9da518c2ac0b376e10aa73d1c58cff91f13
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8403
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Returning an error from this function is not useful - there
is nothing the caller can do with that information. So
change the return value to void. Also add ERRLOG and assert
if a transport actually returns a non-zero status, to
force the transport implementer (which must be an out-of-tree
transport) to make changes as necessary.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I402afec045265db178af821d25b99a6dbe066eab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8659
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It is not uncommon for delete_io_qpair to fail, for
example when a controller is hot removed. So even
if SQ or CQ deletion fails, continue with freeing
resources and report success back up the stack.
There is really nothing the application can do to
account for this failing anyways.
Upcoming patches will add additional checks to
ensure failing delete_io_qpair status never gets
propagated to the caller.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iac007c1eba30f7a8c4936b3ffb6c837f28ee12ae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8658
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Due to the recent changes for non block size multiples write I/O,
the data digest feature was degraded. If Linux iSCSI host enables
data digest and tries to detect LU from SPDK iSCSI target, data
mismatch error is detected and the connection is disconnected
unexpectedly.
The cause was that pdu->data_valid_bytes was not set for non-write
response PDUs which have a data segment.
iscsi_pdu_calc_data_digest() has been used only for non-write response
PDUs. Hence we did not need to change iscsi_pdu_calc_data_digest().
Restore the original implementation of iscsi_pdu_calc_data_digest().
Additionally, to avoid future degradation, rename the related
functions to iscsi_pdu_calc_partial_data_digest() and
iscsi_pdu_calc_partial_data_digest_done(), and add comments for
clarification.
This fix was verified by the reporter.
Fixes#2029.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6babcd1b56e79d3fa3cd26b2dfaad87a52788e63
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8635
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Fixes#2022
If queued aborts are present when trying to fail a ctrlr
using spdk_nvme_ctrlr_fail(), then the abort command completion
will attempt to retry one of the queued aborts.
This eventually leads to a segfault that can be avoided by not
retrying any queued aborts.
Change-Id: I897dcb8809e16af8bdd39d4381ab531e1cc29822
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8585
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The vfio-user target emulated NVMe device is treated as
PCIe NVMe SSD in the Guest VM, so when doing controller
reset or shutdown, we should abort the AERs which in the
NVMf library.
Users may switch kernel NVMe driver to SPDK NVMe driver
in the VM, without this fix, we will got "AERL exceeded"
response very frequently, because the AERs submitted by
previous driver will never be aborted in runtime.
Change-Id: I0222ed509629ccb0e98217414dd9043857105686
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8558
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When users remove kernel NVMe driver in the VM, after 120 seconds,
SPDK NVMf target will disconnect ADMIN queue pair due to association
timer timeout, and for vfio-user transport, the ADMIN queue pair
connection is associated with the socket connection, so when probing
the NVMe controller again, because there is no active ADMIN connection
for fabric register R/W commands, it will cause segment fault.
Here we set the association timeout value to 0 for vfio-user transport,
so that the ADMIN connection will not be disconnected when shutdown the
controller, the ADMIN queue pair will be disconnected when the socket
connection breaks.
Change-Id: I3613169229bae384405889653e50f581d30d7c07
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8557
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The NVMf library will set cdw0 based on specific command,
so we use it directly in vfio-user, otherwise, some NVMe
commands such as AER can't work.
Fix issue #2016.
Change-Id: Ie1a80a92c0856b61822ee51ce5d8faaaf1d463de
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8556
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Also fix one incorrect print log.
Change-Id: I3254baf4bbff4acfc0ef43f628d025931e8589ea
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8555
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
These macros are only valid for Fabric transports.
Change-Id: Ia456eebdcdab28e81226c1b3a7211fcb41b5e481
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8554
Community-CI: Mellanox Build Bot
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_bdev_register() and spdk_bdev_add_alias() had not held mutex when
adding bdev name or alias to global bdev name tree. This bug caused unexpected
error when traversing global bdev name tree.
The next patch will fix the bug. This patch is a preparation for the fix.
spdk_bdev_get_by_name() had not held mutex while traversing bdev
name tree. The major callers to spdk_bdev_get_by_name() had held mutex
when calling it. However, this was not clear.
Factor out the internal of spdk_bdev_get_by_name() into a helper
function bdev_get_by_name() and then change spdk_bdev_get_by_name()
to lock and unlock when calling bdev_get_by_name().
Then replace spdk_bdev_get_by_name() call in spdk_bdev_alias_add() and
bdev_register() by bdev_get_by_name() call.
spdk_bdev_get_by_name() call in spdk_bdev_examine() is not changed.
This is called only from JSON RPC and not related with the bug. So
we want to fix only unlocked access to global bdev name tree.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I25f07694e569eec10dba6c3c8543f6ce77412fe8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8523
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
It is better to not fail connect commands when a subsystem
is not ready. The host will not be expecting that and will
typically treat it as a catastrophic failure (i.e. it won't
retry the connect).
So instead when this situation occurs, start a poller for
the connect request. We will continue to retry processing
it until the subsystem is ready to handle it.
Fixes issue #1985.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id8835df8f0edf1e889fdd7e754e261c2a880cbb6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8571
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It is possible for a controller to get added to the
subsystem before its admin_qpair has been assigned.
We need to account for that when traversing the subsystem's
ctrlr list when determining ns and ana_changes that need
to be reported for the ctrlr.
Found while doing stress testing with connects and
subsystem ns add/remove.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie54dc6ac202faeaeace054e6599f2dea2f30211e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8570
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
After correct trstring initialization, it is
overwritten with trstring value of the current
probe ctx. That leads to a problem when initiator
connects to a sbusystem with listeners of different
transport types (e.g. TCP and RDMA). If probe_ctx has
TCP type, than discovery probe initialized probe trid
with trtype=RDMA and trstring=TCP. As results, SPDK
creates TCP controller with trtype=RDMA and we hit
assert in nvme_tcp_qpair function.
Change-Id: I9355450c40c58fa55b016220703f6f7ae36b2571
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8464
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Pollers are supposed to return SPDK_POLLER_{BUSY,IDLE}.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I92bd184aaba9e3efb730b68a6024ebc9757ffd8b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8559
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If we continuous setup and teardown cuse session, It will teardown
uninitialized cuse session and cause segment fault, New function
cuse_session_create will do the session create operation and under
g_cuse_mtx to avoid this issue.
Signed-off-by: Weifeng Su <suweifeng1@huawei.com>
Change-Id: I2b32e81c0990ede00eea6d4ed3a7e44d534d4df3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8231
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This update will allow us to use spdk_nvme_detach_async() and
spdk_nvme_detach_poll_async() easier to aggregate multiple detachments.
Previously, we could do:
spdk_nvme_detach_async()
spdk_nvme_detach_async()
spdk_nvme_detach_async()
and then started doing spdk_nvme_detach_poll_async().
Hence aggregating multiple detachments is already supported.
After this patch, the following sequence is possible:
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = 0
The actual changes is to remove the variable polling_started from
struct spdk_nvme_detach_ctx because it is not necessary anymore.
Clarify this change via updating the header file and CHANGELOG.
Verify this change by unit test.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iebdf6c27c5304a2097b7084c315ccc99634ffa1e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8468
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add a new function spdk_nvme_detach_poll() to simplify a common
use case to continue polling until all detachments complete.
Then use the function for the common use case throughout.
Besides, usage by simple_copy application was not correct, and
fix it in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic14711cd8478bf221c0fe375301e77b395b37f26
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8509
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The function comment was referring to a non-existent caller; instead, expand
with a little more detail on the path taken for new QPs.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I42478194f3cfc18a6ff6c434964630ac42866f1d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8534
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When the property is 8 bytes but the host only requested
4, we need to mask and only return the bytes requested
by the host. Wait to do the DEBUGLOG until after
that has happened.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8f476a47e9fd07bf652fd64f3b1c17d650374167
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8506
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Interpret bare --with-dpdk opt as user's request to find installed
(provided by the distro) DPDK's libs|include files and use them during
the build.
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: I9da99671b95af0121194b3a6d53636b0ded71f1b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8348
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
Reviewed-by: <tomasz.rochumski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Was using "dst" in some cases and "crc_dst" in others for crc32c
related calls. Update them to always use crc_dst
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Icf200f1734c64c29881f23b02b8d12bad81b3ca0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8186
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Recent work identified race conditions having to do with the
dynamic flow control mechanism for the idxd engine. In order
to both address the issue and simplify the code a new scheme
is now in place. Essentially every DSA device will be allowed
to accomodate 8 channels and each channel will get a fixed 1/8
the number of work queue entries regardless of how many
channels there are. Assignment of channels to devices is round
robin and if/when no more channels can be accommodated the get
channel request will fail.
The performance tests also revealed another issue that was
masked before, it's a one-line so is in this patch for convenience.
In the idxd poller we limit the number of completions allowed
during one run to avoid the poller thread from starving other
threads since as operations complete on this thread they are
immediately replaced up to the limit for the channel.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I913e809a934b562feb495815a9b9c605d622285c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8171
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In spdk_idxd_configure_chan(), if memory allocation fails in
TAILQ_FOREACH() {} code range, we will goto err_user_comp and
err_user_desc tag, in which we donot free chan->completions
and confused batch->user_completions with chan->completions.
Memleak problem and double free problem may occurs.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I0e588a35184d97cab0ea6b6c013ca8b3342f940a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8432
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
When a qpair is destroyed and the qpair is the last,
_nvmf_ctrlr_free_from_qpair() (in lib/nvmf/nvmf.c) sends two messages,
one is for _nvmf_ctrlr_destruct() and another is for
_nvmf_transport_qpair_fini().
We do not know which of two completes earlier.
_nvmf_ctrlr_destruct() frees the qpair->ctrlr in the end.
On the other hand, _nvmf_ctrlr_free_from_qpair() calls
spdk_nvmf_poll_group_remove() in the end, and spdk_nvmf_poll_group_remove()
accesses the qpair->ctrlr to free queued requests to the qpair.
Before one recent change, spdk_nvmf_poll_group_remove() had been called
before _nvmf_ctrlr_free_from_qpair() was called.
Hence extrace the operation to free queued requests from
spdk_nvmf_poll_group_remove() and inline it into _nvmf_qpair_destroy().
Fixes one showstopper error to investigate the issue reported in #1819.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I29c43ff7b289fc77a5de9c33e0266301c412e208
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Previously core load was only considered for main lcore.
Other cores were used based on cpumask only.
Once an active thread was placed on core it remained there
until idle. If _get_next_target_core() looped around,
the core might receive another active thread.
This patch makes the core load matter for placement of any thread.
As of this patch if no core can fit a thread it will remain there.
Later in the series least busy core will be used to balance
threads when every core is already busy.
Modified the functional test that depended on always selecting
consecutive core, even if 'current' one fit the bill.
Later in the series the round robin logic for core selection
is removed all together.
Fixed typo in test while here.
Note: _can_core_fit_thread() intentionally does not check
core->interrupt_mode and uses tsc. That flag is only updated
at the end of balancing right now. Meanwhile tsc is updated
one first thread moved to the core, so it is no longer
considered in interrupt mode.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I95f58c94e3f5ae8a468723d1dd6e53b0e417dcc3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8069
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Idle threads are always moved to main core, there are no
other considations. Doing it as separate first pass,
allows to have the core stats be up to date for second
pass for active threads.
Core load stats will be used later in the series to determine
optimal target core for an active thread.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6a9bc11b86e954e461f7badebf3a6e4d1718f63c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8067
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will be needed when doing multiple passes over
all threads. See next patch.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I4e9c749d69314fc268cbcb9334862392100b651e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8066
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When picking a path to go down with a thread,
conditions unnecessarily piled up.
Instead do it either of two ways:
- move idle threads to main core
- find best core for active threads and move them there
There is no need to worry about cpumask of the thread,
since _find_optimal_core() will always return a core
within the cpumask.
If the found core is the same one as the current,
_move_thread() won't perform any action.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0f4782766c15c86b5db0c970cfc9547058845b2a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8065
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Refactors logic for finding the optimal core for a thread
to single function.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ifc2b09acb6f698640ce9602fec4f567eb32b79fa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6732
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Refactor all thread moves and core stats updates to single function.
At this time in series only tsc of main core was modified and
only idle tsc of main core was used. Main core would be either
the destination core or the source core. In both cases, the idle
time for main core had to be updated.
This patch generalizes this logic to always move the execution
time from source core to destination core.
As a byproduct cores besides main core have the stats updated,
which will be useful later in the series. Once core load will
be the deciding factor for choosing a core.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I57564e8b2632f919869d74e8f10b01fb3dda3be9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6658
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Now that the trace library can handle multiple arguments, there's no
point in passing 0 for tracepoints that don't have any arguments. This
patch removes all such instances. It allows us to to verify that
`spdk_trace_record()` was issued with the exact number of arguments as
specified in the definition of the tracepoint.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Idbdb6f5111bd6175e145a12c1f0c095b62d744a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8125
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Replaced calls to `spdk_trace_record_tsc(spdk_get_ticks(), ...)` with
`spdk_trace_record(...)`, which does the same thing but is more consise.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib96e0bc0225490dadf857e1ddd2a3ecbf71e98c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Now that each tracepoint can have more than one argument, we cannot pad
the missing ones, as it would take too much space. Therefore, we put
them at the end of a line and simply skip the missing ones.
Additionally, since empty arguments are no longer padded, this patch
stops recording arguments with names consisting of an empty string
(containing just '\0').
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5199a3219a31d6afd3178324a4f48563b84e6149
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7958
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Now that `spdk_trace_record` receives variadic arguments, we no longer
have to pass strings as uint64_t, but can pass them directly as
pointers. That also means that the recorded strings can be longer than
8B (up to 40B).
This patch changes the blobfs code to pass the filenames as strings and
gets rid of the code that converted them to uint64_t.
Additionally, the maximum length of string arguments printed by
`app/trace/trace` has been extended to 16 and they're also padded to 16
characters, to better align with other argument types.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibe94452bf1b27eba2b15ca8608d0c3b55c2db360
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7957
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch allows tracepoint to record a variable number of arugments.
An additional function has been added,
`spdk_trace_register_description_ext()`, which allows the user to
register definitions for tracepoints specifying all the arugments that
they accept. Users can also call `spdk_trace_register_description()` to
register tpoints with a single argument (or none).
Currently, all of the tracepoint arguments need to be passed as
uint64_t.
The trace record functions use variable arguments and rely on tracepoint
description to know the order and the format of the arguments passed.
That means that the user needs to take care that they're always in sync.
Moreover, this patch extends the tracepoint entry size from 32B to 64B,
meaning that there are 40B that can be utilized for passing arguments,
which in turn means that there can be up to 5 arguments per tracepoint.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9993eabb2663078052439320e6d2f6ae607a47ff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7956
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Move the definition of structure spdk_io_channel into
lib/thread/thread_internal.h, so we don't have to update SO_VER for
other libraries in future when we need to change the internal details on
the structure.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I3d2ca7a8737972e0b33ce92e464da42c48f89dec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8189
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If we're not a DEBUG build, vfu_setup_log() was effectively forcing a
libvfio-user logging level of LOG_ERR. Instead, let the log handler decide what
to report, so we can respect the SPDK levels.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ib3ad62589f495a377885f7deabaf02b428e83d30
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8452
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Commit b7cc4dd added support multiple AERs, but didn't
remove the code that hardcodes aerl=0 for non-discovery
controllers. So even though the target now supports
multiple AERs, we never indicate that for non-discovery
controllers.
The spec also recommends that implementations support
a minimum of 4 AERs - so the current behavior is not
recommended.
It seems that at least on Windows (when testing with
vfio-user transport) we see the limit get exceeded
which results in ERRLOGs. Let's keep the ERRLOG there
for now, assuming that once we report we support 4
AERs that Windows won't try to send more than that.
Fixes issue #2000.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6a07a6f37aaa6e531ae2cf1e1c46da036b00785b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8488
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We cannot control what the host may send to the target.
For example, we have empirical evidence that Windows
will send vendor-specific IDs for features and log pages
(when testing with the vfio-user target transport).
So let's change the ERRLOGs in these cases to DEBUGLOGs.
Fixes issues #2004, #2007, #2008.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8d5b92fc5e33d698af246f2f1c34f7cf51e6488a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8487
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
1. Update with latest vfio-user specification changes.
2. The new libvfio-user will not expose dma_sg_t data structure
any more, SPDK should use pointer and allocate memory for it.
Change-Id: I619b0c0828cbe3b050c628bff4c4ce7ee840510f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8377
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Community-CI: Mellanox Build Bot
When creating queue pairs, the original code uses a stack
queue variable and copy it to queue pair in insert_queue
function, the coming changes in libvfio-user doesn't expose
dma_sg_t data structure any more, we need to change it to
a pointer and allocate memory for it, so here we eliminate
insert_queue function as a preparation.
Change-Id: Iee94029d24bc8882ec169665e229e6cbc11564c0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8376
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Thread is private data of spdk_io_channel, bdev should use
spdk_io_channel_get_thread() to access it. This prepares for the upcoming
change to make the definition of struct spdk_io_channel private.
Change-Id: I643c8d677e22f6d8dde2faf91bb2711d3f5d81b8
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8426
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
in-capsule data over 4KiB when using the Linux initiator.
This is fixed in the latest kernel. See
https://lists.infradead.org/pipermail/linux-nvme/2021-May/025641.htmlFixes#1823
Change-Id: Ie383ea774ee31ef8fe255119095b21603483c33f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8424
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In blob_load_cpl(), spdk_realloc() is called to realloc
memory of ctx->pages. If spdk_realloc() return NULL,
the ctx->pages is set to NULL without being freed,
and then a memleak problem occurs.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: Idf21b690e89beab0245ba57a5de66a4f506d54fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8308
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
If the transport returns error when polling for
completions, it gets to a uint32_t and we end up
trying to resubmit all of the requests that are
currently queued. But that's not correct - if
the transport returns an error we shouldn't be
trying to resubmit requests at all.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9198e3e2d71875cc1e46e0ac928338bb983487f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8395
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
nvmf_ctrlr_get_log_page used req->data to store the log page result.
While the req->data only contains the first iov, if req->iovcnt is
larger than 1, the req->data may not hold the complete log page; and
even worse, the log page result may be written to invalid address and
cause memory corruption.
Change-Id: Ie6415a6bd2327419fe4b32f21ac814fd827c9e95
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7970
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Currently, the SPDK "core_mask" environment option only supports setting either
"-l" or "-c". Allow applications to specify more complicated options by sniffing
for a leading "-", and passing that string through unchanged. This allows, for
example, --lcores to be used as described here:
https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I38cc54bfcd356f3176cde7848e592525f9231e3d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7933
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The common bdev layer will split large WRITE ZEROES ranges into
multiple children requests based on the backend device's setting,
it will try to split up to 8 children requests at a time to avoid
flood requests.
Also add UT to cover different cases.
Change-Id: Id9505fbe1c297412ef97b1f73587b22bc43f770e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7875
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This mutex is not used anywhere. After removing mutex from struct
spdk_scsi_globals, struct spdk_scsi_globals is empty. Hence then
remove struct spdk_scsi_globals. We can create struct spdk_scsi_globals
again if it becomes necessary.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I749ae43f7735a7c9383d090eae2093bb52607f17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Add three parameters, pdu_pool_size, immediate_data_pool_size, and
data_out_pool_size to the RPC iscsi_set_options to run iSCSI target
with little memory.
For some use cases, we want to keep the max number of connections,
but simultaneously we want to reduce the pool size and let I/Os wait
until resource is provided.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I74dc785310b1d985f3e338c1e13fba3a3840d113
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8191
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In nvmf_vfio_user_listen(), fd should be closed before
set it to endpoint->fd, otherwise, the fd leakage probem
occurs.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I3fabc65d2764926e5873475962e4362e46eb37e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8309
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In spdk_idxd_get_channel(), if chan->batch_base is allocated
faild, we should free chan before returning NULL.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: Ia652c334aead592429c1171da73d67160879686d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8301
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In ioat_channel_start(), if spdk_vtophys(ioat->comp_update) returns
SPDK_VTOPHYS_ERROR, spdk_free is called to free ioat->comp_update,
and ioat->comp_update is not set to NULL. However, the caller
ioat_attach() will also call ioat_channel_destruct() to free
ioat->comp_update, then double-free problem occurs.
Here, we will not free ioat->comp_update in ioat_channel_start(),
ioat_channel_destruct() will do that.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I3be19a3feec5c2188051ee67820bfd1e61de9b48
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8300
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
POSIX defines %z for printing size_t values in a portable way.
Replace a reference to %ld to remove the assumption about
the type of size_t.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: I2186aa5e7072f565ea75de935e22c2c23acf1a1a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8341
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
In blob_serialize_add_page(), *pages is set to spdk_realloc(*pages).
If spdk_realloc() returns NULL, the *pages pointer will be
overridden, whose memory will leak.
Here, we introduce a new var (tmp_pages) for checking the return
value of spdk_realloc(*pages).
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: Ib2ead3f3b5d5e44688d1f0568816f483aa9e101f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8307
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
In spdk_fs_create_file_async(), file->name is set to strdup(name).
We should check whether file->name is equal to NULL.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I2219cc353eb4711290aee2599505f57af9088bb2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8302
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If iscsi initialization fails (due to a memory allocation
failure for example), we may not even get to the point
where the g_iscsi global is registered as an io_device.
So then when we tear down the iscsi library using
spdk_iscsi_fini(), we need to make sure we don't
try to unregister g_iscsi if it wasn't registered.
For now, just use the g_init_thread global to make this
determination - it's set just after we register the
io_device.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic9443564ef67b9c0df0fce47a346f4608749c306
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8351
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Because we use spdk_dma_malloc, then it does not init
the the contents in the memory.
Fixes#1996
Change-Id: Ieef411f6ae5114de9f732df6096e0bb123efb7e0
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8374
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
In spdk_nvmf_subsystem_add_ns_ext(), ns->ptpl_file is set to strdup(),
which may return NULL. We should deal with it.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: If95102fe9d6d789b8ba9e846c4d7f4e22e48a93c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8305
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nbd will be closed in nbd poller function asychronously.
Unify the stop process of HARDDISC and SOFTDISC in same place.
Prepare for following patch.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: Ida33ff6d081e68290cfa393c0c47fe7af545958b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8036
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
For receving the pdu, we add the crc32c offloading by Accel framework.
Because the size of to caculate the header digest size is too small, so
we do not offload the header digest.
Change-Id: If2c827a3a4e9d19f0b6d5aa8d89b0823925bd860
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7734
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
siginfo_t is a GNU extension. SPDK (and DPDK) have
direct dependencies on GNU extensions, but it's a bit
nicer if external modules don't also need to define
_GNU_SOURCE. Currently siginfo_t parameter in the
spdk_pci_error_handler is the only thing that violates
this.
Note that DPDK also supports registering sigbus handlers,
but they take the failing address as a parameter instead
of the full siginfo_t structure. Let's adopt the same
for SPDK.
While here, remove an extra semicolon that was just after
the virtio sigbus handler function signature that was
updated in this patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07faf11a3ac3589c637cb2196581c102286b1e68
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8333
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
At this time only main lcore frequency is changed,
depending on its load either up or down.
Exception is when at least a single busy thread is present
on non-g_main_lcore. Then the main lcore frequency is set
to the maximum possible.
This patch moves when that is determined, from 'moving'
logic to one that sets reactors to interrupt mode.
If at least one thread is present on non-g_main_lcore,
it has to be busy. Otherwise it would be placed on main lcore.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I2900598afe53fb609e1f06a60d5245f74511e1c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8050
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This field was only used to keep track of number of threads
that will be present on a core after scheduler moves.
It was used only internally within scheduler_dynamic.
Event framework has no need to keep such field in core_info.
Instead added field in cores_stats internal to scheduler_dynamic.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3ce74d4a25eac81e58da8705a1c4553730fc1e57
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8049
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Added core_stats structure that will hold stats modified
during balancing.
Further patches will modify the values in this structure,
to for example judge how much execution time a core
has left.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib8e611e36642c4543b5cb43bc2695c613d38f0fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6657
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch expands spdk_scheduler_core_info with two new
fields that will contain core stats only from last scheduling
period.
This will make sure that schedulers do not have to keep track
and calculate this value on their own.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3aa7dfa6a60c1d14d95a0e684e84c2e83f0a4496
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8048
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
(a5ad0f80) lib/event: update reactor tsc_last going poll mode
Patch above updated the tsc_last at the very end of changing
interrupt mode of the reactor.
The flow for turning from interrupt mode to poll mode is
first to send an event to the target lcore, then to iterate
over all reactors updating notify_cpuset on each.
Previous patch updated the tsc_last after notify_cpuset was
updated, meanwhile the threads could already been put on it.
This patch moves it immidietly to the point of changing
the in_interrupt state.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6aea252016f4706369b8b597b765593bc6edca3b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8111
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Renamed core_busy_tsc and core_idle_tsc to better
describe that they contain particular core stats for
its whole lifetime.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6f16b2b0a162aad8fbaf18f549fc50a2372b920b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8047
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
There is no need to keep new_lcore field.
lcore value is enough to determine the target core.
Meanwhile _threads_reschedule() can see if the target
core matches the one from core_info.
Removed _spdk_lw_thread_set_core() since it did not
serve much purpose.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I82c7cfebf1107b4a55b2af9b891052084a788907
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8046
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
lw_thread->lcore was set during gather_metrics,
rather than just after the thread reschedule.
This patch just moves it to the right place.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0477830902f68102e4e4f0ffc9359bd004a8ad42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7961
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
So far the schedulers had to calculate the diff of
current_stats - last_stats on their own to get tsc
from last scheduling period.
Renamed the current_stats to total_stats, but kept the meaning
as stats describing tsc for lifetime of a thread.
Instead change the meaning of the last_stats to describe
the tsc of only last scheduling period and change its name
to current_stats.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1a165ff7c1afe659b432c3127a351a96878d1f3d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7843
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Using the CMB for SQs is not a standard use case.
Performance can vary widely when using CMB for SQs
and is typically not the configuration used for
benchmarking.
So let's change the default value here to 'false',
users can still opt-in by setting this option to
true in the spdk_nvme_ctrlr_opts structure prior
to attach.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iab746ba777b04152ffb92fea2a2bb923a0a0bf21
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8227
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
QEMU 6.0 by default uses a RedHat dev/vendor ID rather
than the Intel one that has always been used to date.
We need the NVME_QUIRK_MAXIMUM_PCI_ACCESS_WIDTH quirk
so that we do not use wide instructions to copy SQEs
to a virtualized CMB, since QEMU does not support
that.
The NVME_INTEL_QUIRK_NO_LOG_PAGES quirk is only needed
for devices with SPDK_PCI_VID_INTEL, so we do not need
to carry this one over to the new REDHAT entry.
Fixes issue #1986.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3d339b3525e7c6ceb792eb9d143e7a922c19344d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8226
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The NVMe driver layer will clear this log, so we don't
need to send another one in the aer callback.
Here we change the logic to compare with previous NS
state, if the NS state is same it will fail the test.
Change-Id: I6d80cb6a5f6d5eab92b8ccac601a23c19cea4003
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8175
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The arguments of a tracepoint are formatted when they're printed now, so
there's no need to append ":" or pad it with spaces.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I74f5568f1982dacc079e3b80bd19a9cd740b48ce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7955
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The traces record calls to spdk_(get|put)_io_channel() and saves the
reference count of the IO channel and its context. The context, instead
of an IO channel pointer, was selected because the same pointer is often
used in other traces (e.g. nvmf's poll group), so it makes it possible
to match these traces together.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I15fe982a89685d8f6e23d406d6d48f5c2d9d604b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7232
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Both values should provide similar information, while the qpair can also
be matched to the traces from lib/nvmf allowing the user to track the
qpairs across these modules.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iba9abdd3f41b93100c0403b1c90fc4549d39189e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7159
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The traces are tracking the lifecycle of a poll group: creating it,
adding and disconnecting qpairs, and finally destroying the group.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I075b7f24d14b8fbb42bb18ddd70a668a8bace118
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7158
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Upcoming patches will add accel_fw support for batching this cmd
and then vectored versions later along with accel_perf to exercise
them.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I59f577f4365fbf063d7419cc6052e10c998b58bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8143
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Was using reserved field. Similar fix to what was done earlier
for direct submission of crc32c operation.
fixes#1972
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ie9867e72f60c7f38aa1af0273a036f34580ed4c6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8145
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
To match regular sumission prep function and allow caller to
modify both descriptor and completion structures. Also allows
for more accurate error reporting. Needed for upcoming patch to
fix batch CRC submissions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I4b7b5b24e2f54149b513d4b23ba32f3802aff3e5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8144
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Upcoming patches will add support for the idxd engine and
the accel_perf tool. Also following will come vectored support and
batch versions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I7223517a844525ad52ed49d65627b04c3cd9fe7c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8141
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Upcoming patches will add support to the accel fw, the idxd engine and
the accel_perf tool. Also following will come vectored support and
batch versions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ie0fbd4b8da9f727426000898c0b511587adac65b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8139
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
_nbd_fini will make all NBDs into closing state.
remove _nbd_async, beasue it will call asynchronous error.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: Ifb873b7f079b735983bdf20c2df652be0a21919f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8035
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
And for some internal functions we need to pass controller
parameter so that we can do vtophys based on transport type.
Change-Id: I3ca4fa162ec9305f62b295ba21f7474c21edfe52
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8031
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We need to wait to process this quirk until after we
have a valid CAP register value. Before this fix,
controllers with this quirk would get their io_queue_size
always capped at 2 (min io queue size) because CAP hadn't
actually been read yet.
Fixes: f5ba8a5e (nvme: add NVME_CTRLR_STATE_READ_CAP)
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4df87b5dfb0faa21db5b4cf6fc667d80621d1691
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8211
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The inline function can also be used in the coming submit request
function.
Change-Id: If4a5511001e6586dbce0978298beddc537f54d8b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8173
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The PCIE and VFIOUSER both can use this function, the only difference
is VFIOUSER should use IOVA=VA to do the vtophys translation, so
here we will move the function to the common PCIe layer as the first
step.
Change-Id: I699edb67a00a2fa534072fc02ac2dd4a27aba8f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8030
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Each time the following file
"/sys/kernel/config/nvmet/subsystems/nqn_name/namespaces/ns_id/enable"
on the target side was changed, the SPDK initiator should receive an
async event (type: SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE, info:
SPDK_NVME_ASYNC_EVENT_NS_ATTR_CHANGED).
But actually not.
Since for SPDK, when target sent the non-first event, the condition
"nvmet_aen_bit_disabled(ctrl, NVME_AEN_BIT_NS_ATTR)" that prevents
target from sending event was matched.
This commit fix this issue by issuing a get_log_page cmd for each async
event received, just as the kernel initiator does.
Fixes#1825.
Signed-off-by: tyler.sun <tyler.sun@dell.com>
Change-Id: I2973470a81893456ca12e86ac390ea1de0eed62c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7107
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Also add scripts/bpf/nvmf.bt to enable and log these
probes.
This patch also adds a script that can generate
a bpftrace script snippet with string maps for
needed enumerations (currently nvmf_tgt_state and
spdk_nvmf_subsystem_state). This allows us to
dynamically generate this from the source code, and
can be extended for other enums we may want to
add in the future.
Thanks to Michal Berger for converting my original
gen_enums.py script into gen_enums.sh!
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: Iff34a6218aef40055ac14932eea5fc00e1c8bcf5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7194
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
This reverts commit 2246a93718.
We are seeing a lot of failure on io_device lookup in the test
pool. These only showed up after this patch was merged and sees
the most likely culprit. Reverting this patch for now while we
continue debug.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2ab098319dfae3a5356eb4fe0dbf9f4af2d2eea5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8199
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use the macros for red black tree provided by Free BSD to speed up
io_device lookup.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: Ib3bd382bbeb610503194e7d7bfd569f60a0d0121
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7894
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The original function will disconnect queue pairs first and then free
controller memory finally, so rename it to vfio_user_destroy_ctrlr().
Change-Id: Idc235e4186bd4164be712fc9d4cda4991efc6248
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7624
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
The coming destroy_ctrlr() function will disconnect
queue pairs and free controller at last, so here
rename the original destroy_ctrlr() to free_ctrlr().
Change-Id: I527b2742142d60b0383be5a12391c77dd50d47a7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7623
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
The original destroy_qp() only release the queue pair
related memory, and free_qp() will be called inside
destroy_ctrlr() function, so also remove one duplicated
line here.
Change-Id: I2a06a6704b514361685068acda4e65ed5d502f0d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7622
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
The controller concept in NVMf is like a session, for any
new connection in nvmf_vfio_user_accept() with the endpoint,
we treat it as a new controller, we don't need the `ready`
field in controller to indicate the connection state, we
need the connection state in endpoint, so here just use
endpoint->ctrlr point to indicate the socket connection
is valid or not.
Change-Id: I588dbba7973cb61a1d79d81324a43e052f7dafb0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7621
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Adding iov to the spdk_bdev_zcopy_start function enable spdk_bdev_zcopy_start to
be used by transport layers as the iov is owned by the transport command
Signed-off-by: matthewb <matthew.burbridge@hpe.com>
Change-Id: I6d2be7f49566048bf25b7711ada8d2fb49fea6ee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6816
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
ZCOPY emulation is not required. Modules can check if the bdev module
supports ZCOPY. If not supported the module uses the existing
READ and WRITE operations.
Signed-off-by: matthewb <matthew.burbridge@hpe.com>
Change-Id: Idac0a4d27a79a6c7e567c420e15637e826c347c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6815
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
If a reconnect fails, we restore the original
transport_failure_reason after we're done with
the failed reconnect. Save the original reason
in the qpair itself rather than a local variable,
to facilitate upcoming changes where connect will
be asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I20ff43fc687a379aa5c930e17cf3ff8d730320be
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8116
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Previously we were only checking trtype==PCIE to
determine whether a controller was fabrics. This
skipped the vfio-user case. So use the new
spdk_nvme_transport_id_is_fabrics() API instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I81f26853f44b1c47522ce6354e5aa4a905796bd0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8089
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Purpose: Use the new function in order to reduce duplicated code.
Change-Id: Ie848c7586575b3f0bb617d7e767cf459b43d4783
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8174
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Nvme-cli submits a RESCAN IOCTL after a format command to
update any information that may have changed during the
format, such as LBA Format. This patch adds support
for RESCAN by executing nvme_ctrlr_update_namespaces to
update the controller information.
Fixes: #1964
Change-Id: I9f03e00a7f39339947ff02390f69ce806e1cfa0e
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8146
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Rather than to rely on schedulers to access and modify
last_stats values over multiple scheduling periods, move that
operation to event framework.
Providing this to the schedulers in generic manner is better
than enforcing that each scheduler has to keep track of this
data on their own.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Icaf3b4af80d86fafaddf328fd230db9743d21ab5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7971
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
A Security Receive command with the Security Protocol field cleared to
00h shall return information about the security protocols supported by
the controller, so we can check the TCG security protocol is supported
or not before sending it.
Fix issue #1961.
Change-Id: Id061defe45db981b276e2794fd0b59f8db70b7f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8083
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Perf improvement, directs DSA to write to cache as opposed to
mem.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I0d6ba157af8f1b54f8aae3b8e54a6f7754e4a9de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8169
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We already keep a list of outstanding completion locations to
poll and were previously polling all of them. New ones are
added at the tail and we poll the oldest first from the head
so if we break when we find a slot that hasn't completed we
can get more work done while the HW finishes. This is a proven
performance improvement in limited testing.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Icc15041605586f9a31435d447d253c381c00b1f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8161
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
1. struct pxdcap.per is changed to struct pxdcap.rer
Which matches the name in the nvme spec.
2. use new API return value.
3. update specification changes.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Change-Id: Ida421c4cffd1c65d550e83011ab123b321ea9dff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8088
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
vfio-user transport `req_complete` callback will zero the internal
NVMe command and response fields, the common NVMf library should
not use them after the callback, so here we use stack variables
to save them before the `req_complete` callback.
Fix issue #1965.
Change-Id: Iff2342b6095d9496cdf112d657a0a99ce1fb5d12
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8129
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Initialize ANA state of each namespace to optimized regardless of
whether ANA is supported or not. This will simplify the code to get
the optimal I/O path because we do not have to care if the namespace
supports ANA.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I24dfe08674af398671de6528b884e9d82409eeae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7890
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The purpose is to reduce the duplicated code in nvmf and iscsi
layer.
Change-Id: I7e96f0d5bb1ba4b81378addca3cdd929056384e9
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8132
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
We map the SPDK_NVME_TRANSPORT_* values directly to
the NVMe-oF trtype values. Since PCIe isn't
Fabrics, we choose 256 which is outside of the
8-bit trtype range of values.
So we can just check if trtype >= 256 to determine
if the trid is for fabrics or not. This is
preferable to checking PCIE || VFIOUSER in case
additional non-fabrics transport types are added
in the future.
I considered taking a trid as the parameter instead,
but went this route since it is consistent with
the existing spdk_nvme_ctrlr_is_discovery().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib62ff4d30549b2324486c81f2dce67f0f1741e9b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8077
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Connect the adminq as part of controller initialization
instead of controller construction.
We never actually 'connected' the adminq for
PCIe or vfio-user transports, since its a nop.
But their connect_qpair transport ops function
is also a nop for the adminq, so it's fine to
generically connect the adminq across all transports.
Note that we cannot read registers (cc or csts)
during controller initialization now until after
the adminq has been connected since reading fabrics
registers depends on a connected adminq. This gets
special cased for now, but eventually reading
cc and csts will need to be part of the state machine
itself to make it asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia5566d7c549d78d24b94ea253df51e697da6237f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8079
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
According to NVMe-oF 1.1 spec, it is not a fatal error.
So according to Figure 126 in NVMe Base specification,
we should return "Transient Transport Error".
Change-Id: I601304ae2bb24508882fb1ec8c7e53ec587ab515
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7795
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This ensures the discovery ctrlr initialization is
done the same as normal ctrlrs. This will be
critical as we make the driver fully asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I33c4fd7c82d241c30e7adb89abe79b8088c8776a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8090
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Read CAP (Capabilities) register as part of controller
initialization instead of controller construction.
For now, still read CAP in the pcie and vfio-user
controller construction, since they need the
drstd (doorbell stride) to construct the admin
queue.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I000fe880f2ec0d6de1d565c883d7ea0ae1ac2c81
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8078
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Read VS (Version) register as part of controller
initialization instead of controller construction.
This prepares for upcoming changes to make
controller attach fully asynchronous. Since reading
fabrics registers is an asynchronous operation, it
will be easier to read the VS register as part of
controller initialization which operates as an
asynchronous state machine.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I771386dbdf5902633e0d9f91b3b20be98f26fdc3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8076
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
We're going to be adding some new states (READ_CAP
and READ_VS) in future patches, that we want to
come before the current "INIT" state.
So we will simply make "INIT" have the same
value as this new NVME_CTRLR_STATE_CHECK_EN state
for now. That means existing code won't have to
change later once we add new states that come
before CHECK_EN.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07ca92e28ab1cd8d838cdef5c3ff36ba80a224bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8075
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
_reactor_schedule_thread() zeroes out the lw_thread on move.
To properly calculate thread stats since the move,
save them right after reschedule.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I44cc3b5907adda35b3117c2dd7268dc813d59853
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7919
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_thread keeps track of tsc from its whole lifetime,
those can be requested with spdk_thread_get_stats() at any time.
spdk_lw_thread uses stats from above and keeps track of two points in time:
- current_stats reflecting stats at the time of gather_metrics stage
- last_stats reflecting stats from previous gather_metrics stage
1)
Before this patch current_stats were duplicated in snapshot_stats.
There is no need for that so now they are removed.
2)
Removed _spdk_lw_thread_get_current_stats() since it would be copying
current_stats to current_stats, thus not perform any action.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5e5d4039cd0f7cc10ba150a3d915b90ec96589d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7842
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Whenever an event executes, it might change the currently
set thread or reset it to NULL.
To prevent it from affecting other events, set the current
thread each time an event executes.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6f1e7f8b7acab25353b4782058e87a9e01aab2c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8045
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
_get_thread_load() is function used to determine
the load of a thread based on relation of busy/idle tsc
from previous scheduling period.
In order to avoid division by 0 calculating the percentage,
we can simply exit early determining that thread was not
doing any work.
Having this check here will make sure that no matter
the changes in event framework, scheduler dynamic will work.
Removed the place that updated last_stats if they weren't
yet updated at least once (first scheduling period iteration).
In this case after change to _get_thread_load() will be the same,
as only the latest iteration will be used to calculate thread load.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I75f0f12f024675f2473a26e30596d6eb28093d46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7917
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
In struct spdk_nvmf_poll_group_stat, there are statistics of cumulative IO and
admin queue pair counts. But current qpair counts are not reflected. Use
this patch to add current admin and io qpair counts for a poll group.
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: I7d40aed8b3fb09f9d34e5b5232380d162b97882b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7969
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Eugene Kochetov <evgeniik@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Block device product name is same among same type
of the block devices, while Guest VM may use this
value to generate UUID, so here we change it to
block device name instead.
Change-Id: I58c5fb271a6a436c15520616c2065eee9c37300a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7996
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Now, vfu_setup_region() must specify the region fd offset (which is always zero
in our case).
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I10795d848a4c73ee9e1e78ea63776074401c4b17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8022
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Use the macros for red black tree provided by Free BSD to speed up bdev
name lookup in spdk_bdev_get_by_name().
In the bdev_multi_allocation test, we can get 3x ~ 5x speed up when
creating multiple bdevs for various bdev nums.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I49a2fbcccf06d4c36cbd445ce59e0b0dd4ada31d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7837
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Clean up spdk_bdev_register() to facilitate the upcoming patch which
uses RB tree to speed up bdev name lookup.
* move TAILQ_INSERT from bdev_start() into bdev_init();
* rename bdev_init() to bdev_register() and rename bdev_start_finished()
to bdev_register_finished();
* inline bdev_start() into spdk_bdev_register().
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: Idbfc800472bc8c6f9b615046e082772e9f6026e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8043
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
nvmf_get_ana_log_page used req->data to store the log page result.
While the req->data only contains the first iov, if req->iovcnt is
larger than 1, the req->data may not hold the complete log page; and
even worse, the log page result may be written to invalid address and
cause memory corruption.
The following patch will fix the same issue for other commands in
nvmf_ctrlr_get_log_page.
Fix#1946
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I495f3be05c82be5cd53609772c655c8924b9179f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7923
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Since we will reuse send_pdu for other purpose in the next
patch.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iee5166131b70a25bc13aaa847bfc9066231f31a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8028
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
For nvme/tcp connection, we use the synced manner
if the qpair is not fully connected. Thus without
the check, we will stuck here. And this patch
fixes this issue.
Change-Id: I72815bf5b4c0b31c4866bc1b9034b0e42b81d3f1
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8025
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Loading subsystems and restoring state from a JSON config file is useful
outside of the SPDK application framework, so move it to lib/init.
Change-Id: I7dd3ceace2e7b1b28eef83c91ce6a4eedc85740e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6645
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
I'm not sure whether this should go into lib/init or to lib/rpc
directly, but I've chosen lib/init for now.
This is to support applications that want to run the SPDK JSON
RPC server, but aren't using the SPDK application framework.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I79ca39aa0ca6e1a3a6905b0bf73e6cc99b086e55
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6644
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The functions to initialize the SPDK subsystems or tear them down
was previously an internal-only API. Make it public for use by
applications that aren't leverage SPDK's application framework.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I2ebfd020e6fa4c1947fa1c1a2ac509ce9b0242f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6643
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
There were some functions in the internal header that can be entirely
private to the init library. Move them over.
Also, remove the support for including the header from a C++ file
because these headers are internal to SPDK which is pure C.
Change-Id: Ic4323b2b8664e70106a57b3ca8acbc7c2efe621d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6642
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The header size is very small, which does not have too much value to
offload such calculation by hardware.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iaa82f39312df7eef3282325a33677ea41ab735ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8011
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
open_blobids holds bit array of currently open blobs,
this is a way for quicker determination than iterating
over all blobs. See patch introducing it:
(30ee8137)blob: Add a bitmask for quickly checking which blobs are open
That patch added resizes of this bit array to bs init
and bs recovery path (not shut down cleanly).
But that patch skipped over bs load from a clean shutdown.
This resulted in blob open having multiple blob pointers that
target the same blob id.
Fixes#1937
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3c42a63d168d1f5b013b449f010c5b207936045b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7998
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Community-CI: Mellanox Build Bot
Before this patch idle_tsc was sum of all idle tsc of all
threads running on a reactor.
There are cases when no threads are present on the reactor,
and _reactor_run() spins doing nothing.
To give more accurate representation of the reactors state,
the idle_tsc now adds time spent doing idle spinning.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If797b2a03507d17b07367d56d5f6c40cefbbbd49
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7900
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Disabling interrupt mode on reactor now updates the tsc_last
to current time. So any further tsc caulations will not account
for the time when reactor was in interrupt mode.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I56fb8a738eea60ee5de3b49d586f7cb228b54510
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7901
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
tsc_last value is used to update thread stats
during _reactor_run(). See:
spdk_thread_poll(thread, 0, reactor->tsc_last);
If no threads were present on the reactor,
this value got outdated and resulted in
adding time reactor spent with no threads to
stats of the first thread placed on that reactor.
This patch fixes thread stats by making sure
that argument to spdk_thread_poll() is up to date.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0c35fdba1b63b6ee19a5a2b34751090839cb2438
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7845
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
This is useful for applications even if they elect not to use the SPDK
event framework.
This doesn't shift everything in one go - just the subsystem
initialization logic. Configuration file loading also needs to move
in a separate patch later.
Change-Id: Id419df1045442d416650ed90e5ee78adfdd623d7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6641
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Get all of the hot stuff to the first cache line.
* Shrink the xfer enum to one byte (it only has 3 values).
* Pull out the dif enabled flag form the dif structure so it
can be access separately
* Rearrange the members
Change-Id: Id4a2fe90a49c055a4672642faac0028671ebfae9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7827
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
(cd83ea4a)thread: Add SPDK internal APIs spdk_thread_get_first/next_active/timed/paused_poller()
Patch above by mistake iterates over active_pollers list
for function that lists paused pollers.
Fixes#1947
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1b69d942675f34f5f046ec46feacc8d81d89f015
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7952
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
When blob persist starts, there can already be multiple
of such requests pending. It is possible to complete
a set of persists at once, if blob state after their
execution would be the same. This is the case when
persists are already pending when a particular persist
request is started.
This patch implements such mechanism by introducing
persists_to_complete queue, containing entries that
were previously queued up before starting the current
persist request. If there are any entries in this queue,
further requests are put into pending_persists.
When first request from persists_to_complete is persisted,
completions are issued for all requests on that queue at once.
If at that point there are any new entries on pending_persists,
all of them are put into persists_to_complete. Persist process is started
again with the first request from that queue.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I10063e55d6f821b1863de016d3148da6a719a422
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7643
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Was using reserved field in CRC to store the final address of where
to put the result, this not legal. Move to the completion record
and slightly re-arrange the struct to keep it at 96 bytes.
Refacorted the IO prep function so the caller can udpate both the
descriptor and completion records instead of continuing to add
parameters to the prep function for opcodes that need something
unique in the completion record. This also allowed for a minor
fix where the prep function was returning NULL when vtophys failed
which would have indicated busy as opposed to failire. Now we can
proprely fail that path.
fixes#1929
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic23bc7b68bdd5757c30b7963880677f423368e20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7735
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Update libvfio-user to the current version, updating the client for the relevant
changes.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ic64ace08ac0c7e9676f04f8d1f47a9c0388a2652
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7983
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Support ACRE (Advanced Command Retry Enable) feature. Currently set
all crdt to 0 and only select crdt[0] (crd=1) when the IO has any
error.
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: If7bc30f91f5b2d0839002dead17188a4b3a52d5d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7885
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
That prevents nvmf target from starting to destroy poll
groups prematurely
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I833f6198ef0e3083fdadf70dd3b62844c905aceb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7881
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch changes the order of IDENTIFY_ACTIVE_NS and CONSTRUCT_NS
controller states. It is required to further improve memory management
for namespaces by allocating memory only for active ones.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ie540442b1bd9e897afcbaa4319c139109dd0c515
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6503
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previous implementation allocated memory just once at the beginning of
active NS list retrieval procedure. It allocated memory for maximum
possible number of active namespaces, i.e. 'cdata.nn'.
This patch changes allocation logic. One page is allocated at the
beginning. If more is needed, reallocation is done with one more
page.
This patch also removes SPDK_MALLOC_DMA flag from allocation since we
don't do RDMA directly into this buffer.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Iaa80c4d70c54daaf71dcbf755c63a01a1d83b772
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6502
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Use the macros for red black tree provided by Free BSD to manage
timed pollers efficiently.
Allow RB_INSERT() to insert elements with duplicated keys by changing
the compare function to return 1 if two keys are equal.
Check the return code of RB_INSERT() because this is the first use case
for RB tree macros in SPDK. We did the same for RB_REMOVE() by
adding another temporary variable but we remove it from this patch
because it is not so important compared with RB_INSERT().
When a timed poller is inserted, update the cache for the closest (leftmost)
timed poller only if the tree was empty before or the closest (leftmost)
timed poller was actually changed. We do not have to use RB_MIN()
because all duplicated entries are inserted on the right side.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibe253ca8eecc10116548b5eedbcdba8fb961b88d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7722
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We already hold thehe next closest timed poller in tmp. Inlining
poller_remove_timer() into thread_poll() makes the cache update
more efficient.
After this patch, poller_remove_timer() is called only in a single case
and the case is compiled only on Linux. So add it inside of a temporary
block is much clearner. However it will be used by spdk_poller_reschedule()
in the end of this patch series. So keep
the current position.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2e6858223713eed84f5d70b160da6122edae6d03
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7910
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This enable us to optimize the cache update when RB tree is supported.
Call poller_remove_timer() after getting the next element because
as TAILQ_FOREACH_SAFE() and RB_FOREACH_SAFE() do, TAILQ_NEXT() may
not be valid after the current element is removed.
Previously, the patch had called poller_remove_timer() before getting
the next element. However, thanks to the nice testing, this bug was
found.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I18afb4412115dc1696cc568610cbe3dc618c2357
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7909
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This change is a preparation to first dequeue the closest timed poller
always when it is expired. Previously the poller_remove_timer() calls
were not consistent and difficult to follow.
spdk_poller_pause() sets poller to PAUSING even when it in RUNNING
and move it to PAUSED after returning from its context.
If spdk_poller_pause() and spdk_poller_resume() are called while poller
runs, it is moved to WAITING. Hence thread_execute_poller() and
thread_execute_timed_poller() ignore such cases.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I29340613a2ec0c3529d0886f4d81c0a0fdf8745d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This API was removed previously, so remove remaining
references in map file and unit tests.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iba2f6a5f5ba590d3996dc133c8181083a33d7405
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7963
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Currently we allocate buffers perf each SGL descriptor.
That can lead to a problem when we use NVME bdev with
PRP controller and length of the 1st SGL descriptor is
not multiple of block size, i.e. the initiator may send
PRP1 (which is SGL[0]) which end address is page aligned
while start address is not aligned. This is allowed by
the spec. But when we read such a data to a local buffer,
start of the buffer is page aligned when its end is not.
That violates PRP requirements and we can't handle such
request. However if we use contig buffer to write both
PRP1 and PRP2 (SGL[0] and SGL[1]) then we won't meet
this problem.
Some existing unit tests were updated, 1 new was added.
Fixes github issue #1853
Change-Id: Ib2d56112b7b25e235d17bbc6df8dce4dc556e12d
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7259
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
zipf is a power law probability distribution. When
applied to performance testing of block devices, it
will select blocks over the full range of LBAs, but
will more frequently select lower-numbered LBAs.
The theta parameter governs the distribution - higher
values of theta will concentrate the distribution on
a smaller number of LBAs.
Note that fio supports zipf, so adding it to SPDK
will enable our perf tools (bdevperf, nvme-perf) to
provide similar functionality.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7df129c9d61996a2070188c6cd9f1fde631ac208
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7779
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We cannot solely rely on the qpair_ctx->count reaching
0, because qpairs that are in process of being
disconnected will immediately invoke the qpair
disconnect cb.
Instead, we need to wait until the poll group
no longer has any qpairs remaining on the subsystem.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I977747d367d14a4bf60f66a1147b3d75679e5179
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7870
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
When we introduce RB tree, getting the closest timed poller is not
O(1) but O(log N). To mitigate such delay, cache the closest timed
poller into thread, and update the cache when its content is changed.
Add unit test cases for this change. They will also clarify the current
behavior of spdk_poller_unregister() and spdk_poller_pause() for
timed pollers.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibb98a54c261859a3210034038d3953e5c93ef8aa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7720
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Requests that still reside on retry queue should be
submitted to disk before shutdown.
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Change-Id: Id2d020fcaef6443d01cfd8628686e9b0f34a1cfa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6771
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch reduces admount of changes in the next patch,
no functional changes added.
The next patch will add usage of contig IO buffers for
multi SGL payload. To support it we need to pass an
offset to fill_wr_sgl function. Also in the current
version we assume that for 1 iteration we fill 1 IO
buffer, the next patch will change it and we'll need
to swtich to the next IO buffer in special case. That
can't be done easily if we use fill_wr_sge function
Change-Id: Iee8209634637697f700f8fa9fe61ead156b6d622
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7258
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The header is small enough that it likely won't ever make sense
to offload the digest computation.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib6baa201a76d769d978f498f5c65985d5ab06ffd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7766
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We should use `ns->registrants` to count the number of registered
controllers(REGCTL) for Reservation Report command, as `subsystem->ctrlrs`
only list current active sessions(controllers).
Also use the output data buffer directly so that we don't need to calloc/free
during the process of Reservation Report command.
Fix#1928
Change-Id: I650224b751a08416208b8a504b82debff31e92fd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7822
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: wanghailiang <hailiangx.e.wang@intel.com>
This is the same effort as spdk_poller.
The following patches will move the definition of struct spdk_thread and
enum spdk_thread_state from include/spdk_internal/thread.h to
lib/thread/thread.c.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7f7bdfdd7a7b1b834d16d79638a4fd2d63e9daf6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7800
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Most accesses to the struct spdk_poller outside lib/thread have been
done via functions but a few direct accesses remain.
Change these to indirect accesses by addinng a few helper functions
as SPDK internal APIs.
Add spdk_poller_get_name() to get the name of the poller.
Remove spdk_poller_state_str() and add spdk_poller_get_state_str().
Exposing enum spdk_poller_state outside lib/thread is not really
necessary.
This removal requires us to update major SO version.
Add spdk_poller_get_period_ticks() to get the period ticks of the
poller.
Add struct spdk_poller_stats and spdk_poller_get_stats() to get
the stats of the poller.
The next patch will move the definition of struct spdk_poller and
enum spdk_poller_state from include/spdk_internal/thread.h to
lib/thread/thread.c.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id597dae074a15fcd8af09fd9d416a22ce2f403c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7798
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It is better if the internal of poller or thread is not accessed
outside lib/thread.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I80d488a111fd9a67a0da32d1e63695ce5a6bcb4c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7776
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The following patches will update the cache to the closest timed
poller when removing it from the list. To do it easier, factor out
the operation to remove a timed poller from timed_pollers list into
a helper function.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I25016d86117b240a2651d1f06e23bea0342211f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7719
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When we introduce red black tree for timed pollers, we will not use
RB_FOREACH_SAFE() but cache the leftmost (smallest) node and iterate
from it via RB_NEXT() instead.
As another preparation, separate TAILQ_FOREACH_SAFE() into TAILQ_FIRST()
and TAILQ_NEXT().
The next patch will cache the first element to thread and refer it
first.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie03c387b5b3a055c668e7b439a5eb05ed77eaa81
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7718
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It will be sufficiently reasonable to check if the caller thread is valid
even if spdk_poller_pause() or spdk_poller_resume() does nothing.
Besides, let's write all possibles states explicitly in switch - cases.
This refactoring clarifies the logic and makes the following patches easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1163eff388fe741d6b6924f474a82b1aa7d18acb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7665
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Change if - else if - else blocks of thread_execute_poller() and
thread_execute_timed_poller() to switch - cases blocks and specify
possible states explicitly in these switch - cases blocks.
The code will be simpler and clarified, and then the following patches
will be easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2e283894f5d69e1bd67466ae070c5b8bb9014616
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7664
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There will be no issue even if time poller is unregistered or paused
after it is expired. The iteration is stopped anyway after the head poller
is found not to be expired.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2b394b8b517930a6630dd31f59fcaea12eb80572
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7662
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The following patches will introduce red black tree to manage
timed pollers efficiently but it will be based on macros available only
in lib/thread/thread.c. Hence then it will be difficult to expose the
internal of timed pollers tree outside the file. On the other hand,
we do not want to include JSON into the file.
Hence add a few SPDK internal APIs to iterate pollers list transparently.
For spdk_thread_get_next_active/timed/pause_poller(), we omit the parameter
thread and get it internally from poller->thread even if the names include
the term "thread". This will be slightly cleaner.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I000801a2e4dc42fa79801a2fd6f2b06e1b769c88
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7717
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add {min,max}_cntlid to spdk_nvmf_subsystem, defaulting to 1 and
0xFFEF, respectively, and add nvmf_subsystem_set_cntlid_range() to
allow the controller range to be configured in the range [min_cntlid,
max_cntlid].
Also add {min,max}_cntlid to the nvmf_create_subsystem RPC to allow
the controller ID range to be specified when creating an nvmf
subsystem.
Signed-off-by: Jonathan Teh <jonathan.teh@mayadata.io>
Change-Id: I936db3bb0c9a38569063a6fd3c11df262dfad776
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7322
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This update will stop using `struct vfio_device_info` from
<linux/vfio.h>.
Fix issue #1922.
Change-Id: Ia7ad745db8d7ed8f5248ca13e3188ebd540b0e40
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7831
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This helps in next patch in series where multiple
completions will be executing.
UT is adjusted since one additional poll is required.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id72377ddef91e40cdbc2bdea6f33c23309b0ca3d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
During snapshot creation the original blob becomes
a thin provisioned blob that will only the diff of data after
snapshot creation.
Despite the comment in the UT the number of polls before issuing
blob write was hitting blob BEFORE it swapped with new one.
Issuing I/O during this period shall check for io freeze
before checking cluster allocation.
Otherwise bs_io_unit_is_allocated() hits assert for thin
provisioned blob. This is because cluster map of blob is
empty, but properties have not been updated yet.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I742e1a50b14d456ae1e6de13b5111caec3e8322c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7641
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Also for the purpose to avoid a burst of children unmap request,
we will submit at most up to 8 children request at a time for
a big UNMAP command.
Change-Id: Iaf0f18b07517e0a8f84dc04e8c93b95691a1a43c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7518
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Both UNMAP and R/W will share the some logic to process the submission,
so combine them to a helper function first.
Change-Id: Ia4f234c6a58f078d3e9f88cacaf1510a17f07acc
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7606
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Addressing review comment from 6cebe9d0.
Minor optimisation to cache result of spdk_u32log2() into local variable
instead of calling it multiple times.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I2fd6afd1e3ee461662de3f9d278958664224e106
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7806
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
ctrlr_discovery.c doesn't need this #include.
Including it causes bdev_module.h types to be
emitted to the debug symbols at least with some
compilers, which can result in unwanted abidiff
errors.
The unit tests do need it, so just include it
there instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad28f9778ce08b11b52325658583ae9032295f3a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7813
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michal Berger <michalx.berger@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The library itself doesn't need it. The unit tests
do need it, so just include it there.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9aefd303ae12928d45141029436509f185105bd3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Michal Berger <michalx.berger@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Normally, unless RTE_MALLOC_DEBUG is set, DPDK zeroes memory in rte_free().
If RTE_MALLOC_DEBUG rte_free() fills memory with poison pattern, but then
(and only then) the memory is zeroed in rte_zmalloc_socket(). Relying on
this behavior allows to avoid unnecessary memset() in spdk_zmalloc() path.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Change-Id: If3efa4dd22f1568949c3fb529b604bd597ceb32f
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6975
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We did not find this issue, because we always use seed = 0 to test.
Definitely, we need to assign the seed value.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I6bec1b6d61480cdd7c9e27578dcaf5de2f65cf44
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7716
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This patch fixes missing free of qpair_mask when a listener error
occurs in ctrlr_create.
Signed-off-by: Jonas Pfefferle <pepperjo@japf.ch>
Change-Id: I09162b86d8ac73bf9fc2006a08dcc0a955f222b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7818
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Not just one with extra available info. Also remove the extra
read of the error register, not required.
fixes: #1927
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I28badb45d8cc8d16b72f7019bd2a2044998fc402
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7729
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Nvme-cli uses NVME_IOCTL_IO_CMDs for "io-passthru"
commands to cuse devices. This patch adds support
for that IOCTL.
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: I20e0ac91ba08fce91bc5da1f4a1e454058cdd1e7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7741
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The nvme cuse IOCTLs are actually creating passthru commands
that can be either IO passthru commands or admin commands.
Renaming the routines to correctly reflect that should limit
the confusion when reading the code. Passthru commands that
are admin commands will go to the spdk_nvme_ctrlr_cmd_admin_raw
interface and passthru commands that are IO will be sent to the
spdk_nvme_ctrlr_cmd_io_raw interface.
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: I8d427fe8b5f503fdb2d193236c77d410d5b13886
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7740
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Nvme-cli uses BLKSSZGET so support needs to be added for
nvme cuse devices.
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: Ic8316713b2d017c8ff32a225efff6bcb95842799
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7708
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It is useful to have debug log information in the nvme_cuse
path when debugging IOCTls and flows.
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: Ifef1bb82c96438e2fcbb9ad2fafe3f3eb66bed51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7707
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The purpose is to prepared for implement the async crc32 caculation
in the future patch.
Change-Id: Ia75f28154c49f08b527d48c63b9da79a6bdfede8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The purpose is to prepared for implement the async crc32 caculation
in the future patch.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I49f84ea1966f0acdd6f5aeb7192896f91fd16dee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7793
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Purpose: We will also support the kernel idxd driver, so we do not
need export this feature in the module file.
Change-Id: I965e031497920f527962ba187bccd81de6977b8f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7336
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6ad71a65244526e99a36920c630096cc9739d94d
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7809
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is prepared for using the hardware offloading
engine in accel framework. And some fields in nvme_tcp_pdu
needs to be DMA addressable.
Change-Id: I75325e2cd7ff25fe938bea0ac9489a5027e3e0e9
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7770
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This is used to prepare using the accel framework to calculate
the crc32 because some fields in this structure needs to be allocated
in DMA addressable memory.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ib8def5596e60f4702709da647145c4e2b6d6848f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7767
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Note: a lot of the TCP and FC trace registers were
specifying '1' which means the arg type is a pointer,
but in reality it is always passing 0 for arg1. So
this patch just changes them to SPDK_TRACE_ARG_TYPE_INT.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I18d3cedd21e516f16cb2cd0a7f8c16670b1895d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7763
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
We already pass the PDU as arg1, so by changing the
trace register descriptions, we can map the PDUs to
more readable IDs when running the spdk_trace app.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad7106eeb0f5fe738f81da5ee174515d1cf4b6ce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7757
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This was accidentally moved to the wrong place as
part of some earlier iSCSI refactoring. This trace
record should be executed when we have finished
reading all of the data for any PDU, not just those
with immediate data.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib1d17e5e79ff220e9e9b3dd55e247e745bd58019
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7756
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
It can be useful for testing or development purposes to force clients to write
to doorbells using vfio-user messages instead of directly into shared memory;
add a transport-specific option to disable the shared mapping.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I7ed062fbe211ba27c85d00b12d81a0f84a8322ed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7554
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This will help us to add unmap split function, also
remove bdev_io_type_can_split() because we changed
to use swith(io_type) ... case now.
Change-Id: I449d6a9f5bf2d0b43dd124bbfc9e1ca2afddc15a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7516
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Only the last bdev_io can be freed without this fix.
Change-Id: I0d05b5d89e38ef60872ebc0f23aaed0c622593c4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7571
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Purpose: This patch is used to prepare to add the kernel
idxd support later.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: If89665f95d622c7342ab75050664158ec6fc615a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7330
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
(Note: this patch was previously applied as b32cfc46 and then reverted
as 63642bef.)
Today the in-guest nvme device shows physical_block_size=512 even though
the backend iSCSI bdev supports physical_block_size=4K
iSCSI targets exposes physical block size using
logical_block_per_physical_block_exponent in READ_CAPACITY_16
NPWG is one of the way to let Linux nvme driver set
physical_block_size of the nvme block device.
This patch adds spdk_bdev.phys_blocklen which is updated if the iSCSI
backend exposes physical_block_size.
Later phys_blocklen is used in nvmf to set NPWG and NAWUPF to report
back during NS identity.
Linux driver uses min(nawupf, npwg) to set physical_block_size.
Similarly in scsi_bdev fill lbppbe in READ_CAP16 response
based on spdk_bdev.phys_blocklen.
Fixes#1884
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0b6c81f1937e346d448f49c927eda8c79d2d75c0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7739
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
1) use spdk_bdev_get_name() accessor
2) use __SPDK_BDEV_MODULE_ONLY #define
The latter allows nvmf to just get the spdk_bdev_module
definitions and APIs that it needs for claiming bdevs
for purposes of avoiding the same namespace used in
different subsystems.
This also ensures that future changes to structures
like spdk_bdev and spdk_bdev_io will not cause
lib/nvmf so version changes.
Note: we include bdev_module.h explicitly in the
nvmf/subsystem unit tests now, before including
subsystem.c, because the unit tests do depend on
knowing the internal structure of spdk_bdev.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2f499a741d19f4749eadb402641f28137245fd23
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7738
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
libvfio-user DMA APIs report all regions notified by the client, including those
that don't have a corresponding shared mapping. There are several of these for
a typical VM, so just ignore this case.
Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Change-Id: I37b06f4bc6d1818a03c8742616ed142f575d3f0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7532
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When payload_size is 0, we may get wrong cdw10 because of the calculate: 0 - 1,
add value check to fix value inversion bug.
Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: I3bcd38ba981c854ff917282341d32aac47d22b76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7443
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This allows us to remove separate fcntl() calls to
set O_NONBLOCK.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1a590cfb3b65b3174bb5ef33e060cdc9bb7ac86c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7598
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
This simplifies the code a bit, removing the need
for the separate fcntl() calls.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4fef8f01a055d1471df87bd979c21d6198e9868a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7596
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It is already set by nvmf_tcp_req_pdu_init
when we get the pdu. So we do not set it again.
Change-Id: I034bbc46e600afd802457c0b152e303f16bafba3
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7714
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This reverts commit b32cfc467b.
This commit fails the ABI checks and only got through because the checks
were disabled until 21.04 hit.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Id26b8f8ba551193d99b1ccbd31b35378b4095a20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7731
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Today the in-guest nvme device shows physical_block_size=512 even though
the backend iSCSI bdev supports physical_block_size=4K
iSCSI targets exposes physical block size using
logical_block_per_physical_block_exponent in READ_CAPACITY_16
NPWG is one of the way to let Linux nvme driver set
physical_block_size of the nvme block device.
This patch adds spdk_bdev.phys_blocklen which is updated if the iSCSI
backend exposes physical_block_size.
Later phys_blocklen is used in nvmf to set NPWG and NAWUPF to report
back during NS identity.
Linux driver uses min(nawupf, npwg) to set physical_block_size.
Similarly in scsi_bdev fill lbppbe in READ_CAP16 response
based on spdk_bdev.phys_blocklen.
Fixes#1884
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0b6c81f1937e346d448f49c927eda8c79d2d75cf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7310
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We have seen that dptr was not alligned to 4k using cuse. Added allignment of data in cuse ctx to 4k same as it is done in nvme_allocate_request_user_copy
Signed-off-by: jwyka <jakub.wyka@intel.com>
Change-Id: Ic5c2482eae20d64ba467016eb61f5255467f70a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7453
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
When performing snapshot creation the I/O is frozen
during the process. The blob persists for extent page
allocation is delayed until snapshot creation is finished.
This results in multiple blob persists executing one after
the other, with only intent of writing out updated extent table
pointing to new extent pages.
Since blob->state is marked DIRTY before issuing each persist,
but a single persist completion marks state CLEAR.
Blob serialize correctly expects each persist to contain
dirtied metadata, in order to avoid unnecessary md writes.
Since all other instances of marking blob DIRTY is explicit,
assert in blob serialize is left as is.
Instead when running the queued up blob persists, the blob
state is marked DIRTY.
Side effect is that it will write out same md in some cases.
Fixes#1909
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I39f37299f3f0ebfccbdd4063781b5ecce286e993
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7640
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
spdk_vtophys() takes a mapping_length parameter, so
it can return the length for which the returned
virtual address is valid.
But spdk_vtophys() will only return the max
between the valid length and the input mapping_length
parameter.
So the nvme SGL building code for contiguous buffers
was broken, since it would only set the mapping_length
once, before the loop started. Worst case, if a buffer
started just before (maybe 256 bytes) before a huge page
boundary, each time through the loop we would create
a new SGL for only 256 bytes at a time, very quickly
running out of SGL entries for a large buffer.
Fixes#1852.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib1000d8b130e8e4bfeacccd6e60f8109428dfc1e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7659
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The IDENTIFY_CNS quirk was applied as part of QEMU
OCSSD handling in commit 6442451b. But it was applied
not only to the OCSSD dev ID, but also the dev ID
for non-OCSSD NVMe controllers.
Starting with QEMU 5.2, QEMU will allocate a default
256 namespaces, but only some are active (associated
with the backing disks specified by the user). QEMU
supports IDENTIFY_CNS, but since this quirk was set,
we wouldn't send a real IDENTIFY_CNS and instead
would just populate a fake list where all namespaces
were considered active. This causes breakage in
a few places - mainly where we iterate through
the active namespaces, and then are surprised that
calling spdk_nvme_ns_is_active() returns false.
It was also breaking bdev_nvme_attach_controller RPC,
since by default we can only support returning 128
names, but since all of the namespaces were deemed
active, it was trying to return 256.
Fixes#1916.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4fdd27e0e36f0ac07a95f9f29aa83357e8505a45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7658
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When zcero copy send is enabled and used by initiator,
it could significantly increase latency in some payloads.
To enable more fine graing configuration of zero copy
send feature, add new parameters enable_zerocopy_send_server
and enable_zerocopy_send_client to spdk_sock_impl_opts to
enable/disable zcopy for specific type of sockets.
Exisiting enable_zerocopy_send parameter affects all types
of sockets.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I111c75608f8826980a56e210c076ab8ff16ddbdc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7457
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There are three modules implementing the bdev-zone API:
bdev_nvme, bdev_ocssd, and vbdev_zone_block.
For all three modules, the number of zones can be calculated using:
block_count / zone_size.
To avoid this calculation being performed everywhere, create a helper
function in bdev_zone.h, together with the other zone APIs, such that
a user can easily get the number of zones.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I2967b15a604ab8bf4420588e7510b9820762f925
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7451
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
tcp transport doesn't send a response capsule when
c2h_success is set even if cdw0 or cdw1 are non-0.
Signed-off-by: Ed rodriguez <edwinr@netapp.com>
Signed-off-by: John Meneghini johnm@netapp.com
Change-Id: Ieba81fcc50342a2009f7931526e6f8392e26b6a5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6808
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When do spdk_reactor_set_interrupt_mode, if reactor
already runs in the specific mode, directly call
callback function before return 0;
Change-Id: I1fd8b753e9881755aa128aabe6d1e2749e58b39b
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7549
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The net library isn't needed - everything these RPCs
do can be done externally to the SPDK application.
This library will be removed in the 21.07 release.
As part of the deprecation, mark the net RPCs as
private. This will prevent an upcoming patch from
complaining that these RPCs are not documented.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I61118820fd29e410dca763595c3d9fd01a57373d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7592
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
For pollers that don't natively support interrupts, using
a busy wait mechanism temporarily.
An interrupt falicity for busy wait will
be registered for non-periodic poller.
Internally, an eventfd is created to each busy wait
poller. Write the eventfd when set interrupt mode,
and only read the eventfd when set back to poll mode,
then the busy wait poller will be called repeatly
in interrupt mode.
Change-Id: Iaeae14d1ff69fd9ef7d606a0b0a70193764513e9
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6711
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
In next patch, if poller doesn't have a period, eventfd
will be created which's always busy automatically.
This eventfd can be combined with timerfd. So rename
timerfd to interruptfd.
Change-Id: Ibffa30ecfcaa73e55f47e97fac854641b74f2dfb
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7546
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Defined callback for spdk_poller to adapt itself to
set interrupt or poll mode. The callback can
be registered to spdk_poller by new function
`spdk_poller_register_interrupt`
Interrupt callback operations for period poller are implemented,
so period pollers now are interruptable.
Change-Id: I2aa6ebfdd75f76b85a70af7e42530be4131ddc8a
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5752
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
These refined functions are prepared to adapt period
poller to following poller switchable API.
Change-Id: I34d2a785fa0e757b97b0dac5ccf24819d75e0184
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7156
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The interrupt mode of spdk_thread can be operated
by reactor based on reactor's interrupt mode when
the spdk_thread is scheduled or the reactor is set
into interrupt mode.
Change-Id: Ibeef7ffb759589a7b372bd78e59e3410be061383
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6709
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_thread_set_interrupt_mode can get spdk_thread run
between intr and poll mode. It is only valid when thread
interrupt facility is enabled by
spdk_interrupt_mode_enable(). Currently, this function
is limited that no poller is registered to the spdk_thread.
Change-Id: Iba54accd5976beb6f6e155014903928ce2858e36
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6708
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
All the other RPCs use underscores in the names of their fields, so
`framework_get_scheduler` should also use them instead of the spaces.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0e9edd7c59a4ab61643a7b558a2359e1805ed0b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7557
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
DPDK recently clarified some semantics on the rte_devargs 'data'
and 'args' fields. This actually breaks our use of the 'data'
field to store the 2 second timeout timestamp for delaying
attach to newly inserted devices. Investigating this further,
it does not seem our use of the 'data' field was valid - it just
happened to work until now.
We could use the 'args' field now. But knowing whether to use
'args' or 'data' would then be dependent on the DPDK version.
We cannot use RTE_VERSION_NUM to decide, because this is a
compile time decision, and it is possible in shared library
use cases that we could actually link and execute against a
different version of DPDK than we built against.
So instead we will create our own env_devargs structure that
will store these allowed_at timestamps. Currently it's just
a linked list (which is exactly how DPDK does it) - we could
make it more optimal with a hash table down the road, but this
code only executes when we are doing PCI enumeration so it is
not performance critical.
Fixes#1904.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ee5d65ba90635b5a96b97dd0f4ab72a093fe8f7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7506
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Before this patch blob persist wrote out all allocated extent pages.
Intended design was to write out extent pages in two cases:
1) Thin provisioned blobs to write out extent pages when necessary
during cluster allocation.
2) Thick provisioned blobs to write extent pages during blob persist
when the blob was resized
This patch implements 1) by inserting extent before issuing blob persist
in cluster allocation path.
See blob_persist_extent_page_cpl() and blob_insert_new_ep_cb().
Blob persist might have to rewrite the last extent page after blob resize.
See blob_persist_start().
Meanwhile 2) was incorrecly implemented since it always re-wrote all
extent pages starting from 0. This was addressed by limiting number
of extent pages written, only to ones that were resized.
Some considerations were needed:
a) blob resize happen on cluster granularity, it might be needed to re-write
last extent page if resize was not large enough to change number of extent pages
b) first extent page to write should be based on the num_extent_pages from
active or clean, depending on resize direction
See blob_persist_start().
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ibba9e3de3aadb64c1844a462eb0246e4ef65d37f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7202
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
When both clone and snapshot had already extent pages
corresponding to the same region in cluster map,
the clone extent page was replaced with one from snapshot.
This was incorrect and would result in loss of clusters
from clones extent page. It did not occur in practice
because all extent pages were rewritten anyway during
md sync. Cluster map was correct so updated extent pages
were too.
Cluster map correctness is verified in UT _blob_inflate_rw(true),
at the very end when checking data consistency of inflated blob.
This patch writes out the updated extent page explicitly.
So it would be possible to skip wirting out extent pages
during md sync later in the series.
Note 1)
At this point in series the extent page is written here,
and in blob persists. The later will be removed later in
series.
Note 2)
Errors during updating extent pages are not accounted for,
but neither does syncing them in blob persist.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I7deac3c64299f33f8df49e860af1a16295c074e6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
The blob_insert_extent() name was confusing, since the function
was actually responsible for writting out the extent page to disk.
Changed to a more fitting name.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ia312b0ef152100f30d5a1bfe123e55135c8afa6e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7561
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
This patch does not change functionality. It separates
three stages of updating clone during snapshot deletion:
- updating cluster map
- updating extent pages
- removing backing device from clone
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I44869f3be596d9d0f06db4acedfdd7e1500516ff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7437
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>