Previously, when connecting a qpair, we allocated per-qpair stats if no
poll group was used, and pointed the qpair at the poll group's stats otherwise.
Then, when deleting the qpair, we freed the per-qpair stats if allocated.
However, if the qpair is still not completely disconnected after it is
removed from the poll group, pqpair->stat is left dangling, and the
resulting use-after-free causes a segmentation fault.
To fix this issue, we set pqpair->stat to &g_dummy_stats instead.
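The idea can be sketched as follows. This is a minimal illustration,
not the driver's code: every `stat_demo_*` name and the single counter
are hypothetical, and the only thing taken from the description above is
that the stat pointer is redirected to a static dummy object instead of
being left pointing at memory that may be freed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct stat_demo_stats {
	uint64_t submitted_requests;
};

struct stat_demo_qpair {
	struct stat_demo_stats *stat;
};

/* Shared sink for counter updates that arrive after the real stats are gone. */
static struct stat_demo_stats g_stat_demo_dummy;

/* Attach: the qpair borrows the poll group's stats. */
static void
stat_demo_attach(struct stat_demo_qpair *qpair, struct stat_demo_stats *group_stats)
{
	assert(group_stats != 0);
	qpair->stat = group_stats;
}

/* Detach: never keep a pointer into memory the group may free. */
static void
stat_demo_detach(struct stat_demo_qpair *qpair)
{
	qpair->stat = &g_stat_demo_dummy;
}

/* Data path: unconditional update, no NULL or poll group check needed. */
static void
stat_demo_submit(struct stat_demo_qpair *qpair)
{
	qpair->stat->submitted_requests++;
}
```

With this shape, a submission that races with teardown increments the
dummy counter, which is harmless, rather than writing through a stale
pointer.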
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibf303e6db5176e93ed75cbe3a414bb923d6e3ab6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10845
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The NVMe bdev module enables asynchronous I/O qpair creation by default. After
calling `spdk_nvme_ctrlr_alloc_io_qpair` and `spdk_nvme_ctrlr_connect_io_qpair`,
the queue pair starts out in the connecting state. Users may then call
`spdk_nvme_ctrlr_free_io_qpair` immediately, and the common layer will
change the queue state to NVME_QPAIR_DISCONNECTING and NVME_QPAIR_DESTROYING.
As a result, the workaround in `nvme_pcie_ctrlr_delete_io_qpair` that waits
for the create CQ/SQ callbacks is never executed. Instead of using the common
layer queue state here, we should use the internal `pcie_state`.
Fixes issue #2245.
Change-Id: I801caf26563464b135035bf7fa2f63def13de9f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10445
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The qpair's state member is only 3 bits of a uint8_t,
and the in_completion_context bit is another bit in that
same uint8_t.
We know that the qpair's state is only ever updated by
one thread, but it is possible that the state could
be modified by one thread, while another thread
is modifying in_completion_context.
in_completion_context is only modified by the thread
that is polling the qpair (or the qpair's poll group).
But with async mode, another thread that has a qpair
on the same PCIe controller could poll its adminq and
reap the SQ completion for the qpair that's owned by
the other thread.
So do *not* set the generic qpair state to CONNECTED
from the SQ completion callback. Instead just set
the pcie_state to READY, and let the thread that owns
the qpair detect the qpair is READY and set the state
to CONNECTED itself.
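A minimal sketch of this split, assuming a simplified qpair layout. The
`own_demo_*` names and enum values are hypothetical. The sketch is
single-threaded and says nothing about memory ordering; it only shows
which function is allowed to write which field.

```c
#include <assert.h>
#include <stdint.h>

enum own_demo_state { OWN_DEMO_CONNECTING = 1, OWN_DEMO_CONNECTED = 2 };
enum own_demo_pcie_state { OWN_DEMO_PCIE_WAIT_FOR_SQ = 0, OWN_DEMO_PCIE_READY = 1 };

struct own_demo_qpair {
	/* These two bitfields share one byte, so each write is a read-modify-write. */
	uint8_t state : 3;
	uint8_t in_completion_context : 1;
	/* Separate, non-bitfield member: writing it does not touch the byte above. */
	uint8_t pcie_state;
};

/* May run on whichever thread happens to poll the admin queue. */
static void
own_demo_create_sq_done(struct own_demo_qpair *qpair)
{
	qpair->pcie_state = OWN_DEMO_PCIE_READY;
}

/* Runs only on the thread that owns the qpair. */
static void
own_demo_process_completions(struct own_demo_qpair *qpair)
{
	assert(!qpair->in_completion_context);
	qpair->in_completion_context = 1;
	if (qpair->state == OWN_DEMO_CONNECTING &&
	    qpair->pcie_state == OWN_DEMO_PCIE_READY) {
		/* Only the owner promotes the generic state. */
		qpair->state = OWN_DEMO_CONNECTED;
	}
	qpair->in_completion_context = 0;
}
```

The callback never writes the byte that holds `state` and
`in_completion_context`, so it cannot clobber an update the owning
thread is making to either bitfield.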
Fixes issue #2157.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9efc0c954504f1841e1c3890ae78211ad0d1990e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9975
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This helps with binding trace objects together and
is more convenient (all trace definitions are in one place
instead of being scattered across multiple files).
Change-Id: Ib15bc9c2eeee9c4d0816bcee509ab69f3f558e19
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9574
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
John Kariuki tested this patch on a system with
several Intel P5800X Optane SSDs, to determine the
performance impact of adding these two
spdk_trace_record() calls in the main NVMe I/O path.
The pathological case (512B random reads on a single
Xeon core) decreased from 13.10M to 12.88M IOPS, or 1.7%.
Normal workloads (4KB+) would incur a smaller penalty
since the I/O rate would be much lower - maybe even
unnoticeable.
This is a really valuable tracepoint to have enabled
by default, so I think this small amount of degradation
is acceptable.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie2543cadf3541eb74398d31ac0f495522ab49ec0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9303
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This new API signals that the ctrlr will soon be
reset. This allows the transport to skip unnecessary
steps in following calls to the driver prior to the
reset - for example, skipping PCIe DELETE_SQ/CQ
commands when freeing an IO qpair.
Note that if we are deleting a qpair after
prepare_for_reset was called, and the qpair is
still waiting for a CREATE_IO_CQ or CREATE_IO_SQ,
we cannot poll for those commands to complete,
but we also cannot free the qpair immediately.
So set a flag for this case to defer the
destruction until the outstanding CREATE_IO_CQ or
CREATE_IO_SQ callback is invoked (typically as an
aborted command when the reset happens).
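The deferral can be sketched roughly as below. All `defer_demo_*` names
are hypothetical, and a `destroyed` flag stands in for actually freeing
the qpair. The non-reset path, where the driver can wait for the create
command to complete, is collapsed into an immediate destroy here.

```c
#include <assert.h>
#include <stdbool.h>

struct defer_demo_qpair {
	bool create_outstanding;  /* CREATE_IO_CQ or CREATE_IO_SQ still in flight */
	bool defer_destruction;   /* destroy from the create callback instead */
	bool destroyed;           /* stands in for actually freeing the qpair */
};

static void
defer_demo_destroy(struct defer_demo_qpair *qpair)
{
	assert(!qpair->destroyed);
	qpair->destroyed = true;
}

/* Called when the user frees the qpair. */
static void
defer_demo_delete(struct defer_demo_qpair *qpair, bool prepared_for_reset)
{
	if (prepared_for_reset && qpair->create_outstanding) {
		/* Cannot poll for the create command, and cannot free yet. */
		qpair->defer_destruction = true;
		return;
	}
	defer_demo_destroy(qpair);
}

/* Completion callback for the create command (possibly as an abort). */
static void
defer_demo_create_done(struct defer_demo_qpair *qpair)
{
	qpair->create_outstanding = false;
	if (qpair->defer_destruction) {
		defer_demo_destroy(qpair);
	}
}
```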
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I34c6276ae71e7d61ad4a3720f1a985b1ee96bd8b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9249
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
When deleting an IO qpair, make sure that its connection process is
finished (i.e. create CQ/SQ commands are completed) before freeing it.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I487dcef390d73ff4a7264ff97d965c9030916840
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9279
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The pcie layer can't always detect bad addresses
in the request at submission time - for example,
the transport may not have any trackers available
and the request gets queued at the generic
nvme level.
So this means that we might detect vtophys failures
during submission time, or in a process_completions
context - the latter happening when we complete
one request which triggers submitting a new request.
Currently if the vtophys failure happens during
submission context, we return -EFAULT to the
caller *and* call the completion callback. Nowhere
else in the driver do we do both - the intention
has always been that you get one or the other.
So make all of this consistent by tagging the
tracker and the qpair with a flag if we hit a vtophys
error in the submission path. Return 0 to the caller,
who will then later get a completion callback for the
bad request when the qpair is next processed for
completions.
I considered a separate TAILQ to hold these 'bad'
trackers, but that would have required duplicating
quite a bit of the tracker completion code for this
one case. The flag on the pqpair is already in the
hot cacheline, so it's cheap to check it. We will
only iterate the outstanding_tr list when that flag
is set, so this should have zero impact on performance.
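A rough sketch of the flag-based approach, under simplifying
assumptions: a fixed array stands in for the outstanding tracker list, a
NULL buffer stands in for an address that fails translation, and all
`vt_demo_*` names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define VT_DEMO_MAX_TRACKERS 4

typedef void (*vt_demo_cpl_cb)(void *ctx, int status);

struct vt_demo_tracker {
	bool in_use;
	bool bad_vtophys;
	vt_demo_cpl_cb cb;
	void *cb_ctx;
};

struct vt_demo_qpair {
	bool has_pending_vtophys_failures;  /* cheap check in the hot path */
	struct vt_demo_tracker tr[VT_DEMO_MAX_TRACKERS];
};

/* Example callback: records the completion status in an int. */
static void
vt_demo_record_status(void *ctx, int status)
{
	*(int *)ctx = status;
}

/* Returns 0 even on a translation failure; the error arrives via callback. */
static int
vt_demo_submit(struct vt_demo_qpair *q, const void *buf, vt_demo_cpl_cb cb, void *ctx)
{
	assert(cb != NULL);
	for (size_t i = 0; i < VT_DEMO_MAX_TRACKERS; i++) {
		struct vt_demo_tracker *tr = &q->tr[i];

		if (tr->in_use) {
			continue;
		}
		tr->in_use = true;
		tr->cb = cb;
		tr->cb_ctx = ctx;
		tr->bad_vtophys = (buf == NULL);  /* stand-in for a vtophys failure */
		if (tr->bad_vtophys) {
			q->has_pending_vtophys_failures = true;
		}
		return 0;
	}
	return -ENOMEM;  /* no tracker free; the real driver queues the request */
}

/* Completes only the tagged trackers; returns how many were completed. */
static int
vt_demo_process_completions(struct vt_demo_qpair *q)
{
	int completed = 0;

	if (!q->has_pending_vtophys_failures) {
		return 0;
	}
	for (size_t i = 0; i < VT_DEMO_MAX_TRACKERS; i++) {
		struct vt_demo_tracker *tr = &q->tr[i];

		if (tr->in_use && tr->bad_vtophys) {
			tr->in_use = false;
			tr->bad_vtophys = false;
			tr->cb(tr->cb_ctx, -EFAULT);
			completed++;
		}
	}
	q->has_pending_vtophys_failures = false;
	return completed;
}
```

The caller sees exactly one outcome per request: either a nonzero return
from submit, or a completion callback later.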
Fixes issue #2085.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I60b135fb32d899188e51545b69feb1b27758fd7f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9234
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The async_mode option is currently supported in the PCIe transport layer
to create an I/O qpair asynchronously. The user polls the io_qpair for
completions; after create CQ and create SQ complete, in that order, the
pqpair is set to the READY state. I/O submitted before the qpair is ready
is queued internally. Other transports currently support only
synchronous I/O qpair creation.
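The ordering can be pictured with a small state machine. This is a
sketch only: the `async_demo_*` names are hypothetical, counters stand
in for real request queues, and queued I/O is flushed directly from the
SQ callback for brevity.

```c
#include <assert.h>
#include <stddef.h>

enum async_demo_pcie_state {
	ASYNC_DEMO_WAIT_FOR_CQ,
	ASYNC_DEMO_WAIT_FOR_SQ,
	ASYNC_DEMO_READY,
};

struct async_demo_qpair {
	enum async_demo_pcie_state pcie_state;
	size_t queued;     /* I/O held back until the qpair is ready */
	size_t submitted;  /* I/O handed to the (imaginary) hardware */
};

/* The CQ must exist before the SQ that posts completions to it. */
static void
async_demo_create_cq_done(struct async_demo_qpair *q)
{
	assert(q->pcie_state == ASYNC_DEMO_WAIT_FOR_CQ);
	q->pcie_state = ASYNC_DEMO_WAIT_FOR_SQ;
}

static void
async_demo_create_sq_done(struct async_demo_qpair *q)
{
	assert(q->pcie_state == ASYNC_DEMO_WAIT_FOR_SQ);
	q->pcie_state = ASYNC_DEMO_READY;
	/* Flush everything that was queued while connecting. */
	q->submitted += q->queued;
	q->queued = 0;
}

static void
async_demo_submit_io(struct async_demo_qpair *q)
{
	if (q->pcie_state != ASYNC_DEMO_READY) {
		q->queued++;
		return;
	}
	q->submitted++;
}
```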
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib2f9043872bd5602274e2508cf1fe9ff4211cabb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8911
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The generic transport layer still does a busy wait, but at least
the logic in the PCIe transport now creates the queue pair
asynchronously.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Change-Id: I9669ccb81a90ee0a36d3f5512bc49c503923b293
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8910
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
.ctrlr_connect_qpair
Previously this was assumed to be a synchronous process so the generic
layer transport code updated the state after .ctrlr_connect_qpair
returned. In preparation for making this support asynchronous mode,
shift that responsibility down into the individual transports.
While none of the transports actually do this asynchronously, insert a
busy wait in nvme_transport_ctrlr_connect_qpair to wait for the qpair to
exit from the CONNECTING state. None of the upper layer code can
actually correctly handle a transport doing this asynchronously, so the
busy wait will cover that.
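In outline, the busy wait looks something like the sketch below. The
`wait_demo_*` names are hypothetical, and a countdown simulates a
transport that needs a few polls before the connection completes.

```c
#include <assert.h>

enum wait_demo_state {
	WAIT_DEMO_DISCONNECTED,
	WAIT_DEMO_CONNECTING,
	WAIT_DEMO_CONNECTED,
};

struct wait_demo_qpair {
	enum wait_demo_state state;
	int polls_until_ready;  /* simulates asynchronous transport work */
};

/* Transport hook: starts the connect and owns the state transition. */
static int
wait_demo_transport_connect(struct wait_demo_qpair *q)
{
	q->state = WAIT_DEMO_CONNECTING;
	return 0;
}

/* Transport hook: makes progress each time it is polled. */
static void
wait_demo_transport_poll(struct wait_demo_qpair *q)
{
	if (q->state == WAIT_DEMO_CONNECTING && --q->polls_until_ready <= 0) {
		q->state = WAIT_DEMO_CONNECTED;
	}
}

/* Generic layer: still synchronous from the caller's point of view. */
static int
wait_demo_connect_qpair(struct wait_demo_qpair *q)
{
	int rc = wait_demo_transport_connect(q);

	if (rc != 0) {
		return rc;
	}
	while (q->state == WAIT_DEMO_CONNECTING) {
		wait_demo_transport_poll(q);
	}
	assert(q->state != WAIT_DEMO_CONNECTING);
	return q->state == WAIT_DEMO_CONNECTED ? 0 : -1;
}
```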
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3c1a5c115264ffcb87e549765d891d796e0c81fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8909
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It is not uncommon for delete_io_qpair to fail, for
example when a controller is hot removed. So even
if SQ or CQ deletion fails, continue with freeing
resources and report success back up the stack.
There is really nothing the application can do to
account for this failing anyway.
Upcoming patches will add additional checks to
ensure failing delete_io_qpair status never gets
propagated to the caller.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iac007c1eba30f7a8c4936b3ffb6c837f28ee12ae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8658
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
And for some internal functions we need to pass controller
parameter so that we can do vtophys based on transport type.
Change-Id: I3ca4fa162ec9305f62b295ba21f7474c21edfe52
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8031
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The inline function can also be used in the coming submit request
function.
Change-Id: If4a5511001e6586dbce0978298beddc537f54d8b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8173
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Both PCIe and VFIOUSER can use this function. The only difference
is that VFIOUSER should use IOVA=VA for the vtophys translation, so
as a first step we move the function to the common PCIe layer.
Change-Id: I699edb67a00a2fa534072fc02ac2dd4a27aba8f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8030
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Check whether the qpair has a poll group during the connect process,
and either use the poll group's statistics or allocate a separate
structure for the qpair. This is done because not all applications use poll
groups and we want to avoid "if (qpair->group)"
conditions in the data path.
The admin qpair always allocates its own statistics
structure, but those statistics are not reported
since this qpair is not attached to a poll group.
Statistics are reported by the spdk_nvme_perf tool
when --transport-stats is specified, and by the
bdev_nvme_transport_statistics RPC method.
Change-Id: I58765be161491fe394968ea65ea22db1478b219a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6304
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The shadow registers need to be zero when the qpair is
created. This happens automatically when a given qid
is used for the first time, since the page is allocated
with zmalloc. But if a qid is reused, we need to make
sure its shadow registers are cleared *before* we create
the qpair again with the same qid.
So clear the registers in nvme_pcie_ctrlr_delete_io_qpair,
just after the cq is deleted.
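A small sketch of the clearing step, assuming a hypothetical layout with
two 32-bit slots per queue ID (SQ tail, then CQ head) and a doorbell
stride of one. The `sdb_demo_*` names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define SDB_DEMO_MAX_QID 4

/* Hypothetical layout: two 32-bit slots per queue ID, SQ tail then CQ head. */
struct sdb_demo_ctrlr {
	uint32_t shadow_doorbell[2 * (SDB_DEMO_MAX_QID + 1)];
};

static uint32_t *
sdb_demo_sq_tail(struct sdb_demo_ctrlr *ctrlr, uint16_t qid)
{
	assert(qid <= SDB_DEMO_MAX_QID);
	return &ctrlr->shadow_doorbell[2 * qid];
}

static uint32_t *
sdb_demo_cq_head(struct sdb_demo_ctrlr *ctrlr, uint16_t qid)
{
	assert(qid <= SDB_DEMO_MAX_QID);
	return &ctrlr->shadow_doorbell[2 * qid + 1];
}

static void
sdb_demo_delete_io_qpair(struct sdb_demo_ctrlr *ctrlr, uint16_t qid)
{
	/* The SQ and CQ delete commands would be issued before this point. */

	/* Reset the slots so a later qpair reusing this qid starts from zero. */
	*sdb_demo_sq_tail(ctrlr, qid) = 0;
	*sdb_demo_cq_head(ctrlr, qid) = 0;
}
```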
Fixes issue #1795.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I08c30d1ea248559a01b802cd132dd57199b491b5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6752
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The data paths for the PCIe and vfio-user transports are almost
the same too, so move the code from nvme_pcie.c to nvme_pcie_common.c,
so that these APIs can be reused by vfio_user.
No logic change in this patch.
Change-Id: I82f480bba3bae0ce35e2a98f29839081095f7d50
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6040
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Libvfio-user assumes the memory translation is in IOVA=VA mode.
Since SPDK CI runs inside a VM, the memory mode there is IOVA=PA,
so spdk_vtophys doesn't work with libvfio-user when testing the
NVMe vfio-user transport inside a VM. So here
we add a function to return the memory address based on TRTYPE.
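Such a helper could look roughly like this. It is a sketch with
hypothetical `addr_demo_*` names; the "physical" translation is a
placeholder that adds a fixed offset and is not how a real
virtual-to-physical lookup works.

```c
#include <assert.h>
#include <stdint.h>

enum addr_demo_trtype {
	ADDR_DEMO_TRTYPE_PCIE,
	ADDR_DEMO_TRTYPE_VFIOUSER,
};

#define ADDR_DEMO_FAKE_PA_OFFSET 0x100000u

/* Placeholder for a real virtual-to-physical lookup. */
static uint64_t
addr_demo_fake_vtophys(const void *buf)
{
	return (uint64_t)(uintptr_t)buf + ADDR_DEMO_FAKE_PA_OFFSET;
}

/* Picks the address the device should see, based on the transport type. */
static uint64_t
addr_demo_translate(enum addr_demo_trtype trtype, const void *buf)
{
	if (trtype == ADDR_DEMO_TRTYPE_VFIOUSER) {
		/* IOVA=VA: the virtual address is used as-is. */
		return (uint64_t)(uintptr_t)buf;
	}
	assert(trtype == ADDR_DEMO_TRTYPE_PCIE);
	return addr_demo_fake_vtophys(buf);
}
```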
Change-Id: I11d1c87197f7bbfc243b6bf368795c9a74bd1303
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5958
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There are some common data structures and APIs in the pcie transport
that can be used by both the pcie and vfio-user transports, so move
the common code into a new header and source file.
No actual logic change, just code movement, except for removing the
static function declarations.
Change-Id: Ie9021e703a5780fdd6840f0e3cfea76a0017a811
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5923
Community-CI: Broadcom CI
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>