Use uint32_t to avoid overflow when left shift by 16 places
This patch fix issue #2858
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I07f4328674ae7bd7525792ca1e424e85a932c87f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16180
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
nvmf_ctrlr_cmd_connect() can only handle a request in one buffer
(req->data); sanity check it's not split across IOVs.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I595d8542ce71e56cf2b074f4cf41bce440f6dc26
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16123
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This code has a similar potential problem as the identify
and log page commands did: stop using req->data in favour of IOVs.
We also need to fix the unit tests to initialize the iovs.
We don't change the existing "set" behaviour of requiring a single IOV
here.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I257567a7abd5fc3ed9ee21b432c7da7d70fbbde0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16122
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In the previous fix:
adc2942ad nvmf: nvmf_ctrlr_get_log_page use iovs to store the log page
a data corruption bug in the log page code was fixed. Previously, it
used req->data, which may be too short a buffer in the case that the
buffer is split across more than one IOV. req->data is never safe to use
in this situation. The code was changed to use the provided iovs instead
of req->data.
However, the identify command handling was still vulnerable to this
problem, and has been seen in real life at least with a CentOS guest VM.
The fix is basically the same: use the IOV utility functions to write
out the response instead of directly trying to use req->data.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I00445895af20e43be73189629576eee0667f86dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16121
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Move the IOV handling code in ctrlr.c to the top of the file, for
subsequent use.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ibddde1cb964d8aaecf4673ffa6d4147d0a48020c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16120
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add a define for the Identify command buffer instead of using a raw
value.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I9073ff84e2fa2ef9268051b898fe1027d8e97baa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16119
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Basic IO completion counting can be done at the common
layer, to enable some level of stat tracking even for
transports that don't have transport-specific tracking
yet.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If04f854b97440089b8ad149b64cb59173c73975c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15912
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
In _free_ctrlr(), ->endpoint can never be NULL, and the code was
self-contradictory; assume it's not NULL.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I81a449123ca05f64460380dc3a8ad8af2143d166
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15831
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Use standard "sqid" naming for a log message.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Icca8415cd17272ca7bd82667721c4131dd1df7f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15828
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Both the length of a request and the number of ranges to copy are
controlled by the user, so we should check them and return an error
instead of asserting that they're correct.
This fixes the `test/nvmf/target/fabrics_fuzz.sh` test.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I3481c4bb1f2c7676df81f41dfc95ef063924222e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15805
Reviewed-by: Pawel Piatek <pawelx.piatek@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Found with misspell-fixer.
Signed-off-by: Michal Berger <michal.berger@intel.com>
Change-Id: If062df0189d92e4fb2da3f055fb981909780dc04
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15207
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If a lot of qpairs are connected all at once, the
RDMA optimal_poll_group logic does not work correctly,
because it only accounts for qpairs that received
their CONNECT capsule. Now that we have a counter
for a poll group's unassociated qpairs, use that value
to supplement the current io qpair count.
We can just assume for now that all of these unassociated
qpairs are io qpairs. That won't always be true, but
for purposes of picking the optimal poll group it is
sufficient.
Note that for RDMA, we could increment the counters
based on the RDMA qpair ID in the private data in the
rdmacm connect, but to keep the code simpler and common
across all transports, we defer the accounting until
after receiving the CONNECT command, so that it is
the same for all transports.
Fixes issue #2800.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5897d6ebac23d3b78b100e3fef5a7f9fb5304820
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15695
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use a local variable to hold the qpair count.
While here, also use pg_current to get the min_value,
this is a bit simpler to read than things like
(*pg)->group.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I65771fb469f021e9e77b8a6c117841b8f4b66af5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15694
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
We make decisions on how to pick a poll group for a new
qpair by looking at each poll group's current_io_qpairs
count. But this count isn't always accurate since it
doesn't get updated until after the CONNECT has
been received.
This means that if we accept a bunch of connections
all at once, they may all get assigned the same poll
group, because the target poll groups counter doesn't
get immediately incremented.
So add a new counter, current_unassociated_qpairs,
to account for these qpairs. We protect this counter
with a lock, since the accept thread will increment
the counter, and the poll group thread will
decrement it when the qpair receives the CONNECT
allowing us to associated with a subsystem/controller..
If the qpair gets destroyed before the CONNECT is
received, we can use the qpair->connect_received
flag to decrement current_unassociated_qpairs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8bba8da2abfe225b3b9f981cd71b6f49e2b87391
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15693
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Currently we use qpair->ctrlr at qpair destroy
time to decide if we need to decrement the
qpair's poll group's qpair count. But this is
not correct - these counters get incremented
when the CONNECT is received, but qpair->ctrlr
doesn't get set until later.
So add a new connect_received bool to the spdk_nvmf_qpair.
Use this instead to determine when we should decrement
the poll group qpair counters.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I174a0fda36c4558171953bf58f2f5117bc074f76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15692
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
At least recent Linux guest VMs send SPDK_NVME_IDENTIFY_CTRLR_IOCS as a
matter of course. While this isn't supported in lib/nvmf, as this
doesn't represent an error, reduce the log level of the error message so
we don't spam the logs.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I095de3e4331b3912cbc457da6d722b9883ec7884
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15646
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
Optimal I/O boundary causes I/O to be split in the nvme driver. This is
a problem for writes if write_unit_size > 1 because the split I/O may
not match the write_unit_size.
Fixes: #2791
Change-Id: I437e6cb6d8e2415658d5b46539feeacb5363fd46
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15627
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
NVMf target reports copy command support if all bdevs in the subsystem
support copy IO type. Maximum copy size is reported for each namespace
independently in namespace identify data. For now we support just one
source range.
Note, that command support in the controller is initialized once on
controller create. If another namespace which doesn't support copy
command is added to the subsystem later, it will not be reflected in
the controller data structure and will not be communicated to the
initiator. Attempt to execute copy command on such namespace will
fail. This issue is not specific to copy command and applies also to
write zeroes and unmap (dataset management) commands.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I5f06564eb43d66d2852bf7eeda8b17830c53c9bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14350
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The correct SPDK thread is already contained in the poll group.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I4eefe2ba60c77c01a866a693bccbb8affc8262ed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15546
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If the guest performs a hard shutdown we're not deleting the CQs:
nvmf_vfio_user_close_qpair calls delete_sq_done, which won't delete
the CQ because vu_ctrlr->reset_shn is false.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I383fb985340a0d9d0eb7fea7403372cbdc55a089
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15387
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This eventfd may be passed by libvfio-user to the remote process which
might remove the EFD_NONBLOCK flag, in which case we would block
indefinitely.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: If9826cd700b4a7b3458a0a8278a96322d99ac08e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15385
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
per Intel policy to include file commit date using git cmd
below. The policy does not apply to non-Intel (C) notices.
git log --follow -C90% --format=%ad --date default <file> | tail -1
and then pull just the 4 digit year from the result.
Intel copyrights were not added to files where Intel either had
no contribution ot the contribution lacked substance (ie license
header updates, formatting changes, etc). Contribution date used
"--follow -C95%" to get the most accurate date.
Note that several files in this patch didn't end the license/(c)
block with a blank comment line so these were added as the vast
majority of files do have this last blank line. Simply there for
consistency.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id5b7ce4f658fe87132f14139ead58d6e285c04d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
It doesn't make sense to have the size of the doorbells fixed and then
calculate the maximum number of queue pairs based on it, do it the other
way round. Also, add some sanity checks based on the spec.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I17e3509fb0a011128ca089ce78b7a296262e6f8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Some reset/disable paths are freeing the shadow doorbells without
switching the SQs back to BAR0. Fix this up, and add a small cleanup
when initializing the shadow doorbells.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ia5e5b91b7dc696a558eb0ad59cc554abced47cca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14901
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
To support SQs allocated to a poll group other than the controller's
main poll group, we need to make sure to poll those SQs when we wake up
and handle the controller interrupt. As they will be running in a
separate SPDK thread, we will arrange for all poll groups to wake up
when we receive an interrupt corresponding to a vfio-user message
arriving.
This can mean needless wakeups: we don't (yet) have a mechanism to only
wake up the poll groups that correspond to a particular SQ write.
Additionally, as we don't have any notion of a poll group per
controller, this ends up polling all SQs in the entire poll group, not
just the ones corresponding to the controller we were handling.
As this has potential performance issues in many cases, it defaults to
disabled.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I3d9f32625529455f8d55578ae9cd7b84265f67ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14120
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
max_aq_depth should be not smaller than 2 or greater
than 4096
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I205fbb4345cfdc41ebaf30c953da263fe9f0e9a8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14691
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
max_queue_depth should be not smaller than 2 or greater
than 65536
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I0f2a4b8df6eb1b140a11936fc6929f1285a7d717
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14619
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Refine the macro definition name about queue depth and
prepare for next patch.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I85bee2528ae4ab70292fc11aa62d05bae0c28a77
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14664
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
When Link Time Optimization is enabled, compiler can sometimes produce
additional warnings saying that some variables may be uninitialized.
To supress the warning it is enough to add explicit initialization
of the variable causing the issue, in this case 'iovcnt = 0'.
Signed-off-by: Szulik, Maciej <maciej.szulik@intel.com>
Change-Id: I080b20a6008643ae78c8e3a6c2d183193ef6c1bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14674
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Community-CI: Mellanox Build Bot
When data_local.num_async_events >
SPDK_NVMF_MIGR_MAX_PENDING_AERS, data_local.async_events
was already indexed by 256, and it was out of bounds.
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Change-Id: I15cfdeb9bc165de0c73fbc9171b0ce6d8689c0aa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14666
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
In-capsule data length should be the same with the SGL data length.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I7eefecb8baebb76850a48689907aff27a8946f98
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14602
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Fixed error handles which are violated with spec:
1. 'data length > MAXH2CDATA' is a fatal error.
2. 'ICDOFF != 0' should abort the IO.
Other errors which are not defined in spec:
1. invalid sgl type
2. In-capsule Data length > In-capsule Data size
Because this function runs before data part receiving, it is hard
to skip the following data segment if we want to handle some error
as non-fatal.
Currently, we have to handle all undefined errors as fatal errors.
I think after this release, we can change receving process. This will
be helpful for error handling. But this work is not small.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I8fc0d2d743505e49a93be19fd217e7ad6ca06622
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14580
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
During fuzzing vfio-user client and server are started from same
process causing deadlock. SO_PEERCRED return pid of process
connected to vfio endpoint.
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I6fc2db5d58a459a30fec116a9de3c69d48acf75e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14559
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
in_capsule_data_size should not be larger than max_io_size.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I636724c888b9e5abc4cffac96bff24021e172498
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14618
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
hpda value should be in range of 0 to 31.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: Ie1329c831af06ccc8943a562c3f6396b635be518
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14575
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This function is small and called only once.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: Ie4b11668e42a8920b3a9a11aa8cb83512f32942c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14576
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
When emitting the JSON-RPC text for saving the
current configuration, add the listeners last.
This is usually the preferred order when
configuring a new subsystem - it is better to have
all of the namespaces and hosts added to the subsystem
before adding the listener to allow hosts to connect
to it. We support namespace hotplug but there's
no need to unnecessarily generate hotplug events
if we can avoid it.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I79e8a0a496eeb128efbb7e314ac835b6110d3cc8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14586
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This function is called only once and can be eliminated.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I0b3e80c025b60a816e2113f859907f95e96dd183
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14578
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
'nvmf_tcp_pdu_payload_insert_dif' can be done after receiving
whole payload data as an optimization.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I3054079427c25d102477ef8ec1b288631741d7a3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14577
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
RDMA transport registers MRs for in-capsule
data buffers, commands and completions. Since
these structures are allocated using huge pages,
MR for these buffers are already registered, we
only need to translate addresses.
Signed-off-by: Aleksey Marchuk <alexeymar@nvidia.com>
Change-Id: I90c53d8276d72077f7983e9faf9160e9ede52a7d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14430
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When doing live migration, there are some spdk_nvmf_ctrlr internal
data structures which need to be saved/restored, these data
structures are designed only for vfio-user transport, for
the purpose to extend them to support other vendor
specific transports, here we move them as public APIs,
users can use SAVE|RESTORE to restore a new nvmf controller
based on original one.
And remove the register from vfio-user transport, these registers
are stored in the common nvmf library.
Change-Id: I9f5847ef427f7064f8e16adcc963dc6b4a35f235
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11059
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Adding `psk` field to `spdk_nvme_ctrlr_opts`
Adding `psk` parameter to `bdev_nvme_attach_controller` RPC
Change-Id: Ie6f0d8b04ce472e6153934e985c026acded6cdfc
Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14046
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
While waiting for a new PDU, target will not do too many useless
memcpy.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: Ie0825c2b1e44444b210040c4a1761010e0e4cfe5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
We already define a convenient variable for the admin CQ: use it.
Suggested-by: Alexis Lescouet <alexis.lescouet@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: If6570f30844a52113633bdb5f3543eec700f05d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14391
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The word engine was both used (interchangeably with module) to refer to
the things that plug into the framework and to the framework itself.
This patch eliminates all use of the word engine that meant the
framework. It leaves uses of the word that meant "module".
Change-Id: I6b9b50e2f045ac39f2a74d0152ee8d6269be4bd1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13918
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
If initiator and target run on the same application, and initiator
uses SRQ, target may get async events for initiator, e.g.,
IBV_EVENT_QP_LAST_WQE_REACHED unexpectedly.
The reason is initiator and target may use the same device
simultaneously and only target polls async events.
Target sets attr.qp_context to rqpair when creating QP, but initiator
sets attr.qp_context to NULL when creating QP.
Hence one simple fix is to ignore async events whose qp_context is
NULL.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id9ead1934f0b2ad1e18b174d2df2f1bf9853f7e1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14297
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Multiprocess is only supported by a few libraries (e.g. NVMe driver).
Other libraries that don't support it will often fail on mempool
initialization when running as a secondary process, as the mempools are
already created by the primary process. But the error messages are
vague and don't indicate why this happened. So, this patch adds a check
to see if a mempool exists after spdk_mempool_create() fails and prints
an error message informing users that multiprocess is unsupported.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6f915a94266e64dda380e3b269424cc579372a10
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14234
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Fixes #issue 2636.
The existing allocation method (nvmf_rdma_get_optimal_poll_group())
is traversal and unperceived link disconnection. A more fair method
considering the number of real-time connections to allocate a poll
group is implemented.
Signed-off-by: liuqinfei <18138800392@163.com>
Signed-off-by: luo rixin <luorixin@huawei.com>
Change-Id: Ic1e6283e386dbb0dd6655bedebe26aeedb16c333
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14002
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If the queue was on another poll group, we need to send a message back
to the admin CQ's thread to post the completion from the correct
context.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I997987d5d6b822a1a5124f54fc29ce5d7f03190d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14057
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
The public interface of lib/accel is now include/spdk/accel.h
Change-Id: Id94f623a494eb1b524b060f4413f633073ea7466
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13916
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
There is a race condition if we call this function in the
polling mode when running with multi-cores, same as other
places where the function is called, we only kick controller
in interrupt mode, also in `vfio_user_ctrlr_intr`,
`ctrlr->sqs[0]` may be set to NULL after the controller
poll call, so return earlier for this case.
Change-Id: I03a7b74a39c966a2b8be610bca0e492d902f6b08
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13696
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
For the case `nvmf_subsystem_remove_listener` RPC call when VM is connected,
we should not relisten the accept poller, because the endpoint will be
destroyed for this case.
Change-Id: Icf8299f26a3bbf7bbe44fd01edb4ede344692d25
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13548
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
While running into this function, even the subsystem can't be
destroyed due to error subsystem state, it's better to continue
the execution.
Continue to fix#2590, QEMU is stuck for the failure case, and
nvmf target should process such error because it may support other
normal subsystems at the same time.
Change-Id: Ib05e24996378b52070d2b760519f476f9b2d7e76
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13839
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Sometimes VM may get a kernel panic when starting, and SPDK CI will kill
`nvmf_tgt` after 60 seconds, and for this exception, SPDK will raise an
assertion when destroying the subsystem, while here, we remove this
assertion and print the error information.
CI will still mark this case as a failed case, then we can use this error
information to understand error subsystem state in vfio-user.
Fix issue #2590.
Change-Id: I20b16f9e96a566730eca2dd9ea165645bd9160bd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13773
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
this is a prework for further changes - with lock on generic layer
lock on specific transport (e.g. tcp, rdma) layer becomes optional
possibly it won't be required if some contract introduced on public
interfaces (to be considered)
- spdk_nvmf_poll_group_[create|destroy]
- spdk_nvmf_tgt_listen_ext, spdk_nvmf_tgt_stop_listen
- spdk_nvmf_get_optimal_poll_group
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ib132babf9e7022342129fe795991cdad834e7f53
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13665
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When there are not enought transport buffers for
multi SGL request in state NEED_BUFFER, WRs
received from the data_wr_pool are returned back
to the pool. However rdma_req->data.wr.next pointer
still points to the first WR from the pool. Usually
it doesn't cause any problems since rdma_req will
try to fill buffers again, but when qpair is being
destroyed, all requests are completed forcefully.
When the request is completed and data.wr.next
pointer is not NULL, we'll try to put already
released WRs into the pool one more time.
That corrupts the pool and leads to undefined
behavior.
Fixes#2541
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I238b92eec132d8d845330362af6f335421177454
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13760
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
allow DSA device to async offload crc32 calculation in nvmf-TCP
This patch can use DSA to accelerate crc32 computation, making
the io performance of TCP paths using crc32 approach the io
performance of TCP paths that do not use crc32.
Using SLIST to minimize the performance drop. SLIST has less
operation compared to TAILQ.
Thinking about memory thrashing, we should use the same memory as
possible to receive new PDUs. So, insert newly freed PDU in to head
is better.
The performance drop is within 1% compared to the TCP path without
crc32.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I480eb8db25f0e730cb198ca5ec19dbe3b4d38440
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11708
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When doing DIF insert and strip, we will reserve extra
buffer in block device layer to save DIF information,
so when attaching one device to Namespace, we will
check the value first so that the reserved buffer
size isn't smaller than metadata size.
Change-Id: Id9272886ce8a7c01271279686730af4e5b24f35a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12188
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
When `dif_insert_or_strip` is enabled, NVMf library will do
DIF insert and strip automatically, client isn't aware of
it, when `dif_insert_or_strip` is disabled, we will report
Namespace E2E Protection Capabilities to client, but we
don't process PRACT and PRCHK flags in NVMf library, so
here we don't report the capabilities to client and leave
the use of extended LBA buffer to users.
Change-Id: Ic610dc65fef210a7799c6ab693d89138b99e1193
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12165
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
it is impl and used only in nvmf.c source file
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I1236f9ede28c5da313d118ce73e1da64381379c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13664
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
We cannot count AERs as outstanding IO for purposes
of subsystem pause, because we cannot expect them
to be completed. Previously we would account for this
in nvmf_ctrlr_async_event_request() by decrementing
the counter, but this did not consider cases in the
calling function (nvmf_ctrlr_process_admin_cmd) where
an AER might complete with error before this function,
resulting in the counter getting stuck indefinitely
with a >0 value.
Rather than adding a decrement in all of those
error cases, do a single check at the beginning
of nvmf_ctrlr_process_admin_cmd, and remove the
one from nvmf_ctrlr_async_event_request.
Fixes issue #2215.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ica969f116d80dfba0168369ff2fba9a4a42fc076
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13678
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
In NVMe/TCP target, the socket low water mark is set to
sizeof(struct spdk_nvme_tcp_common_pdu_hdr), which is 8 bytes.
In corner test, there might be 4 bytes data packet sent to
NVMe/TCP target, after that, if there is no more data sent to
the same socket, the 4 bytes won't be read by NVMe/TCP target
qpair thread. Because of this, there is a IO request didn't
complete in initiator. Then, if manual call the readv function to
read the 4 bytes for the pdu in target, the io request complete
normally in initiator. It seems like the pdu might be split,
and in the situation, the IO request will not complete until
new IO request reach.
After set low water mark in NVMe/TCP target to 1 byte, just
like iscsi target done, the issue disappear immediately.
Signed-off-by: BinYang0 <bin.yang@jaguarmicro.com>
Change-Id: I59d3d900f0b25632d786ef25ab096eabe43476bd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13633
Reviewed-by: <chuanwei.ji@jaguarmicro.com>
Reviewed-by: Qingmin Liu <qingmin.liu@jaguarmicro.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It will not find the h2c related reqs in the tailq now.
We can get it from tqpair->reqs directly.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I25f0900e875b054d7617450477e9719e7a59aa18
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12861
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Currently we initialize pending_bytes only in pre-copy state. This is
pointless since we don't generate any migration data at this state, so
if the vfio-user client reads migration data it will be garbage. Even
worse, we don't re-initialize pending_bytes in stop-and-copy state, so
if the vfio-user client reads the entire migration data in pre-copy state
then there will be nothing left to read in the stop-and-copy state,
which is where we actually produce the migration data. This results in
corruption of the controller's state (e.g. queues).
This patch ensures that migration data are available in the
stop-and-copy state, by setting pending_bytes accordingly only in that
state.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I0b215e64cd1f58f254e1079f06402d196f984099
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11718
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The read_data, write_data, and data_written migration callbacks assume
that the migration data are accessed in one go. Until this is fixed,
with this patch we ensure we don't ignore unsupported ranges.
Change-Id: I640415858b8c374ffc9e487cd20f5130e0be9305
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11717
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
QEMU may exit due to some exceptions which mean the socket
connection may be disconnected at any time, so for asynchronous
callbacks especially the subsystem pause/resume callbacks, they
all run in asynchronous way, the controller pointer may become
invalid before the callbacks are called.
Fix#2530.
Change-Id: I6d73597d75761e28844e83bfee7f8a446d85fa49
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12831
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
SPDK was previously incorrectly requesting log levels such as
LOG_NOTICE. Update libvfio-user so it is in fact supported, and check
that setting up the callback actually worked.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I41c2a8cf683868c3c2e40470f78e1af3dba29de4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12839
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Update libvfio-user such that the SGL access APIs can be used
concurrently. We are guaranteed that the guest memory remains mappable
now that the vfio-user transport has implemented quiescence.
This is currently only really useful (for a single controller) in poll
mode, but shouldn't break interrupt mode, as we still ensure all a
controller's queues are on the same poll group in that case.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I0988e731558e9bf63992026afc53abc66ec2a706
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12349
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In SPDK, declarations have the return type on the same line. Definitions
have the return type on a separate line. Astyle has an option for
enforcing this. Unfortunately, it seems to have two bugs:
1) It doesn't work correctly at all on C++ files.
2) It often fails on functions that return enums, or long type names
Deal with 1) by adjusting the check_format.sh script to only tell astyle
to fix return type line breaks for C files and not C++. Deal with 2) by
adding a few typedefs to work around the problem.
Change-Id: Idf28281466cab8411ce252d5f02ab384166790c6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13437
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
While we are quiesced, we're not allowed to access guest memory via the
SGL APIs. Refuse to process any commands unless we're in RUNNING state.
We need to synchronize with each poll group via a message before we can
call vfu_device_quiesced(), otherwise we could still be processing
commands via nvmf_vfio_user_sq_poll().
For interrupt mode, we then might miss processing commands in a
corresponding interrupt callback, so make sure we process them when we
return to RUNNING state.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ieae5a9ae8d9de722e0bdf4bb8d61e7e678159f1f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12912
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
An oversight meant that quiesce was in fact only pausing the admin
queue, and not ensuring no I/O was ongoing. Fix this by passing the
right flag to spdk_nvmf_subsystem_pause().
Change-Id: I930c616d1170ac0299339b04928da57f6a7489ab
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13441
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If the broadcast NSID is supplied, every namespace is paused.
Change-Id: I40cc3e04b5a75b731ab0c8946ed8146275cc8ee4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13394
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Profiling data showed the deference of the CQ head in cq_is_full() was a
significant contributor to the CPU cost of post_completion(). Use the
cached ->last_head value instead of a doorbell read every time.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ib8c92ce4fa79683950555d7b0c235449e457b844
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11848
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Patch below changed the struct spdk_nvmf_ctrlr_data by inserting
spdk_nvme_cdata_fuses. This affects large number of nvmf interfaces.
(cbfd581) nvmf: Add NVMe fused operations to spdk_nvmf_ctrlr_data
Unfortunately was missed due to lack of rebase after ABI update on
CI machines.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ifd06d0ddbefe9ea6c9715adae9881d4606e34b44
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13013
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Kamil Godzwon <kamilx.godzwon@intel.com>
Reviewed-by: Michal Berger <michallinuxstuff@gmail.com>
Community-CI: Mellanox Build Bot
Add an option to stop nvmf transport advertising support for both the
compare command and the fused compare_and_write operation in vfio_user
transport.
Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com>
Change-Id: I3900218c0e9884f86a5c8698a030f8106b64f2f7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12919
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Compare command, when not supported natively by the underlying bdev
is emulated by the bdev layer.
Change nvmf ctrlr data to advertise compare command by default.
Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com>
Change-Id: I88646e6c1a7d7a2829be813ff0241661724bd127
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12918
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Fused compare_and_write operation is always advertised by the nvmf
transport.
Add the fuses structure to spdk_nvmf_ctrlr_data to make advertising
fused operation configurable.
Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com>
Change-Id: I73ee03dc8948f1d250cc0a8f0b8a3bde042a45e7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12917
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Many open source projects have moved to using SPDX identifiers
to specify license information, reducing the amount of
boilerplate code in every source file. This patch replaces
the bulk of SPDK .c, .cpp and Makefiles with the BSD-3-Clause
identifier.
Almost all of these files share the exact same license text,
and this patch only modifies the files that contain the
most common license text. There can be slight variations
because the third clause contains company names - most say
"Intel Corporation", but there are instances for Nvidia,
Samsung, Eideticom and even "the copyright holder".
Used a bash script to automate replacement of the license text
with SPDX identifier which is checked into scripts/spdx.sh.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaa88ab5e92ea471691dc298cfe41ebfb5d169780
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12904
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: <qun.wan@intel.com>
Reflect that we are kicking the entire controller.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: If5723a5f485745ef0a2456942b6df1d54133815b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12665
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
This function is really about re-arming all SQs for a poll group;
refactor to reflect this.
This is necessary ground-work before we can support multiple reactors in
vfio_user.c in interrupt mode.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I170fae2076fc80e742926cf448973671ac9e3bd9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12664
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Prepare for the later patch, and make the later patch code clean
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I12b175c86a5245f38dc76fe2d3918ec4b30a475a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12830
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>