When use of deprecated features is encountered, SPDK now calls
SPDK_LOG_DEPRECATED(). This logs the use of deprecated functionality in
a consistent way, making it easy to add further instrumentation to catch
code paths that trigger deprecated behavior.
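
As an illustration of why a single logging choke point helps, here is a
minimal sketch in plain C. It is not the SPDK implementation: the
demo_deprecation structure, demo_log_deprecated() and the
DEMO_LOG_DEPRECATED macro are hypothetical names, and the log-once policy
is an assumption made for the example. The hit counter stands in for the
"further instrumentation" mentioned above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record describing one deprecated feature. */
struct demo_deprecation {
	/* Short identifier for the deprecated feature. */
	const char *tag;
	/* Release in which removal is planned (illustrative value only). */
	const char *remove_in;
	/* Number of times the deprecated code path has run. */
	unsigned long hits;
};

/* Single choke point: every use of a deprecated feature goes through here. */
static void
demo_log_deprecated(struct demo_deprecation *dep, const char *file, int line)
{
	assert(dep != NULL);
	dep->hits++;
	/* Log only the first hit so that hot paths do not flood the log. */
	if (dep->hits == 1) {
		fprintf(stderr, "%s:%d: %s is deprecated and will be removed in %s\n",
			file, line, dep->tag, dep->remove_in);
	}
}

#define DEMO_LOG_DEPRECATED(dep) demo_log_deprecated((dep), __FILE__, __LINE__)
```

Because every call site funnels through one function, a counter, a
breakpoint or a tracepoint can be added in one place.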
Change-Id: Idfd33ade171307e5e8235a7aa0d969dc5d93e33d
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15689
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
The mobj is allocated from pdu_data_out_pool. If pdu_data_out_pool is
exhausted, then the next time the PDU is polled,
iscsi_pdu_payload_read returns -1 because data_buf_len has been
modified, and the connection is released.
Signed-off-by: lizengwu <786436671@qq.com>
Change-Id: I3ee65472f7ddaa357d7952a5b734540f0bc0b216
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15626
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously we used this flag to avoid calling `vhost_dev_unregister`
twice in `subsystem_fini`. The DPDK vhost library already checks for
that, so the flag isn't actually needed for this purpose. However,
there is a race condition between adding a new connection and
unregistering the socket file in different threads, so this first patch
just moves the flag to the vhost-user device, and a coming patch will
use it there.
Change-Id: I658712dd20331a2e2eb5f4758bf76f748036a131
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15482
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
`vhost_user_dev_unregister` will check if the device is busy,
so we don't need to check `user_dev->pending_async_op_num` here.
For `vdev->registered`, with this check here, we can remove a
device even if it doesn't have a valid QEMU connection, and since
vhost-scsi supports the hotplug feature, we don't need to check this
flag when it has a valid QEMU connection either.
Change-Id: I50cdeb5ca544e2ed93a1bc99ec3da8787a9e5df5
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15481
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Feng Li <lifeng1519@gmail.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Replace comments saying that particular locks must be held with
assertions that enforce that those locks are held. Remove the comments
so that there is no chance of comments and code getting out of sync in
the future.
This also fixes a caller of bdev_close() that did not hold a required
lock.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I3a540f1ad9b9826f925c523986334aa8fcd302f2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15440
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This exercises the parts of spdk_spin_*() that are difficult to test in
unit tests. In particular, it tests multiple SPDK threads running on
different pthreads contending for a lock and it tests pollers and
messages going off CPU with a lock held.
Change-Id: I5cd6ce29c92c44ba63f47332fe339e59eed81553
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15534
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This introduces an enhanced spinlock that adds safeguards compared to
the default pthread_spinlock_t. In particular:
- A pthread_spinlock_t is still used, but additional error checking is
performed to ensure there is no undefined behavior on relock,
unlocking when not the owner, or destroying a locked lock.
- The SPDK concurrency model allows an SPDK thread to be migrated
between pthreads. Releasing a pthread spinlock on a different thread
from where it is taken is undefined behavior. If an SPDK spinlock is
held when a poller or message returns control to thread_poll(), the
program will abort.
- SPDK spinlocks can only be obtained from an SPDK thread.
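
The checks listed above can be sketched in plain C. This is a hedged
illustration, not the SPDK code: it uses a C11 atomic_flag instead of
pthread_spinlock_t, models "the current SPDK thread" as a hypothetical
integer ID passed in by the caller, and returns error codes where the
text says the real implementation aborts. All demo_* names are invented
for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an SPDK thread; 0 means "not an SPDK thread". */
typedef int demo_thread_id;

struct demo_spinlock {
	atomic_flag flag;
	/* ID of the thread holding the lock, or 0 when the lock is free. */
	atomic_int owner;
};

#define DEMO_SPINLOCK_INITIALIZER { ATOMIC_FLAG_INIT, 0 }

static int
demo_spin_lock(struct demo_spinlock *sl, demo_thread_id self)
{
	assert(sl != NULL);
	if (self == 0) {
		/* Only "SPDK threads" may take the lock. */
		return -EINVAL;
	}
	if (atomic_load(&sl->owner) == self) {
		/* Relocking by the owner would spin forever. */
		return -EDEADLK;
	}
	while (atomic_flag_test_and_set_explicit(&sl->flag, memory_order_acquire)) {
		/* Busy-wait until the current holder releases the lock. */
	}
	atomic_store(&sl->owner, self);
	return 0;
}

static int
demo_spin_unlock(struct demo_spinlock *sl, demo_thread_id self)
{
	assert(sl != NULL);
	if (self == 0) {
		return -EINVAL;
	}
	if (atomic_load(&sl->owner) != self) {
		/* Unlocking a lock held by someone else, or by nobody. */
		return -EPERM;
	}
	atomic_store(&sl->owner, 0);
	atomic_flag_clear_explicit(&sl->flag, memory_order_release);
	return 0;
}

static int
demo_spin_destroy(struct demo_spinlock *sl)
{
	assert(sl != NULL);
	/* Destroying a held lock is reported instead of being undefined. */
	return atomic_load(&sl->owner) != 0 ? -EBUSY : 0;
}
```

Tracking the owner is what makes each misuse detectable: relock,
unlock by a non-owner, and destroy while held all reduce to a comparison
against the recorded owner.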
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I6dd6493ab5f5532ae69e20654546405a507eb594
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15277
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
nvme_tcp_read_pdu in a loop
nvme_tcp_read_pdu itself has a loop in it that runs until no more data
is available, so the extra loop does nothing.
Change-Id: I1471018e396c43187d1f06bd18ce8a6846a71c94
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15139
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is unsafe, because we touch need_buf_* queues, which aren't
thread-safe. Also document this requirement in
spdk_bdev_io_get_buf()'s description.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iabc141e051c543fdd51f079ae212f69e980d8148
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15668
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Add the trace_get_info RPC method to show the name of the shared memory
file, the list of available trace point groups, and the mask of
available trace points for each group.
Fixes #2747
Signed-off-by: Xinrui Mao <xinrui.mao@intel.com>
Change-Id: I2098283bed454dc46644fd2ca1b9568ab2aea81b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15426
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
All the other spdk_sock_* functions return -1 and set errno
appropriately, so we should do the same in flush().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I51cda2c51974c72e82531f06fa31ab89b2329c91
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
If a lot of qpairs are connected all at once, the
RDMA optimal_poll_group logic does not work correctly,
because it only accounts for qpairs that received
their CONNECT capsule. Now that we have a counter
for a poll group's unassociated qpairs, use that value
to supplement the current io qpair count.
We can just assume for now that all of these unassociated
qpairs are io qpairs. That won't always be true, but
for purposes of picking the optimal poll group it is
sufficient.
Note that for RDMA, we could increment the counters
based on the RDMA qpair ID in the private data in the
rdmacm connect, but to keep the code simpler and common
across all transports, we defer the accounting until
after receiving the CONNECT command, so that it is
the same for all transports.
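
A minimal sketch of the selection rule described above, assuming a
simplified poll group that carries only the two counters. The
demo_poll_group structure and demo_pick_poll_group() are hypothetical
names, and the locking around the unassociated counter is left out to
keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified poll group; only the two counters matter here. */
struct demo_poll_group {
	uint32_t current_io_qpairs;
	uint32_t current_unassociated_qpairs;
};

/* Return the index of the group with the fewest qpairs, counting both kinds. */
static size_t
demo_pick_poll_group(const struct demo_poll_group *groups, size_t count)
{
	size_t i, best = 0;
	uint32_t load, min_load = UINT32_MAX;

	assert(groups != NULL && count > 0);
	for (i = 0; i < count; i++) {
		/* Treat every unassociated qpair as if it were an io qpair. */
		load = groups[i].current_io_qpairs +
		       groups[i].current_unassociated_qpairs;
		if (load < min_load) {
			min_load = load;
			best = i;
		}
	}
	return best;
}
```

With groups holding (2, 0), (0, 3) and (1, 1) qpairs, counting only
current_io_qpairs would pick the second group even though three
connections are already pending on it. The combined count avoids that.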
Fixes issue #2800.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5897d6ebac23d3b78b100e3fef5a7f9fb5304820
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15695
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use a local variable to hold the qpair count.
While here, also use pg_current to get the min_value. This is a bit
simpler to read than things like (*pg)->group.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I65771fb469f021e9e77b8a6c117841b8f4b66af5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15694
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
We make decisions on how to pick a poll group for a new
qpair by looking at each poll group's current_io_qpairs
count. But this count isn't always accurate since it
doesn't get updated until after the CONNECT has
been received.
This means that if we accept a bunch of connections
all at once, they may all get assigned the same poll
group, because the target poll group's counter doesn't
get immediately incremented.
So add a new counter, current_unassociated_qpairs,
to account for these qpairs. We protect this counter
with a lock, since the accept thread will increment
the counter, and the poll group thread will
decrement it when the qpair receives the CONNECT
allowing us to associate it with a subsystem/controller.
If the qpair gets destroyed before the CONNECT is
received, we can use the qpair->connect_received
flag to decrement current_unassociated_qpairs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8bba8da2abfe225b3b9f981cd71b6f49e2b87391
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15693
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Currently we use qpair->ctrlr at qpair destroy
time to decide if we need to decrement the
qpair's poll group's qpair count. But this is
not correct - these counters get incremented
when the CONNECT is received, but qpair->ctrlr
doesn't get set until later.
So add a new connect_received bool to the spdk_nvmf_qpair.
Use this instead to determine when we should decrement
the poll group qpair counters.
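
The accounting rule can be sketched as follows. The structures are
hypothetical and heavily simplified, and only the connect_received field
name is taken from the text above. The point is that the decrement is
keyed on the same event that caused the increment, not on ctrlr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified poll group and qpair. */
struct demo_poll_group {
	uint32_t current_io_qpairs;
};

struct demo_qpair {
	struct demo_poll_group *group;
	/* Set some time after CONNECT; may still be NULL at destroy time. */
	void *ctrlr;
	bool connect_received;
};

/* CONNECT received: this is where the counter is incremented. */
static void
demo_qpair_on_connect(struct demo_qpair *qpair)
{
	qpair->connect_received = true;
	qpair->group->current_io_qpairs++;
}

/* Destroy: decrement if and only if the increment actually happened. */
static void
demo_qpair_destroy(struct demo_qpair *qpair)
{
	if (qpair->connect_received) {
		assert(qpair->group->current_io_qpairs > 0);
		qpair->group->current_io_qpairs--;
	}
}
```

A qpair destroyed after CONNECT but before ctrlr is set would leak a
count if the check were on ctrlr. With the flag the counter stays
balanced.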
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I174a0fda36c4558171953bf58f2f5117bc074f76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15692
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
At least recent Linux guest VMs send SPDK_NVME_IDENTIFY_CTRLR_IOCS as a
matter of course. While this isn't supported in lib/nvmf, as this
doesn't represent an error, reduce the log level of the error message so
we don't spam the logs.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I095de3e4331b3912cbc457da6d722b9883ec7884
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15646
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
The issue is in virtio_pci_scsi_dev_create(), whose error path sets
vdev->ctx to NULL before the destruct operation.
Change-Id: I4ab0fbe300f7413ad4503833088856aa3f4c0734
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15676
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Optimal I/O boundary causes I/O to be split in the nvme driver. This is
a problem for writes if write_unit_size > 1 because the split I/O may
not match the write_unit_size.
Fixes: #2791
Change-Id: I437e6cb6d8e2415658d5b46539feeacb5363fd46
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15627
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
NVMf target reports copy command support if all bdevs in the subsystem
support copy IO type. Maximum copy size is reported for each namespace
independently in namespace identify data. For now we support just one
source range.
Note that command support in the controller is initialized once on
controller create. If another namespace which doesn't support the copy
command is added to the subsystem later, it will not be reflected in
the controller data structure and will not be communicated to the
initiator. An attempt to execute a copy command on such a namespace
will fail. This issue is not specific to the copy command and also
applies to write zeroes and unmap (dataset management) commands.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I5f06564eb43d66d2852bf7eeda8b17830c53c9bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14350
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The correct SPDK thread is already contained in the poll group.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I4eefe2ba60c77c01a866a693bccbb8affc8262ed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15546
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Fixes #2781
This patch fixes two issues that cause a segfault on R2T:
1. The PDU buffer is allocated from immediate_data_pool, but data_buf_len is set as for data_out_pool.
2. task->desired_data_transfer_length is overwritten by iscsi_send_r2t, which leads to an incorrectly calculated pdu->data_buf_len.
Signed-off-by: melon.masou <melon.masou@outlook.com>
Change-Id: I151859afff7104f29ad7f0ec57a8479d88b742bd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15542
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Added new API 'spdk_bdev_histogram_get_channel' to get histogram of
a specified channel for a bdev. A callback function is passed to it
to process the histogram.
Change-Id: If5d56cbb5fe6c39cda7882f887dcc9c6afa769ac
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15539
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Print out the specific values in this SPDK_ERRLOG. This can help
to find where the error is.
Change-Id: I2a38aa2d4270e0bbf554ddb348a73d40967d1b16
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
If the calloc failed, the fd was left in the fd_group.
Change-Id: Ie68426a13d342756c20315656f0309440fda6e02
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15475
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <levon@movementarian.org>
SPDK threads generally run on dedicated cores and locks should be rarely
contended. Thus, putting a thread to sleep while waiting on a mutex does
not free up CPU cycles for other pthreads or processes. Even when
running in interrupt mode, lock contention should be low enough that
spinlocks are a net win by avoiding context switches.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I6e2e78b2835bbadb56bbec34918d998d75280dfd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Also add unit tests that explicitly test this
condition. They fail without the nvme driver changes
in this patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaa369be341eb4eba394f248990e56dce001d3940
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15579
Reviewed-by: Mariusz Barczak <mariusz.barczak@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For now, just print a loud warning when this case is
violated. We will add a hard assertion and cause the app
to exit with error status in a later release.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic9226f76a4729820f13a2728bea977b6a54f48ee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15513
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This function can be useful to query if a thread
had spdk_thread_exit() called on it yet. Internally
we have both EXITING and EXITED states - so
!spdk_thread_is_running() can be used to detect a
thread that is in either of those states.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2f6fb024a6b1bc895fdc5132c722abc10f5d30f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15512
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This was an accidental remnant from the original
check-in, when we did not have a clear differentiation
between the event and thread libraries.
The rocksdb plugin code will send events to an
lcore - not an SPDK thread. But originally the two
were combined through an API called spdk_allocate_thread.
Once the differentiation was clearly made, we moved to
using spdk_event_allocate() to send events to a specific
lcore, but never removed the spdk_thread.
So now let's just remove the spdk_thread_create since
it is not needed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c6a3c304b7b4183eee90038367fdea7ebd7280f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15504
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This requires creating and setting SPDK threads in the
subsystem unit tests as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I31acfb1d7e418f011acc9b48933032d8bf8a1c53
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15511
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Currently when a vhost-scsi controller is removed, it
calls spdk_vhost_scsi_dev_remove_tgt on all remaining
targets, and then immediately calls vhost_dev_unregister.
But this path goes into vhost_user_dev_unregister which
immediately returns with error if there are any pending
async operations - and there are since scsi_dev_remove_tgt
is asynchronous.
So instead add the vhost_dev_unregister call to
remove_scsi_tgt, so that the unregister only happens
after the last ref goes away.
This requires changing vhost_fini() to no longer
assume that spdk_vhost_dev_remove() will immediately
unregister the device, since it now happens
asynchronously. Previously vhost_fini() was making
this assumption erroneously - it would call g_fini_cb
without actually checking that the devices had been
unregistered. Because of that incorrect assumption,
we need to do both the vhost and vhost-scsi changes
in the same patch.
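
The ordering change can be illustrated with a small reference-count
sketch. It is a simplified model under stated assumptions, not the vhost
code: demo_scsi_dev and the demo_* functions are hypothetical, and each
outstanding asynchronous target removal is modeled as one reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified vhost-scsi device; not the SPDK structure. */
struct demo_scsi_dev {
	/* Targets whose asynchronous removal has not completed yet. */
	int tgt_refs;
	bool remove_requested;
	bool registered;
};

/* Unregister only once removal was requested and no target holds a ref. */
static void
demo_try_unregister(struct demo_scsi_dev *dev)
{
	if (dev->remove_requested && dev->tgt_refs == 0) {
		dev->registered = false;
	}
}

/* Controller removal: target removals are started here and finish later. */
static void
demo_dev_remove(struct demo_scsi_dev *dev)
{
	dev->remove_requested = true;
	demo_try_unregister(dev);
}

/* Completion of one asynchronous target removal. */
static void
demo_remove_tgt_done(struct demo_scsi_dev *dev)
{
	assert(dev->tgt_refs > 0);
	dev->tgt_refs--;
	demo_try_unregister(dev);
}
```

Any caller that waits for the device to go away, such as a shutdown
path, has to observe the registered state instead of assuming that
demo_dev_remove() finished the job.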
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9577901266975447f9acfe53475221113f02fea3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15510
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
At end of spdk_thread_poll(), if thread is in EXITING state,
we call thread_exit() to see if the thread can move to
EXITED state. If there are any pollers, io_channels
or pending device unregistrations in progress, thread_exit()
will keep the thread in EXITING mode for this iteration.
But a thread may post messages to itself during this cleanup
process, so thread_exit() should also check if there are
any messages on its queue.
Found during testing of spdk_thread lifetime patch set.
rbd bdev module will send messages to itself like this
during cleanup. Without this change, rbd module testing
with bdevperf would cause an spdk_thread to move to
EXITED state prematurely.
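
A sketch of the exit check with the extra condition, assuming a
simplified thread object that only tracks counts. The demo_thread
structure and its field names are hypothetical. The EXITING and EXITED
states follow the description above.

```c
#include <assert.h>
#include <stddef.h>

enum demo_thread_state { DEMO_RUNNING, DEMO_EXITING, DEMO_EXITED };

/* Hypothetical, simplified thread; counts stand in for the real lists. */
struct demo_thread {
	enum demo_thread_state state;
	size_t active_pollers;
	size_t io_channels;
	size_t pending_unregisters;
	size_t queued_messages;
};

/* Called at the end of each poll iteration while the thread is EXITING. */
static void
demo_thread_exit(struct demo_thread *t)
{
	assert(t->state == DEMO_EXITING);
	if (t->active_pollers > 0 || t->io_channels > 0 ||
	    t->pending_unregisters > 0) {
		/* Cleanup is still in progress; try again next iteration. */
		return;
	}
	/* The added check: messages the thread sent to itself must still run. */
	if (t->queued_messages > 0) {
		return;
	}
	t->state = DEMO_EXITED;
}
```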
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie611026a67b7fa48640ae83be03e29a9c64883a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15533
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
If a connection is established and we receive a bad
PDU before successful login, the login_timer would not
get unregistered. So ensure the login_timer is always
unregistered in _iscsi_conn_destruct().
Found with Calsoft tests during new spdk_thread_exit()
assertion testing. Lack of unregistration would result
in its associated spdk_thread being unable to exit
cleanly due to the unexpired timer.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I79d427512f7829ad76bf89155e0e14c7bce3a7d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15499
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The "app thread" will always be the first
thread created using spdk_thread_create(). There
are many operations throughout SPDK that implicitly
expect to happen in the context of this app thread,
so by formalizing it we can start to make assertions
on this to help clarify and simplify locking and
synchronization through the code base.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7133b58c311710f1d132ee5f09500ffeb4168b15
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15497
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Next patch will add a new caller to this function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I54374c0af3a4a0fdcc5ac9ca25e2c7ef03e99829
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15576
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
`BDEV_IO_NUM_CHILD_IOV` and `BDEV_RESET_IO_DRAIN_RECOMMENDED_VALUE`
are public macro definitions without `SPDK_` prefix, so we add the
`SPDK_` prefix to them.
Change-Id: I4be86459f0b6ba3a4636a2c8130b2f12757ea2da
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15425
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
A bdev hot remove may be an asynchronous process, and bdev_open returns
an error while it is in progress. If someone invokes the bdev_get_bdevs
API while a bdev is in the middle of a hot remove, the
spdk_for_each_bdev function stops its loop as soon as bdev_open returns
an error. bdev_get_bdevs then returns only a partial list of bdevs, or
even an empty list if the bdev being hot removed is the first one in the
loop. Now, when spdk_for_each_bdev and spdk_for_each_bdev_leaf iterate
over the bdevs and a bdev returns an error, we skip that bdev instead of
stopping the whole loop.
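
The skip-instead-of-stop behavior, sketched over a plain array. The
demo_bdev structure, demo_bdev_open() and the "removing" flag are
hypothetical stand-ins for a bdev whose open fails during hot remove. In
this sketch an error from the callback itself still ends the walk, which
is an assumption of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bdev: "removing" models a hot remove in progress. */
struct demo_bdev {
	const char *name;
	int removing;
};

typedef int (*demo_bdev_fn)(void *ctx, struct demo_bdev *bdev);

static int
demo_bdev_open(struct demo_bdev *bdev)
{
	return bdev->removing ? -ENODEV : 0;
}

/* Visit every bdev that can be opened; skip the ones that cannot. */
static int
demo_for_each_bdev(struct demo_bdev *bdevs, size_t count, void *ctx,
		   demo_bdev_fn fn)
{
	size_t i;
	int rc;

	assert(fn != NULL);
	for (i = 0; i < count; i++) {
		if (demo_bdev_open(&bdevs[i]) != 0) {
			/* Old behavior: return the error here and end the walk. */
			continue;
		}
		rc = fn(ctx, &bdevs[i]);
		if (rc != 0) {
			return rc;
		}
	}
	return 0;
}

/* Example callback: count the bdevs that were visited. */
static int
demo_count_cb(void *ctx, struct demo_bdev *bdev)
{
	(void)bdev;
	(*(size_t *)ctx)++;
	return 0;
}
```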
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: Ib35b817e23e47569fc5762a883b4ff8e322ae173
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15322
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
If the guest performs a hard shutdown we're not deleting the CQs:
nvmf_vfio_user_close_qpair calls delete_sq_done, which won't delete
the CQ because vu_ctrlr->reset_shn is false.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I383fb985340a0d9d0eb7fea7403372cbdc55a089
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15387
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This eventfd may be passed by libvfio-user to the remote process which
might remove the EFD_NONBLOCK flag, in which case we would block
indefinitely.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: If9826cd700b4a7b3458a0a8278a96322d99ac08e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15385
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This patch introduces function spdk_fd_group_get_epoll_event, which
returns the epoll(7) event that caused the file descriptor group
callback function to execute. Rather than changing the signature of
spdk_fd_fn in order to pass the struct epoll_event, which would result
in a gigantic patch where the vast majority of users would simply have
to ignore the new argument, we introduce this new API, which returns
the epoll_event only when really needed.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Suggested-by: John Levon <john.levon@nutanix.com>
Change-Id: I3debe1382d1c2bfec6ae4fea274ee38ed0b135fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14935
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
ftl_md_get_buffer_size returns the buffer size in bytes, so we
should divide by the block size instead of this smaller value. Otherwise
we risk touching bad memory during dirty shutdown recovery, especially
on >16TiB drives.
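
A worked example of the unit conversion. The 4096-byte block size is an
assumption for illustration and demo_bytes_to_blocks() is a hypothetical
helper. The real constant and call sites live in the FTL sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed block size for illustration only. */
#define DEMO_BLOCK_SIZE 4096u

/* Convert a buffer size in bytes into a count of whole blocks. */
static uint64_t
demo_bytes_to_blocks(uint64_t buffer_size_bytes)
{
	/* The buffer is expected to hold a whole number of blocks. */
	assert(buffer_size_bytes % DEMO_BLOCK_SIZE == 0);
	return buffer_size_bytes / DEMO_BLOCK_SIZE;
}
```

Dividing the same byte count by any smaller divisor yields a larger
block count than the buffer holds, so a loop bounded by that count would
run past the end of the buffer.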
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
Signed-off-by: Mariusz Barczak <mariusz.barczak@intel.com>
Change-Id: I4095b00a79a1bdbce5046dc46349a9670e41b18e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15259
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A metadata region without mirror should have the INVALID enum set,
otherwise it risks touching invalid parts of the array.
The sb_shm_md not being set to NULL could cause the code to touch this
freed pointer in the error path in ftl_md_create -> ftl_md_create_shm ->
ftl_md_invalidate_shm calls.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
Signed-off-by: Mariusz Barczak <mariusz.barczak@intel.com>
Change-Id: I7fe9694dad535de5f6b2a4af27400fa125480605
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15258
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Bumping the IO activity statistics during relocation, compaction, L2P
cache processing and user IO handling. This makes sure poller busy
counter is more accurate.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
Change-Id: Iabf8ec7ca41c01d7a00d3a70825b8d5283ab2bf1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15257
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
NVMe controllers can be marked as removed even if we cannot receive
uevents (e.g. by the VMD driver), so we should process them regardless
of hotplug_fd.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iaaf13a136929200e824f7a6dd3b5584998801630
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15547
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I2d3825ffcce098909745ba949cdde3eb7f71c703
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15545
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The previous 14B buffer was too small for VMD devices.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib3984f7104fadbb2fbf7ec56932675d73eda1456
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15532
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
We should always build all functions that are part of the API, even if
some of the libraries they depend on are missing. In that case, they
can return an error instead.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I72b450b3a1d62e222bd843e45be547d926414775
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15414
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Remove automatic generation of UUIDs for bdevs
that do not provide this value themselves.
This is to clarify whether this field can be
depended upon.
Modified match files to reflect change in UUID
generation.
Disabled the nullglob shell option, as it deletes
empty arrays during word splitting. Bdevs with no
aliases would have a null pointer printed instead of
"[]", which makes the resulting JSON invalid.
Part of enhancement proposed in #2516.
Change-Id: Ic1d5f8f8d001ae1a219e876aef2a19b1ff0b2f2c
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15150
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Add handling for a return value of 0 from spdk_accel_get_opc_module_name(),
and remove SPDK_RPC_STARTUP, because this causes a core dump when
nvmf_tgt is run with --wait-for-rpc and without the framework_start_init RPC.
Fixes issue: 2770
Change-Id: I1c53ccb8caa52f2eaa0b8b560a021bded49d8fed
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15377
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Add helper functions, bdev_iostat_ctx_alloc() and bdev_iostat_ctx_free()
for the bdev_get_iostat RPC.
The following patches will allocate spdk_bdev_io_stat dynamically for
bdev_get_iostat_ctx.
This is a preparation for that.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ib71d6fb92d8134d2282507e62874f19045b630b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15442
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The bdev_get_iostat RPC uses two types of contexts, one to manage the
progress of the bdev_get_iostat RPC and another to call
spdk_bdev_get_device_stat().
However, this was hard to see from the source code.
To make it easier to see, rename the former to rpc_ctx and the
latter to bdev_ctx. Then rename related functions and variables accordingly.
Furthermore, relocate request and decoder declaration to improve
readability.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3472c87fe4ec1f5981a49ef79148534fbb1d46c4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15349
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
RPC parameters and decoders for the bdev_get_iostat RPC are used only
by rpc_bdev_get_iostat(). Locating RPC parameters and decoders close to
rpc_bdev_get_iostat() clarifies it. Furthermore, this will simplify code
review for the next patch.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I1b1b428e3eb3bb4422e490c5f4324f0e40f9710f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15416
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For I/Os controlled by QoS, TRACE_BDEV_IO_DONE is collected after
redirecting to the original thread. Hence, TRACE_BDEV_IO_START should
be collected on the original thread too.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I15411be823450ee5ddaa7582509a7aa068476fc5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14824
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The 2211 implementation only gets used when runtime
detects the DPDK version is DPDK 22.11. But we still
compile this file even if it gets built against an
older DPDK.
This is typically fine, except there are some interrupt
APIs that changed in DPDK 21.11, so older DPDKs don't
have some of the functions used in this file. We need
to use ifdefs to allow this to compile.
We will need some more work to handle this case properly,
but this patch at least fixes the 2211.c case for now.
We will probably need a 2108.c file that exactly matches
the 2207.c file except for these interrupt API changes.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6055694ccbb79845798e750ebb7127ec6c160e2e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15236
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Update the code that reads the virtual address width to use glob to locate
the Intel and AMD IOMMU capability registers. This code should work for all
AMD NUMA configurations.
Fixes issue 2730
Signed-off-by: Michael Piszczek <mpiszczek@ddn.com>
Change-Id: Ibf5789087b7e372d892b53101e4c0231809053f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14961
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Add an optional allowlist for RPC methods: if the method is not listed,
it is not allowed to be called or visible. This can be used to restrict
accidental misconfigurations, and generally helps lock down the
configuration surface.
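The check described above can be sketched as follows. This is only an
illustration of the idea, not the implementation in this patch:
rpc_method_allowed() is a hypothetical helper, and treating a NULL list
as "allowlist disabled" is an assumption that matches the word
"optional" above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `method` may be called or listed.
 * A NULL allowlist means the feature is off and every method is allowed. */
static bool
rpc_method_allowed(const char *const *allowlist, size_t count, const char *method)
{
	size_t i;

	assert(method != NULL);
	if (allowlist == NULL) {
		return true;
	}
	for (i = 0; i < count; i++) {
		if (strcmp(allowlist[i], method) == 0) {
			return true;
		}
	}
	/* Not listed: treat the method as if it did not exist. */
	return false;
}
```

Applying the same predicate both when dispatching a call and when
listing methods keeps unlisted methods invisible as well as uncallable.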
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ied78fc4b14b60cb94ed0852b92deb6df545cbec4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15275
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
per Intel policy to include file commit date using git cmd
below. The policy does not apply to non-Intel (C) notices.
git log --follow -C90% --format=%ad --date default <file> | tail -1
and then pull just the 4 digit year from the result.
Intel copyrights were not added to files where Intel either had
no contribution or the contribution lacked substance (i.e. license
header updates, formatting changes, etc.). Contribution date used
"--follow -C95%" to get the most accurate date.
Note that several files in this patch didn't end the license/(c)
block with a blank comment line, so these were added, as the vast
majority of files do have this last blank line. They are there simply
for consistency.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id5b7ce4f658fe87132f14139ead58d6e285c04d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Allow CPU core locks to be enabled and disabled
during runtime. This feature will be useful
in cases like SPDK hot upgrade, where
locking should be disabled temporarily.
Change-Id: I9bc7292fd964abffc7214d074d191f38b13583c3
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15031
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
When running an SPDK application on a given set of
CPU cores, create lock files for each of them.
This will prevent user misconfiguration and
assigning a core to more than one SPDK instance.
The introduced mechanism is based on device locks
implemented in the spdk_pci_device_claim() function.
Add a command line option to disable lock files.
This feature will be useful in cases where using different
CPU cores is impossible (e.g. a setup with only one core
available).
The patch also fixes all existing cases of overlapping
core masks.
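A rough sketch of per-core lock files, under stated assumptions. The
helper names and the file name pattern are hypothetical, and flock() is
used here for brevity, which may differ from the locking call SPDK
actually uses.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: try to claim `core` by taking an exclusive,
 * non-blocking lock on a per-core file under `dir`. Returns an open fd
 * that holds the lock, or -1 if the core is already claimed or the file
 * cannot be opened. */
static int
example_claim_core(const char *dir, unsigned int core)
{
	char path[256];
	int n, fd;

	assert(dir != NULL);
	n = snprintf(path, sizeof(path), "%s/example_cpu_lock_%03u", dir, core);
	if (n < 0 || (size_t)n >= sizeof(path)) {
		return -1;
	}
	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0) {
		return -1;
	}
	if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
		/* Another holder already claimed this core. */
		close(fd);
		return -1;
	}
	return fd;
}

/* Closing the descriptor drops the lock and frees the core. */
static void
example_release_core(int fd)
{
	close(fd);
}
```

A second claim of the same core fails until the first holder releases
it, which is the misconfiguration the lock files are meant to catch.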
Change-Id: Ie9aacb7523a3597b9aa20f2c3fa9efe4db92c44c
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14919
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Additionally, print the string representation of the ctrlr state, as it
makes debugging init failures much easier.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I572ef3d6f7d5bbd52039a8872733578c92be4c4a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Add API spdk_bdev_io_get_submit_tsc to get submit tsc of a bdev I/O,
which can be used in bdev modules to avoid calling expensive
spdk_get_ticks().
Change-Id: Ifbcecb1bc663344997c5e73b72a1dfb5d0422946
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14989
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This fix allows using the relaxed ordering feature where it is
supported. libibverbs checks with the driver whether the relaxed ordering
access flag is supported and ignores it if not.
Experiments show that setting it by default does not hurt performance and
allows reaching the desired performance on AMD EPYC systems. For example, a
fio read test (ConnectX-6, AMD EPYC 7763, two jobs, queue depth 32, block
size 32K) can drop to 6-7 GiB/s without it. Enabling this option
yields a bandwidth of more than 21 GiB/s.
Change-Id: I5983aed5d1f38ee7bec9c310597731c9a6a329da
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14885
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It doesn't make sense to have the size of the doorbells fixed and then
calculate the maximum number of queue pairs based on it. Do it the other
way round. Also, add some sanity checks based on the spec.
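As an illustration of deriving the doorbell size from the queue pair
limit, here is a sketch under stated assumptions. The names, the limit
of 512 and the page rounding are hypothetical. The layout (one SQ tail
and one CQ head doorbell per queue pair, each 4 << DSTRD bytes) and the
range checks follow my reading of the NVMe base specification and are
not necessarily the checks this patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit for this sketch; it counts the admin queue pair. */
#define EXAMPLE_MAX_QPAIRS 512u

/* Size in bytes of the doorbell region needed for `max_qpairs` queue
 * pairs with doorbell stride field `dstrd`, rounded up to 4 KiB. */
static size_t
example_doorbells_size(uint32_t max_qpairs, uint32_t dstrd)
{
	const size_t page = 4096;
	size_t stride, bytes;

	/* Queue identifiers are 16 bits wide and DSTRD is a 4-bit field. */
	assert(max_qpairs >= 1 && max_qpairs <= 65536);
	assert(dstrd <= 15);

	stride = (size_t)4 << dstrd;
	/* Two doorbells per queue pair: SQ tail and CQ head. */
	bytes = (size_t)max_qpairs * 2 * stride;
	return (bytes + page - 1) & ~(page - 1);
}
```

With the limit as the input, raising or lowering the number of queue
pairs only requires changing one constant.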
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I17e3509fb0a011128ca089ce78b7a296262e6f8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The bdev*_with_md APIs now allow passing a NULL md
pointer, so calling them without checking
for metadata simplifies the code.
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I364a646630bd36120231ea87a41fea05df51befb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15090
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
In the following patches, we will add a feature to inject data
corruption to the error bdev module. For read I/O, we will have
to inject data corruption at completion. However, if we use
spdk_bdev_part_submit_request(), it will not be possible because we
cannot add any custom operation into the completion callback.
To fix the issue, modify spdk_bdev_part_submit_request() and
rename it to spdk_bdev_part_submit_request_ext().
Fortunately, we can use stored_user_cb in struct spdk_bdev_io.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I46d3c40ea88a3fedd8a8fef6b68ee417c814a7a1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15002
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
A session in vhost means an active socket connection from a
client (e.g. QEMU or the SPDK vhost initiator), but the device
state could be `started` or `stopped` because users may
remove the driver of the device in the VM. So in
`foreach_session` we can always call the callback function
without checking the session state, and the callback function
may check the device state if necessary.
Change-Id: Id0fc8c7f6f0915a55a738f0c87ebe6539f7fb2db
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15038
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Now we start the device (virtio-blk and virtio-scsi) when
there is a valid I/O queue (VRING_KICK message). The backend
device `start_session` callback ensures this check, so
when processing VRING_KICK messages for each vring, we can
just call `new_device` if `started` is false. If `started`
is true, the device is already started, and it is safe
for us to add one more vring even if the device is started.
With this change, we don't need to wait for the return value
of `start_session` in synchronous mode; just returning is OK.
Fix #2518.
Change-Id: I92ba3d4e5c38422d7697c1d13180a4a48f0dd4cd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14981
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We stop/start the device multiple times when a new
vring is added, and also stop/start the device when
setting a vring's callfd. Actually, we only need to start
the device after an I/O queue is enabled. DPDK rte_vhost
will not help us start the device in some scenarios,
so this is controlled in SPDK.
Now we improve the workaround to make it consistent with
the vhost-user specification.
For each SET_VRING_KICK message, we set up the newly
added vring, and then we try to start the device.
For each SET_VRING_CALL message, we add one more
interrupt count. Previously this was done when enabling
the vring, which is not accurate.
For each GET_VRING_BASE message, we stop the
device before the first message.
With the above changes, we start/stop the device once, and
any vrings newly added after starting the device will be
polled in the next `vdev_worker` poller.
Change-Id: I5a87c73d34ce7c5f96db7502a68c5fa2cb2e4f74
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14928
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
`vdev_worker` in vhost-scsi is used to process request queues,
and `vdev_mgmt_worker` is used to process the event and control
queue, so we don't need to call `vhost_session_used_signal` in
`vdev_worker`, just remove it.
Change-Id: I86f3e90890e6defba69b01fec131afe1adad3a49
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14927
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently we allocate all VQs' tasks when starting
the device, which does not allow us to add a new VQ after
starting the device, so here we move the allocation to the VQ
setting function.
Change-Id: I59cfc393d66779ab8a0eb704bc73bcede3f0a2a0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14926
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
With this change, we can call the vq settings after the
VRING_KICK message. Currently we stop/start the device
multiple times when a new vq is added.
Change-Id: Icba3132f269b5b073eaafaa276ceb405f6f17f2a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14925
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Feature negotiation is done after SET_FEATURES message, here we
move it in this message context, so that we can use the negotiated
features before starting the device.
Change-Id: Ic6388dbcebd72bc5ef182e65798d34c07f6fc35c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14924
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Before starting a device, the memory table is already
there, so we can check it earlier.
Change-Id: I4996705501577cfa78c89621f7081eb0c3d4dd78
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14923
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
DPDK has merged changes which hide some DPDK
objects such as rte_device and rte_driver from the
public API.
So we add copies of the necessary header files into
our tree, along with a 22.11-specific pci_dpdk
implementation.
These files are copied over exactly, except for one
#include which needs to change from <> to "" so that
it picks up the header in our tree instead of looking
for it in system headers.
Longer-term we may want to look at ways to automate
checking and updating of these header files. DPDK 22.11
isn't officially released yet, so the header files could
change, but we want to get this in now since without
it SPDK cannot build against DPDK tip at all.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I89ffd0abab52c404cfff911c1c9b0cd9e889241d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14570
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
A copy operation is defined by source and destination LBAs and the LBA
count to copy. For the destination LBA and LBA count we reuse the existing
fields `offset_blocks` and `num_blocks` in `struct spdk_bdev_io`. For the
source LBA, a new field `src_offset_blocks` was added.
The `spdk_bdev_get_max_copy()` function can be used to retrieve the maximum
possible unsplit copy size. A value of zero means unlimited. It is allowed
to submit a larger copy size, but it will be split into several bdev IOs.
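To make the splitting rule concrete, here is a small sketch. It is not
the bdev layer's splitting code: struct example_copy_chunk and
example_split_copy() are hypothetical, and only the field names and the
"zero means unlimited" rule are taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One child copy, using the field names from the description above. */
struct example_copy_chunk {
	uint64_t offset_blocks;     /* destination LBA */
	uint64_t src_offset_blocks; /* source LBA */
	uint64_t num_blocks;        /* LBA count */
};

/* Split a copy into chunks of at most `max_copy` blocks. A `max_copy`
 * of zero means unlimited, so the copy stays in one piece. Returns the
 * number of chunks written to `out` (at most `out_cap`). */
static size_t
example_split_copy(uint64_t dst, uint64_t src, uint64_t num_blocks,
		   uint64_t max_copy, struct example_copy_chunk *out, size_t out_cap)
{
	size_t n = 0;

	assert(out != NULL || out_cap == 0);
	while (num_blocks > 0 && n < out_cap) {
		uint64_t len = num_blocks;

		if (max_copy != 0 && len > max_copy) {
			len = max_copy;
		}
		out[n].offset_blocks = dst;
		out[n].src_offset_blocks = src;
		out[n].num_blocks = len;
		n++;
		/* Source and destination advance together. */
		dst += len;
		src += len;
		num_blocks -= len;
	}
	return n;
}
```

For example, copying 10 blocks with a maximum copy size of 4 yields
three chunks of 4, 4 and 2 blocks.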
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I2ad56294b6c062595c026ffcf9b435f0100d3d7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14344
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
Add a new parameter "-c" to display the per-channel I/O statistics
for the required bdev.
    ./scripts/rpc.py bdev_get_iostat -b Malloc0 -h
    usage: rpc.py [options] bdev_get_iostat [-h] [-b NAME] [-c]
    optional arguments:
      -h, --help            show this help message and exit
      -b NAME, --name NAME  Name of the Blockdev. Example: Nvme0n1
      -c, --per-channel     Display per channel IO stats for specified device
This gives more intuitive information on how each channel, with its
associated thread, processes the I/Os on the same bdev.
Please also be aware that the I/O statistics are collected from the
channel that belongs to each SPDK thread, so they relate to the
SPDK thread. In the dynamic scheduling case, different
SPDK threads could be running on the same core.
In this case, each separate channel's I/O statistics are returned to
the RPC caller and, if needed, the data must be parsed further to
get the per-core information, although usually there is one thread
per core.
On the other hand, the user can run the framework_get_reactors RPC
method to get the relationship between threads and CPU cores, so as
to get precise information on the I/Os running on each thread and
each core for the same bdev.
Change-Id: I39d6a2c9faa868e3c1d7fd0fb6e7c020df982585
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13011
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
And also related function pointers and APIs:
spdk_bdev_for_each_channel_msg;
spdk_bdev_for_each_channel_done;
spdk_bdev_for_each_channel_continue;
Change-Id: I52f0f6f27717d53c238faf2f998810c9c5ee45d4
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14614
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
The following patches will allow the caller to specify a custom
completion callback to spdk_bdev_part_submit_request(). To do it
easily, consolidate completions of all I/O types into
bdev_part_complete_io().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I083695189daa7e5271787c50947e428d01a83677
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15001
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We do not support Soft RoCE anymore. Remove a workaround for Soft RoCE's
bug that we may receive a completion without error status after qpair is
disconnected/destroyed. Then add an assert to check that rdma_req->req is
not NULL. This will simplify the code and the following patches.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I80c349053adc0f79679eaf8a5d7265d555d3c2b0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14909
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The following patches will support SRQ and SRQ will be per poller.
We will need SRQ in nvme_rdma_cq_process_completions().
It is not possible to identify poller if poll_group is passed to
nvme_rdma_cq_process_completions().
Based on these thoughts, add poll_group pointer to poller and pass
poller to nvme_rdma_cq_process_completions() instead of poll_group.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I322a7a0cc08bdcc8e87e720ad65dd8f0b6ae9112
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14282
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
NVMe-RDMA target has a helper function get_rdma_qpair_from_wc() and
uses it to identify a qpair from a WC.
NVMe-RDMA initiator has a similar function
nvme_rdma_poll_group_get_qpair_by_id().
NVMe-RDMA initiator will support SRQ in the following patches, and
it will want to identify a qpair from a WC.
get_rdma_qpair_from_wc() of NVMe-RDMA target uses wc->qp_num internally
anyway.
However, the upcoming custom transport for RDMA will have to use other
variables of WC.
Hence, it will be convenient to pass WC instead of qp_num if we consider
future enhancements.
Based on these thoughts, for the NVMe-RDMA initiator, rename
nvme_rdma_poll_group_get_qpair_by_id() to get_rdma_qpair_from_wc(),
remove the unnecessary declaration, and pass WC instead of qp_num.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I01ead4730207e2c6ac53b83f151bd5f977a11465
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14279
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Poller will have more shared resources when SRQ is supported.
This is a preparation.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ic3d1cb93dde3f53653a9536a103e5518cebd58e1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14173
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
nvme_rdma_ctrlr_disconnect_qpair() does not poll the qpair until it is
actually disconnected if it is in a poll group even if its async mode
is disabled. Hence, spdk_nvme_ctrlr_free_io_qpair() removes the qpair
from a poll group when it is being disconnected.
On the other hand, I/O qpair is destroyed after it is actually
disconnected.
When SRQ is enabled and used, a SRQ is destroyed if the corresponding
poller does not have any I/O qpair after an I/O qpair is removed from
the poller.
In particular, if we use spdk_nvme_ctrlr_free_io_qpair(), a SRQ is
destroyed before the corresponding I/O qpairs are destroyed.
Destroying a SRQ failed because it is still referenced by I/O qpairs.
This bug was found when running the SPDK NVMe perf tool with SRQ.
The reason was we had nvme_rdma_poll_group_process_completions() to call
disconnected_qpair_cb after the qpair is actually disconnected.
However, it is ensured that nvme_rdma_poll_group_process_completions()
calls disconnected_qpair_cb for any disconnected qpair.
Hence, remove a check if qpair->poll_group is not NULL from
nvme_rdma_ctrlr_disconnect_qpair() and update the comment.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0fde0d827eec3280e1cc5a0fce34d163a6069bc4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
With RDMA, the admin poller can experience a remote disconnect when
processing completions. The admin qpair will be disconnected to handle
this. The disconnect code path will manually complete queued aborts.
However, the completion callback for the abort will attempt to resubmit
other queued aborts from the queue, which will result in a very large
stack and can eventually cause a segfault.
The fix is to not resubmit queued aborts if the admin qpair is in any
kind of failed state.
Change-Id: I4a6f959232c8a1bd30c87ca50459014e556cbaa0
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15114
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
A loop inside 'nvme_tcp_qpair_process_completions' makes
'max_completions' actually behave like a minimum:
    do {
        rc = nvme_tcp_read_pdu(tqpair, &reaped);
        [...]
    } while (reaped < max_completions);
Before this change, the 'max_completions' constraint, in its true sense,
was not respected, and a loop inside 'nvme_tcp_read_pdu'
could be executed indefinitely as long as the recv state changed.
To prevent this behavior, max_completions must be passed to
'nvme_tcp_read_pdu' and used as an additional exit condition.
Signed-off-by: Szulik, Maciej <maciej.szulik@intel.com>
Change-Id: I28da962f4a62f08ddb51915b5d0dae9611a82dee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15136
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Some reset/disable paths are freeing the shadow doorbells without
switching the SQs back to BAR0. Fix this up, and add a small cleanup
when initializing the shadow doorbells.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ia5e5b91b7dc696a558eb0ad59cc554abced47cca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14901
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
To support SQs allocated to a poll group other than the controller's
main poll group, we need to make sure to poll those SQs when we wake up
and handle the controller interrupt. As they will be running in a
separate SPDK thread, we will arrange for all poll groups to wake up
when we receive an interrupt corresponding to a vfio-user message
arriving.
This can mean needless wakeups: we don't (yet) have a mechanism to only
wake up the poll groups that correspond to a particular SQ write.
Additionally, as we don't have any notion of a poll group per
controller, this ends up polling all SQs in the entire poll group, not
just the ones corresponding to the controller we were handling.
As this has potential performance issues in many cases, it defaults to
disabled.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I3d9f32625529455f8d55578ae9cd7b84265f67ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14120
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When crc32c is invoked with a multiple-entry input iov,
only the last op has crc_dst set in order to write the final
crc value into the user-supplied location.
For every successfully completed CRC op, spdk_idxd_process_events()
writes the value into *op->crc_dst
UNLESS it is NULL.
The problem is that _idxd_prep_batch_cmd(), which allocates
new ops, left op->crc_dst uninitialized.
This results in memory corruption (use after free)
in the following scenario:
1) Op A is allocated and crc_dst is set to point to user memory X.
2) Op A is completed.
3) User memory X is freed.
4) Ops B and C are allocated (chained), C has crc_dst set.
=> B reused op A memory and crc_dst still points to the
now stale user location from (1).
5) B is completed, spdk_idxd_process_events() writes into X
as B->crc_dst = X.
Fix: _idxd_prep_batch_cmd() should initialize crc_dst to NULL.
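The scenario above can be reproduced with a toy model of a reused op
slot. This is a sketch only: struct fake_op, fake_prep_op() and
fake_complete_op() are hypothetical stand-ins for the idxd structures
and functions named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy op: only the two fields relevant to the bug. */
struct fake_op {
	uint32_t crc_result;
	uint32_t *crc_dst; /* where to store the result, or NULL */
};

/* Prepare a (possibly reused) slot. Resetting crc_dst here is the fix:
 * without it the slot keeps the pointer set by its previous user. */
static struct fake_op *
fake_prep_op(struct fake_op *slot)
{
	assert(slot != NULL);
	slot->crc_result = 0;
	slot->crc_dst = NULL;
	return slot;
}

/* Completion: write the result only if a destination was requested. */
static void
fake_complete_op(struct fake_op *op)
{
	if (op->crc_dst != NULL) {
		*op->crc_dst = op->crc_result;
	}
}
```

Because the slot is reset on every allocation, an op that never sets
crc_dst cannot write through a pointer left behind by an earlier op.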
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Change-Id: I9e7d57ec43a8fbcb3750906015a5cb7291278c35
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15115
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We were missing a check when ISAL uses the complete output buffer
on compression to determine whether it was a perfect fit or
simply not enough buffer was provided.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I73532666f50cb9fbef3c42f6bfb25fc5c7de01c6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Prevent user from switching back to static scheduler after
different scheduler has been selected. Currently we
do not have a way to save initial thread distribution
configuration, so each time user switches from dynamic
scheduler back to static, the SPDK threads may end up on
different reactors. This would cause discrepancy in
performance statistics of SPDK managed by static scheduler.
Change-Id: Ic17a6be55eaea0e1a748f92e01f7075540403637
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15055
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This helps generate slightly better code in this function,
which can have a noticeable impact for high trace
event workloads.
Tested with bdevperf, single malloc or null bdev,
qd=32, 512B randreads on a single Xeon core.
Specify "-e bdev" to enable bdev trace events.
Null:
Before: 8.09M/s (123ns per IO)
After: 8.68M/s (115ns per IO)
Malloc:
Before: 4.21M/s (237ns per IO)
After: 4.34M/s (230ns per IO)
Note that each bdev I/O generates two trace events (START
and END) - meaning this change removes 7-8ns of overhead
for every 2 trace events, at least on my system.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7021b7f9e28b4a7cb16f8a97b4d4004ae165efd2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15096
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
When spdk_idxd_submit_crc32c() handles input
with multiple iovs (or multiple ops are generated
due to physically discontinuous buffers),
the first op has the original seed, while the
subsequent ops instruct the hardware
to fetch the seed from the output of the previous op
(op->hw.crc32c_val):
void *prev_crc;
...
desc->flags |= IDXD_FLAG_FENCE | IDXD_FLAG_CRC_READ_CRC_SEED;
desc->crc32c.addr = (uint64_t)prev_crc; <<< virtual addr
The problem is the prev_crc is a virtual address,
so the hardware (at least with no IOMMU configured)
reports: DSA_COMP_HW_ERR1
spdk_idxd_process_events: Completion status 0x20
Solution:
Set crc32c.addr to the physical address of
the crc32c_val field in the previous desc.
Since desc->completion_addr already holds the physical address
of the dsa_hw_comp_record, we use this with the crc32c_val offset.
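A minimal sketch of the address computation, assuming a made-up completion
record layout. struct sketch_comp_record and sketch_prev_crc_addr() are
hypothetical; the real record layout comes from the hardware definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical completion record layout, for illustration only. */
struct sketch_comp_record {
	uint8_t status;
	uint8_t reserved[7];
	uint32_t crc32c_val;
	uint8_t pad[20];
};

/* Given the physical address of the previous op's completion record,
 * return the physical address of its crc32c_val field. No virtual
 * address is involved, so no extra translation is needed. */
static uint64_t
sketch_prev_crc_addr(uint64_t prev_completion_addr)
{
	assert(prev_completion_addr != 0);
	return prev_completion_addr + offsetof(struct sketch_comp_record, crc32c_val);
}
```

The result would be what a chained descriptor stores as its seed address.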
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Change-Id: I330e98c2f3fd6da5cb4fc03d0745df09a9ff0e0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14954
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It allows users to specify a path to the RPC socket on an NFS-mounted
filesystem. This is necessary because flock(2) on NFS requires
write access to place an exclusive lock.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If197498ed5bdcb4e02c5f2f2b2c1ef388872c457
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14993
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
When Link Time Optimization is enabled, compiler can sometimes produce
additional warnings saying that some variables may be uninitialized.
To suppress the warning it is enough to add explicit initialization
of the variable causing the issue, in this case '*module_name = NULL'
and "*writer = NULL".
Signed-off-by: Szulik, Maciej <maciej.szulik@intel.com>
Change-Id: I30492115b28a18554b08a6f575cbcc9538f3b848
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14849
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Fixes #2693
SPDK threads should not be placed in interrupt mode
if the application does not have interrupt mode enabled.
This resulted in a race condition: a thread could be scheduled
on a reactor while that reactor was being placed in interrupt mode.
Scheduling the thread is valid, but in this case there should
never be an attempt to change the thread's mode.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I10b0bbacac1df812badb91b37064528f66743e51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14815
Reviewed-by: Michal Berger <michal.berger@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The patch below added copies of PCI-related headers to keep
compatibility with DPDK <= 22.07.
(1eb35ac) env_dpdk: add copies of 22.07 pci-related header files
Unfortunately the rte_bus/bus_pci/dev headers from DPDK 22.07 are
not compatible going back to DPDK 20.11.
The issues are:
- lack of RTE_TAILQ_ENTRY defined in rte_os.h
- rte_intr_handle being part of rte_pci_device rather than pointer
pci_dpdk_2207.c even before this patch is not binary compatible with
DPDK 20.11 - see pci_device_*_interrupt_2207() functions.
There would need to be another copy of headers matching that version
of DPDK to resolve this issue.
SPDK supports up to the two latest LTS releases. Right now that includes
DPDK 20.11, which will soon be dropped due to the DPDK 22.11 release.
Having compile-time defines here keeps the older DPDK working.
Meanwhile backwards compatibility in SPDK is no worse than before.
The recent changes to env_dpdk aim to improve support
for newer versions of DPDK.
Change-Id: If4dc601cb03e18c2cad61f3a93080e8265ca5fcc
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14795
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add new bdev property split_on_write_unit which, if set to true, causes
writes to be split to match write_unit_size and fail if not aligned to
or not multiple of write_unit_size.
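The rule can be sketched as a small check that either rejects a write or
reports how many write-unit-sized children it would be split into.
sketch_split_write() is a hypothetical helper, not the bdev layer's actual
splitting code, and the -EINVAL return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Return the number of child writes of exactly write_unit_size blocks,
 * or -EINVAL if the write is not aligned to, or not a multiple of,
 * write_unit_size. */
static int
sketch_split_write(uint64_t offset_blocks, uint64_t num_blocks, uint32_t write_unit_size)
{
	assert(write_unit_size > 0);
	if (offset_blocks % write_unit_size != 0 ||
	    num_blocks % write_unit_size != 0) {
		return -EINVAL;
	}
	return (int)(num_blocks / write_unit_size);
}
```

For example, with a write unit of 8 blocks, a 32-block write at offset 0
would become four children, while the same write at offset 4 would fail.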
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Change-Id: Id49f58a3288ddf5cfe4921ce4020ae4bcdd67298
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11390
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The client vfio_user library doesn't require this flag as
it is totally owned in SPDK, so remove it.
Change-Id: I8f7b1df18017ceac24dbb8a0417871f25f6bee0d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13895
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously SPDK used the libvfio-user library to provide emulated NVMe
devices to VMs, but that was limited to the NVMe device type only. Here we
add an SPDK vfu_target library abstraction based on libvfio-user which
supports more PCI device types.
We will add virtio-blk and virtio-scsi device emulation based on the
vfu_tgt library in following patches. This library could
support NVMe emulation too, but since the NVMe emulation
already exists, we will keep the NVMe emulation based
directly on libvfio-user as it is.
Change-Id: Ib0ead6c6118fa62308355fe432003dd928a2fae9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12597
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This allows eliminating dpdk_pci_device_vtophys and
dpdk_pci_device_map_bar, reducing the amount of
code we need to maintain in the per-DPDK version
implementations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I73d15eb75bf7fe8340d85494425e15651fec5425
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14722
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Break this function up into three APIs instead:
* dpdk_pci_device_get_addr
* dpdk_pci_device_get_id
* dpdk_pci_device_get_numa_node
This more clearly delineates the requirements we
have from the DPDK PCI device/driver APIs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie585c8252d63c15c6e6884d60f8a064c3f0ab94f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14684
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Moving forward, we want to still be able to run against
<= 22.07 versions of DPDK, which exposed the necessary
data structures in public header files. But since we
will be building against newer versions of DPDK which
don't expose them publicly, we need a copy of the 22.07
header files in our tree.
Exclude these header files from astyle and POSIX include
file checks in check_format.sh
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icd8a067af41a2ba031ce8f875a8a2b63f722ab69
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14683
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
This was a remnant from ages ago when we had rte_vhost
DPDK code copied into our repo. We actually have a file
named rte_vhost_user.c which is not DPDK code that was
getting excluded from astyle checking.
So this also includes the astyle violations that had
crept into this file. In a couple of places, change
the enum return type to int, this reduces astyle
confusion on function and if brace style.
Same applies to POSIX include checking - we don't need
to exclude rte_vhost_user.c from this either.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If3a25011ad54c694c15a91f7be66d862c765c5db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14688
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For example, when called from spdk_bdev_get_current_qd(), if
spdk_for_each_channel() fails to allocate struct spdk_io_channel_iter,
it just returns and the ctx allocated in spdk_bdev_get_current_qd()
is not released.
Instead of changing the public API of spdk_for_each_channel() to return
the failure status so the caller can handle the NOMEM case and
release the allocation, this patch just adds an assert.
Change-Id: I6a95207dd390586bdae4e86e5d550cdac709e10a
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14657
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
max_aq_depth should be no smaller than 2 and no greater
than 4096
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I205fbb4345cfdc41ebaf30c953da263fe9f0e9a8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14691
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
max_queue_depth should be no smaller than 2 and no greater
than 65536
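A minimal sketch of such a range check, using the limits quoted in these
messages. The macro names and sketch_check_depth() are hypothetical, and
whether the real code rejects or adjusts an out-of-range value is not stated
here; this sketch simply rejects it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Limits quoted in the commit messages; the macro names are made up. */
#define SKETCH_MIN_DEPTH        2u
#define SKETCH_MAX_QUEUE_DEPTH  65536u
#define SKETCH_MAX_AQ_DEPTH     4096u

/* Return 0 if value lies in [min, max], -EINVAL otherwise. */
static int
sketch_check_depth(uint32_t value, uint32_t min, uint32_t max)
{
	assert(min <= max);
	if (value < min || value > max) {
		return -EINVAL;
	}
	return 0;
}
```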
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I0f2a4b8df6eb1b140a11936fc6929f1285a7d717
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14619
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Refine the macro definition name about queue depth and
prepare for next patch.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I85bee2528ae4ab70292fc11aa62d05bae0c28a77
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14664
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Delete bit masks from trace help (found inside
build/bin/spdk_tgt -h help text), as they do not
provide useful information, are much harder to
remember and use, and might leave the user confused.
Since we provide trace group names anyway, bit masks
are excessive.
Change --tpoint-group-mask parameter name to
--tpoint-group, because we do not provide
bit masks anymore.
Drop "default" tpoint group mask from help text,
since it does not enable any tracepoints and
may confuse the user.
Change-Id: I2ca780883dfa7822e76523e9ba1fc65a7bfe5a99
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14656
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When Link Time Optimization is enabled, compiler can sometimes produce
additional warnings saying that some variables may be uninitialized.
To suppress the warning it is enough to add explicit initialization
of the variable causing the issue, in this case 'iovcnt = 0'.
Signed-off-by: Szulik, Maciej <maciej.szulik@intel.com>
Change-Id: I080b20a6008643ae78c8e3a6c2d183193ef6c1bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14674
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Community-CI: Mellanox Build Bot
When data_local.num_async_events >
SPDK_NVMF_MIGR_MAX_PENDING_AERS, data_local.async_events
was indexed at 256, which is out of bounds.
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Change-Id: I15cfdeb9bc165de0c73fbc9171b0ce6d8689c0aa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14666
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
If the kernel is booted with the IOMMU enabled and Shared Memory mode
enabled (which are the expected boot parameters for production servers),
then the kernel idxd driver will automatically register a dedicated work
queue with the PASID for the process that opens it. This means that the
descriptors written into the portal for that work queue should be
*virtual* addresses.
If the IOMMU is enabled but Shared Memory mode is disabled, then the
kernel has registered the device with the IOMMU and assigned it I/O
virtual addresses. We have no way to get those addresses from user
space, so we cannot use the kernel driver in this mode. Add a check to
catch that.
If the IOMMU is disabled, then physical addresses are used everywhere.
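The three cases can be summarized as a small decision function. This is a
sketch only: the enum, the function name and the -ENOTSUP return are
hypothetical, and how the two boot settings are actually detected is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_addr_mode {
	SKETCH_ADDR_VIRTUAL,
	SKETCH_ADDR_PHYSICAL,
};

/* Pick the kind of address to place in descriptors for the kernel idxd
 * driver, or return -ENOTSUP for the combination that cannot be used. */
static int
sketch_kernel_idxd_addr_mode(bool iommu_enabled, bool shared_mode_enabled,
			     enum sketch_addr_mode *mode)
{
	assert(mode != NULL);
	if (!iommu_enabled) {
		/* No IOMMU: physical addresses everywhere. */
		*mode = SKETCH_ADDR_PHYSICAL;
		return 0;
	}
	if (shared_mode_enabled) {
		/* IOMMU + Shared Memory mode: the work queue is bound to the
		 * process PASID, so descriptors carry virtual addresses. */
		*mode = SKETCH_ADDR_VIRTUAL;
		return 0;
	}
	/* IOMMU without Shared Memory mode: I/O virtual addresses are not
	 * visible from user space. */
	return -ENOTSUP;
}
```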
Change-Id: I0bf079835ad4df1128ef9db54f5564050327e9f7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14019
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
The DSA specification calls out that software must use a memory barrier
such as sfence prior to writing a descriptor or incorrect data may be
transferred during the operation.
Change-Id: I12f20e5a748e41616c7a542ccdb158c6b548eea4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14018
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
By doing the registration immediately upon mapping the BAR instead of
when the memory is inserted into the spdk_mem_map, we're able to
register BARs that are not 2MB multiples in size and alignment. The SPDK
API for registering a BAR already returns the physical/io address in the
map call, and it can be used directly without a call to
spdk_mem_register().
If the user does elect to later register the BAR using
spdk_mem_register(), we attempt to insert the 2MB aligned segments we
can into the spdk_mem_map. Users may still need to register memory for a
few reasons, such as making spdk_vtophys() work, or for setting up the
BAR as a target for RDMA. These cases still require 2MB aligned and
sized segments.
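As an illustration of "the 2MB aligned segments we can", the helper below
computes the largest 2MB-aligned, 2MB-multiple sub-range inside a BAR.
sketch_aligned_subrange() is hypothetical and ignores address overflow; it is
not the env_dpdk implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_2MB (2ULL * 1024 * 1024)

/* Find the largest sub-range of [addr, addr + len) whose start and length
 * are both multiples of 2MB. Returns false if no such sub-range exists. */
static bool
sketch_aligned_subrange(uint64_t addr, uint64_t len,
			uint64_t *out_addr, uint64_t *out_len)
{
	uint64_t start, end;

	assert(out_addr != NULL && out_len != NULL);
	start = (addr + SKETCH_2MB - 1) & ~(SKETCH_2MB - 1); /* round start up */
	end = (addr + len) & ~(SKETCH_2MB - 1);              /* round end down */
	if (end <= start) {
		return false;
	}
	*out_addr = start;
	*out_len = end - start;
	return true;
}
```

A 5MB BAR starting at the 1MB boundary would yield a 4MB registrable segment
starting at 2MB, while a 4KB BAR yields nothing to insert into the map.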
Change-Id: I395ae8803ec4bf22703f6f76db54200949e82532
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14017
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
These variants did not exist in DPDK 20.11 which is
still supported by SPDK.
So we will instead need to scan the rte_version()
string to get these values.
Fixes issue #2715.
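Scanning the version string could look roughly like this. The sketch assumes
the string has the form "DPDK <year>.<month>.<minor>" (for example
"DPDK 22.07.0"); sketch_parse_dpdk_version() is a hypothetical helper that
takes the string as an argument rather than calling into DPDK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Parse year, month and minor out of a version string such as
 * "DPDK 22.07.0". Returns 0 on success, -EINVAL if the string does not
 * match the assumed format. Any suffix after the minor is ignored. */
static int
sketch_parse_dpdk_version(const char *version, int *year, int *month, int *minor)
{
	assert(version != NULL && year != NULL && month != NULL && minor != NULL);
	if (sscanf(version, "DPDK %d.%d.%d", year, month, minor) != 3) {
		return -EINVAL;
	}
	return 0;
}
```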
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I79657002a7a605a38a0d98b944ac53c02fa6d78c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14661
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Pawel Piatek <pawelx.piatek@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
In-capsule data length should be the same as the SGL data length.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I7eefecb8baebb76850a48689907aff27a8946f98
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14602
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Fix error handling that violated the spec:
1. 'data length > MAXH2CDATA' is a fatal error.
2. 'ICDOFF != 0' should abort the IO.
Other errors which are not defined in spec:
1. invalid sgl type
2. In-capsule Data length > In-capsule Data size
Because this function runs before the data part is received, it is hard
to skip the following data segment if we want to handle some errors
as non-fatal.
Currently, we have to handle all undefined errors as fatal errors.
I think after this release, we can change the receiving process. This will
be helpful for error handling. But this work is not small.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I8fc0d2d743505e49a93be19fd217e7ad6ca06622
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14580
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Fuzzing vfio-user requires access to the send request API.
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I6c58b8ab4fd3394150bbb3e64b4f95bff93dae6e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13881
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
During fuzzing, the vfio-user client and server are started from the same
process, causing a deadlock. SO_PEERCRED returns the pid of the process
connected to the vfio endpoint.
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I6fc2db5d58a459a30fec116a9de3c69d48acf75e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14559
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This checks the current version to make sure we have
a dpdk_fn_table that supports it.
This is easy for now, since the DPDK PCI API is
public. Moving forward, DPDK 22.11 will likely make
these APIs private, requiring us to carry header file
copies for different DPDK versions so that we can
not only build against DPDK but also use the correct
data structures and APIs to interact with those private
DPDK interfaces. We will also need to consider
minor (i.e. stable or point) releases since they
could technically change PCI ABI as well - the current
year + month checks won't be sufficient.
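A year + month check of this kind might be sketched as follows. Everything
here is hypothetical: struct sketch_fn_table, g_sketch_fn_table_2207 and the
choice to treat anything newer than 22.07 as unsupported are assumptions made
for illustration, not the actual env_dpdk logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-DPDK-version function table. */
struct sketch_fn_table {
	const char *name;
};

static const struct sketch_fn_table g_sketch_fn_table_2207 = { "2207" };

/* Return the table that supports the running DPDK version, or NULL if
 * this sketch has no table for it. */
static const struct sketch_fn_table *
sketch_select_fn_table(int year, int month)
{
	assert(month >= 1 && month <= 12);
	if (year < 22 || (year == 22 && month <= 7)) {
		return &g_sketch_fn_table_2207;
	}
	return NULL; /* newer release: a different table would be needed */
}
```

An init routine could then fail early when the selection returns NULL instead
of calling through a table built for a different PCI ABI.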
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic9f41d9d13778f3d078b20b08da48d8d16362b11
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14637
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This allows it to return error codes. Have the
init code check the return value and fail the init
process when pci_env_init() returns error.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7c8a4f9a6da6b3438ed09a881153b7a4ceef3a83
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14635
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Get ready to have multiple implementations of the
dpdk_fn_table. We could do some fancy self-registering
constructor functions, but let's just keep it simple
for now and extern declare each implementation in
the pci_dpdk.h header file.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8f5621412d1c8bd22c95ab74ef66c5bcc41d1380
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14636
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>