The kernel vfio_pci driver module introduced vf_token checking
mechanism since kernel version 5.7, and has been supported by
DPDK. So add support for it to deal with the scenario of VF.
Signed-off-by: Jun Zeng <jun1.zeng@intel.com>
Change-Id: Ie9700fa395327da4e847c6213167284c148a64e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14424
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Found with misspell-fixer.
Signed-off-by: Michal Berger <michal.berger@intel.com>
Change-Id: If062df0189d92e4fb2da3f055fb981909780dc04
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15207
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
While lvs->destruct is set in a few places, it is never read. Since it
is not used, it is removed.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Iee21e92c9049d143fca13930b4b5f328f9ec38f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15716
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Copy-on-write happens when cluster is written for the first time for
thin provisioned volume. Currently it is implemented as two separate
requests to underlying bdev: read of the whole cluster to bounce
buffer and then write of this buffer to the new location on the same
underlying bdev.
This patch improves copy-on-write flow by utilizing copy command of
underlying bdev if it is supported. In this case we have just one
request to bdev and don't need the bounce buffer.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I92552e0f18f7a41820d589e7bb1e86160c69183f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14351
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
New `translate_lba` operation allows to translate blob lba to lba on
the underlying bdev. It recurses down the whole chain of bs_dev's. The
operation may fail to do the translation when blob lba is not backed
by the real bdev. For example, when we eventually hit zeroes device in
the chain.
This operation is used in the next commit to get source LBA for copy
operation.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I89c2d03d1982d66b9137a3a3653a98c361984fab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14528
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Add max/min_read/write/unmap_latency_ticks into the struct
spdk_bdev_io_stat.
When initializing or resetting the instance of the struct
spdk_bdev_io_stat, initialize max to 0 and min to UINT64_MAX.
Then update max if a new value is larger than the current max,
and update min if a new value is smaller than the current min.
For the bdev_get_iostat RPC, it prints max and prints min if min is not
UINT64_MAX or 0 if min is UINT64_MAX.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I1b30b3825c15e37e9f0cf20104b866186de788a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14825
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The following patches will extend I/O statistics to include error
counters and module specific counters to output these via the
bdev_get_iostat RPC.
In this case, the size of the struct spdk_bdev_iostat will be variable.
As a preparation, allocate spdk_bdev_io_stat dynamically.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I1979a9d867859d5cb5d05717bfcc677f07fa03f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15479
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
When use of deprecated featues is encountered, SPDK now calls
SPDK_LOG_DEPRECATED(). This logs the use of deprecated functionality in
a consistent way, making it easy to add further instrumentation to catch
code paths that trigger deprecated behavior.
Change-Id: Idfd33ade171307e5e8235a7aa0d969dc5d93e33d
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15689
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
This introduces an enhanced spinlock that adds safeguards compared to
the default pthread_spinlock_t. In particular:
- A pthread_spinlock_t is still used, but additional error checking is
performed to ensure there is no undefined behavior on relock,
unlocking when not the owner, or destoying a locked lock.
- The SPDK concurrency model allows an SPDK thread to be migrated
between pthreads. Releasing a pthread spinlock on a different thread
from where it is taken is undefined behavior. If an SPDK spinlock is
held at a time that a time when a poller or message returns control to
thread_poll(), the program will abort.
- SPDK spinlocks can only be obtained from an SPDK thread.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I6dd6493ab5f5532ae69e20654546405a507eb594
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15277
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
This is unsafe, because we touch need_buf_* queues, which aren't
thread-safe. Also, documented this requirement in
spdk_bdev_io_get_buf()'s description.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iabc141e051c543fdd51f079ae212f69e980d8148
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15668
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
We make decisions on how to pick a poll group for a new
qpair by looking at each poll group's current_io_qpairs
count. But this count isn't always accurate since it
doesn't get updated until after the CONNECT has
been received.
This means that if we accept a bunch of connections
all at once, they may all get assigned the same poll
group, because the target poll groups counter doesn't
get immediately incremented.
So add a new counter, current_unassociated_qpairs,
to account for these qpairs. We protect this counter
with a lock, since the accept thread will increment
the counter, and the poll group thread will
decrement it when the qpair receives the CONNECT
allowing us to associated with a subsystem/controller..
If the qpair gets destroyed before the CONNECT is
received, we can use the qpair->connect_received
flag to decrement current_unassociated_qpairs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8bba8da2abfe225b3b9f981cd71b6f49e2b87391
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15693
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Currently we use qpair->ctrlr at qpair destroy
time to decide if we need to decrement the
qpair's poll group's qpair count. But this is
not correct - these counters get incremented
when the CONNECT is received, but qpair->ctrlr
doesn't get set until later.
So add a new connect_received bool to the spdk_nvmf_qpair.
Use this instead to determine when we should decrement
the poll group qpair counters.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I174a0fda36c4558171953bf58f2f5117bc074f76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15692
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
NVMf target reports copy command support if all bdevs in the subsystem
support copy IO type. Maximum copy size is reported for each namespace
independently in namespace identify data. For now we support just one
source range.
Note, that command support in the controller is initialized once on
controller create. If another namespace which doesn't support copy
command is added to the subsystem later, it will not be reflected in
the controller data structure and will not be communicated to the
initiator. Attempt to execute copy command on such namespace will
fail. This issue is not specific to copy command and applies also to
write zeroes and unmap (dataset management) commands.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I5f06564eb43d66d2852bf7eeda8b17830c53c9bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14350
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Added new API 'spdk_bdev_histogram_get_channel' to get histogram of
a specified channel for a bdev. A callback function is passed to it
to process the histogram.
Change-Id: If5d56cbb5fe6c39cda7882f887dcc9c6afa769ac
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15539
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
SPDK threads generally run on dedicated cores and locks should be rarely
contended. Thus, putting a thread to sleep while waiting on a mutex does
not free up CPU cycles for other pthreads or processes. Even when
running in interrupt mode, lock contention should be low enough that
spinlocks are a net win by avoiding context switches.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I6e2e78b2835bbadb56bbec34918d998d75280dfd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This function can be useful to query if a thread
had spdk_thread_exit() called on it yet. Internally
we have both EXITING and EXITED state - so
!spdk_thread_is_running() can be used to detect a
thread that is in either of those states.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2f6fb024a6b1bc895fdc5132c722abc10f5d30f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15512
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The "app thread" will always be the first
thread created using spdk_thread_create(). There
are many operations throughout SPDK that implicitly
expect to happen in the context of this app thread,
so by formalizing it we can start to make assertions
on this to help clarify and simplify locking and
synchronization through the code base.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7133b58c311710f1d132ee5f09500ffeb4168b15
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15497
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
`BDEV_IO_NUM_CHILD_IOV` and `BDEV_RESET_IO_DRAIN_RECOMMENDED_VALUE`
are public macro definitions without `SPDK_` prefix, so we add the
`SPDK_` prefix to them.
Change-Id: I4be86459f0b6ba3a4636a2c8130b2f12757ea2da
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15425
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This patch introduces function spdk_fd_group_get_epoll_event, which
returns the epoll(7) event that caused the file descriptor group
callback function to execute. Rather than changing the signature of
spdk_fd_fn in order to pass the struct epoll_event, which would result
in a gigantic patch where there vast majority of users would simply have
to ignore the new argument, we introduce this new API that allows to
return the epoll_event only when really needed.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Suggested-by: John Levon <john.levon@nutanix.com>
Change-Id: I3debe1382d1c2bfec6ae4fea274ee38ed0b135fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14935
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
We should always build all function that are part of the API, even if
some of the libraries they depend on are missing. In that case, they
can return an error instead.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I72b450b3a1d62e222bd843e45be547d926414775
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15414
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add an option "--generate-uuids" to bdev_nvme_set_options
RPC to enable generation of UUIDs for NVMes devices that
do not provide this value themselves. The identifier is
based on a serial number of the device, so a bdev
using this NVMe will always be assigned the same UUID.
Part of enhancement from #2516.
Change-Id: I86d76274e5702d14ace89d83d1e9129573f543e2
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15151
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Update code for read the virtual address width to use glob to locate the
Intel and AMD iommu capability registers. This code should work for all
AMD numa configurations.
Fixes issue 2730
Signed-off-by: Michael Piszczek <mpiszczek@ddn.com>
Change-Id: Ibf5789087b7e372d892b53101e4c0231809053f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14961
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Add an optional allowlist for RPC methods: if the method is not listed,
it is not allowed to be called or visible. This can be used to restrict
accidental mis-configurations, and generally helps locking down the
configuration surface.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ied78fc4b14b60cb94ed0852b92deb6df545cbec4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15275
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
per Intel policy to include file commit date using git cmd
below. The policy does not apply to non-Intel (C) notices.
git log --follow -C90% --format=%ad --date default <file> | tail -1
and then pull just the 4 digit year from the result.
Intel copyrights were not added to files where Intel either had
no contribution ot the contribution lacked substance (ie license
header updates, formatting changes, etc). Contribution date used
"--follow -C95%" to get the most accurate date.
Note that several files in this patch didn't end the license/(c)
block with a blank comment line so these were added as the vast
majority of files do have this last blank line. Simply there for
consistency.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id5b7ce4f658fe87132f14139ead58d6e285c04d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Add API spdk_bdev_io_get_submit_tsc to get submit tsc of a bdev I/O,
which can be used in bdev modules to avoid calling expensive
spdk_get_ticks().
Change-Id: Ifbcecb1bc663344997c5e73b72a1dfb5d0422946
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14989
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
It doesn't make sense to have the size of the doorbells fixed and then
calculate the maximum number of queue pairs based on it, do it the other
way round. Also, add some sanity checks based on the spec.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I17e3509fb0a011128ca089ce78b7a296262e6f8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In the following patches, we will add a feature to inject data
corruption to the error bdev module. For read I/O, we will have
to inject data corruption at completion. However, if we use
spdk_bdev_part_submit_request(), it will not be possible because we
cannot add any custom operation into the completion callback.
To fix the issue, modify spdk_+bdev_part_submit_request() and
rename it to spdk_bdev_part_submit_request_ext().
Fortunately, we can use stored_user_cb in struct spdk_bdev_io.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I46d3c40ea88a3fedd8a8fef6b68ee417c814a7a1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15002
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Copy operation is defined by source and destination LBAs and LBA count
to copy. For destiantion LBA and LBA count we reuse exiting fields
`offset_blocks` and `num_blocks` in `struct spdk_bdev_io`. For source
LBA new field `src_offset_blocks` was added.
`spdk_bdev_get_max_copy()` function can be used to retrieve maximum
possible unsplit copy size. Zero values means unlimited. It is allowed
to submit larger copy size but it will be split into several bdev IOs.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I2ad56294b6c062595c026ffcf9b435f0100d3d7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14344
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
And also related function pointers and APIs:
spdk_bdev_for_each_channel_msg;
spdk_bdev_for_each_channel_done;
spdk_bdev_for_each_channel_continue;
Change-Id: I52f0f6f27717d53c238faf2f998810c9c5ee45d4
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14614
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
To support SQs allocated to a poll group other than the controller's
main poll group, we need to make sure to poll those SQs when we wake up
and handle the controller interrupt. As they will be running in a
separate SPDK thread, we will arrange for all poll groups to wake up
when we receive an interrupt corresponding to a vfio-user message
arriving.
This can mean needless wakeups: we don't (yet) have a mechanism to only
wake up the poll groups that correspond to a particular SQ write.
Additionally, as we don't have any notion of a poll group per
controller, this ends up polling all SQs in the entire poll group, not
just the ones corresponding to the controller we were handling.
As this has potential performance issues in many cases, it defaults to
disabled.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I3d9f32625529455f8d55578ae9cd7b84265f67ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14120
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In a use case, a custom sock module supports zero copy for read.
The custom sock module wants to keep the recv_buf_size to be sufficiently
large, for example 16MB. However, most upper layers overwrite the recv_buf_size
by a smaller value via spdk_sock_set_recvbuf() later. This is not desirable.
To fix the described issue, change the meaning of impl_opts->recv_buf_size
to be the minimum size, and spdk_sock_set_recvbuf() uses the maximum value
among the requested size, g_spdk_sock_impl_opts->recv_buf_size, and
MIN_SO_RCVBUF_SIZE.
We may have to change the code to initially create a socket. However,
for most cases, the upper layer calls spdk_sock_set_recvbuf() anyway.
Hence this fix will be minimal and enough.
For the use case, it is enough to change recv_buf_size of the posix sock
module. However, the custom sock module may support I/O uring in future.
Hence, change I/O uring sock module together.
Additionally, for consistency, change the meaning of impl_opts->send_buf_size
to be the minimum size, and spdk_sock_set_sendbuf() uses the maximum value
among the requested size, g_spdk_sock_impl_opts->send_buf_size, and
MIN_SO_SNDBUF_SIZE.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I051ba7cb50bc9dcad229e922198b04fe45335219
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14915
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Mellanox Build Bot
The callback is executed with the opaque argument specified by the user,
not accel_task.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6f4388d8c4ceb37d4df174a6f4c0529012fe1f28
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15020
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add new bdev property split_on_write_unit which, if set to true, causes
writes to be split to match write_unit_size and fail if not aligned to
or not multiple of write_unit_size.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Change-Id: Id49f58a3288ddf5cfe4921ce4020ae4bcdd67298
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11390
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously SPDK use libvfio-user library to provide emulated NVMe
devices to VM, but it's limited to NVMe device type only. Here we
add SPDK vfu_target library abstraction based on libvfio-user which
supports more PCI device types.
We will add virtio-blk and virtio-scsi devices emulation based on
vfu_tgt library in following patches, actually this library can
support NVMe emulation too, due to the fact that the NVMe emulation
is already exist, so we will keep the NVMe emulation which based on
libvfio-user directly as it is.
Change-Id: Ib0ead6c6118fa62308355fe432003dd928a2fae9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12597
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Identify and properly handle conventional zones (in smr drives) by using
zone type and WP state. Bdevs supporting zoned devices(like uring, nvme and
vbdev_zone_block) now update the zone type information. As a result, the
fio plugin now uses this info instead of hard coding the zone type.
Also adds new WP state(ZONE_STATE_NOT_WP) for handling zones w/o WP.
Signed-off-by: Indraneel M <Indraneel.Mukherjee@wdc.com>
Change-Id: If031e0742d68c55c35e95ddc33d478939bbd52fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14572
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
By doing the registration immediately upon mapping the BAR instead of
when the memory is inserted into the spdk_mem_map, we're able to
register BARs that are not 2MB multiples in size and alignment. The SPDK
API for registering a BAR already returns the physical/io address in the
map call, and it can be used directly without a call to
spdk_mem_register().
If the user does elect to later register the BAR using
spdk_mem_register(), we attempt to insert the 2MB aligned segments we
can into the spdk_mem_map. Users may still need to register memory for a
few reasons, such as making spdk_vtophys() work, or for setting up the
BAR as a target for RDMA. These cases still require 2MB aligned and
sized segments.
Change-Id: I395ae8803ec4bf22703f6f76db54200949e82532
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14017
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This is first commit that should go into latest SPDK
after the code freeze for SPDK 22.09.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib972b328d1bb3fbab0da65a55c188bfcf1661798
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14632
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Per spec to assure correct operation of IAA decompression.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I745c5ecc09d220017a8da42b52f4ff7caa5e748c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14660
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>