Nvmf/vfio-user uses this API to map NVMe command sent from
VM from Guest Physical Address to Host Virtual Address, so
now we moved this API from the nvme library to nvmf/vfio-user
as an internal API.
UT code will be added back in coming patch.
Change-Id: I54817fc9811ccd9ddd97b3aa6762a2fce4bbdda6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Just remove this function pointer and add a new one,i.e.,
dump_sw_error.
Because this function pointer is only used to
read a sw err info. We can hide it in the detailed
idxd implemenation.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I42fe2220dae85df307b5af64e37acfd7f748915b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8707
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Latest nvme-cli (>= 1.13) fails to issue commands towards SPDK's cuse
ctrl device, e.g.:
$ nvme get-feature /dev/spdk/nvme0 -f 1 -s 1 -l 100
nvme_cuse.c: 654:cuse_ctrlr_ioctl: *ERROR*: Unsupported IOCTL 0x4E40.
get-namespace-id: Invalid argument
The reason is because nvme-cli now also sends NVME_IOCTL_ID to the
target device to determine if it's indeed a controller or a ns. In
case kernel returns ENOTTY then nvme-cli considers the device to be
a controller. Since cuse_ctrlr_ioctl() returns EINVAL in such a case
the nvme-cli fails.
To avoid this simply replace EINVAL with ENOTTY for the ioctls that
may be not supported by ctrl or ns device.
nvme-cli commit in question:
fa2b91da74
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: I29003864bc2a5c1a8906d6d01beba3d6f4e31b0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8531
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Now that we've deprecated the RPCs for a release, we can remove the whole
library.
Change-Id: I0f1a357fcfb3404efac39aa021928841c2f22ff1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Bit 1 in the CMIC of the Identify Controller Data Structure specifies
if the NVM subsystem may have multiple controllers or not.
However, multi_host indicated a particular use case such that the NVM
subsystem is used by multiple hosts.
multi_ctrlr will be more appropriate.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0246096a5cc44721aeff3ff6f96473a2abe11964
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8719
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For cases where cpumask for a thread was not set,
all bits were turned on for whole length of cpuset structure.
This resulted in JSON RPC reponses with way too long cpumask
for what is useful.
Now the response is limited to the applications core mask,
as that makes sense so long as number of cores cannot change.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib5cf271d3b219ba679f1abe498516796693a87dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8288
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The round-robin logic is no longer necessary to spread
the threads around the cores. Starting from core other
than first is even counter-productive to bunching up
threads.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5fcee2bacc2d0b4af26336caf381ed954814d731
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8085
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Before this patch _find_optimal_core() returned
1) any core that could fit the thread
2) if current core was over the limit, the least busy core
3) current core if no better candidate was found
Combined with _get_next_target_core() round-robining
the first core to consider, resulted in threads being
unnecessarily spread over the cores.
This patch only places threads on lower lcore id,
or when current core is over limit then any core that can fit it.
Next patch will remove round-robin logic to always start with
lowest lcore id.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I54e373d3ca02a5633607d22978305baa1142f8bd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8112
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Before this patch the idle time of a core was increased
by the amount of busy time of thread that was moved out.
No assumption was made as to how the remaining threads,
would behave during next scheduling period.
This approach is fine, as over multiple scheduling periods
we'd arrive at a point where threads could do no more work
or all cores would be busy.
Yet this requires multiple scheduling periods to sort out
the threads.
Later in the series core_load will be used to determine,
when to start moving threads out of the core. So changing
this assumption will allow for faster responses to thread load,
at cost of sometimes spreading threads too much briefly.
With this patch, we are assuming that threads remaining
on the core will do proportionally the same amount of work
during next scheduling period.
See an example illustrating the change:
Before moving Thread1
Thread1 Busy 80 Idle 20 Load 80%
Thread2 Busy 60 Idle 40 Load 60%
Core Busy 140 Idle 60 Load 70%
After moving Thread1 out (original code)
Core Busy 140-80=60 Idle 60+80=140 Load 30%
After moving Thread1 out (this patch)
Core Busy 140-80=60 Idle 60-20=40 Load 60%
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1f347983449b2fde476dab360c4df689965ca3ea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8279
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
In cases when all cores are already doing too much work
to fit a thread, active threads should still be balanced
over all cores.
When current core is overloaded, place the thread
on another that is less busy.
The core limit is set to 95% to catch only ones that are
fully busy.
Decreasing that value would make spreading out the threads
move aggressive.
Changed thread load in one of the unit tests to reflect the
95% limit.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3b3bc5f7fbd22725441fa811d61446950000cc46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8113
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We had not held mutex while removing bdev name or alias from bdev
name tree for most cases. Fix these in this patch.
spdk_bdev_unregister() already holds g_bdev_mgr.mutex when removing
name, and so we do not need to change it.
spdk_bdev_close() had not held g_bdev_mgr.mutex. What we want to lock
is only when removing name from name tree, that is, calling
bdev_name_del() in bdev_unregister_unsafe(). However, we need to
keep hierarchical lock ordering. Hence get and free g_bdev_mgr.mutex
outside of bdev->internal.mutex.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4e2c8604e27c8603725efa9bc0bee2013eccb2ac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8527
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We had not held mutex when adding bdev name to global bdev name tree
in bdev_name_add(). Fix these in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I33813638f11da85263ec0c8849e566d247a45d43
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8524
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
If the specified name already exists in the global bdev name tree,
RB_INSERT() returns a pointer to it. Hence we do not have to call
bdev_get_by_name() when using bdev_name_add().
Hence update bdev_name_add() to return -EEXIST if RB_INSERT() returns
a non-NULL pointer, and then remove the bdev_get_by_name() calls.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2d4554ef7e5286270417def64b638b803eecfca2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8573
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The complier complains:
/usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’
output between 4 and 19 bytes into a destination of size 7
71 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
So we change the array size from 7 to 20, so it is enough to put 19 bytes
in.
Fixes #issue 2014
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I97dfbf9707d0e275382324fa7352b7a212b2aeb5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8694
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
In the nightly test, the compiler complains:
trace.c: In function ‘_spdk_trace_record’:
00:07:12.523 trace.c:144:53: error: ‘argval’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
00:07:12.523 memcpy(&buffer->data[offset], (uint8_t *)argval + argoff,
00:07:12.523 ^
00:07:12.523 trace.c:145:36: error: ‘arglen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
00:07:12.523 spdk_min(curlen, arglen - argoff));
And this patch is provided to fix such issue.
Fixes #issue 2034
Change-Id: I4c78d63bdc6a7d166990ae1d18a6abf183efdee1
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8709
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Chengqiang Meng <chengqiangx.meng@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
If NGUID is not specified with nvmf_subsystem_add_ns json-rpc request
then it is possible to expose the same NGUID as bdev nvme module
attached.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ie0ed7189e55a5abd6bc0904fc356d26f62b50549
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8628
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch adds the ability to chain multiple trace entries together to
extend the size of the argument buffer. This means that a tracepoint is
no longer limited to the size of a single entry, so it can have any
number of arguments, and their size is also not constrained to a single
entry.
Some limitations are still there: a tracepoint can have up to 5
arguments and strings are limited to 255 bytes. These constraints stem
from the definitions of tracepoint structures, which could be easily
modified to extend the limits if needed.
To record a tracepoint requiring larger buffer, aside from reserving
`spdk_trace_entry` structure, a series of `spdk_trace_entry_buffer`
structures are allocated too. Each of them acts as a buffer for the
arguments. To allow trace tools to treat the buffer structures
similarly to regular entries, they also have the `tpoint_id` and `tsc`
fields. The id is always assigned to `SPDK_TRACE_MAX_TPOINT_ID` to make
sure that a buffer is never mistaken for an entry, while the value of
`tsc` is always shared with the initial entry. This also provides a way
for the trace tools to verify if an entry is part of a chained buffer.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I51ceea6b6e57df95d4b8bd797f04edbc4936c180
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8405
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
It makes the code more readable. Additionally, to avoid partial updates
to an entry, the check for the number of arguments was moved before it's
filled in.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9ba01b1bcdc29267571badaebd4a9b34ffd7f728
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8404
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
It allows us to get rid of the `next_circual_entry` variable and will
make it easier to retrieve multiple trace entries, which will be needed
in subsequent patches.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I4666c9da518c2ac0b376e10aa73d1c58cff91f13
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8403
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Returning an error from this function is not useful - there
is nothing the caller can do with that information. So
change the return value to void. Also add ERRLOG and assert
if a transport actually returns a non-zero status, to
force the transport implementer (which must be an out-of-tree
transport) to make changes as necessary.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I402afec045265db178af821d25b99a6dbe066eab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8659
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It is not uncommon for delete_io_qpair to fail, for
example when a controller is hot removed. So even
if SQ or CQ deletion fails, continue with freeing
resources and report success back up the stack.
There is really nothing the application can do to
account for this failing anyways.
Upcoming patches will add additional checks to
ensure failing delete_io_qpair status never gets
propagated to the caller.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iac007c1eba30f7a8c4936b3ffb6c837f28ee12ae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8658
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Due to the recent changes for non block size multiples write I/O,
the data digest feature was degraded. If Linux iSCSI host enables
data digest and tries to detect LU from SPDK iSCSI target, data
mismatch error is detected and the connection is disconnected
unexpectedly.
The cause was that pdu->data_valid_bytes was not set for non-write
response PDUs which have a data segment.
iscsi_pdu_calc_data_digest() has been used only for non-write response
PDUs. Hence we did not need to change iscsi_pdu_calc_data_digest().
Restore the original implementation of iscsi_pdu_calc_data_digest().
Additionally, to avoid future degradation, rename the related
functions to iscsi_pdu_calc_partial_data_digest() and
iscsi_pdu_calc_partial_data_digest_done(), and add comments for
clarification.
This fix was verified by the reporter.
Fixes#2029.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6babcd1b56e79d3fa3cd26b2dfaad87a52788e63
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8635
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Fixes#2022
If queued aborts are present when trying to fail a ctrlr
using spdk_nvme_ctrlr_fail(), then the abort command completion
will attempt to retry one of the queued aborts.
This eventually leads to a segfault that can be avoided by not
retrying any queued aborts.
Change-Id: I897dcb8809e16af8bdd39d4381ab531e1cc29822
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8585
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The vfio-user target emulated NVMe device is treated as
PCIe NVMe SSD in the Guest VM, so when doing controller
reset or shutdown, we should abort the AERs which in the
NVMf library.
Users may switch kernel NVMe driver to SPDK NVMe driver
in the VM, without this fix, we will got "AERL exceeded"
response very frequently, because the AERs submitted by
previous driver will never be aborted in runtime.
Change-Id: I0222ed509629ccb0e98217414dd9043857105686
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8558
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When users remove kernel NVMe driver in the VM, after 120 seconds,
SPDK NVMf target will disconnect ADMIN queue pair due to association
timer timeout, and for vfio-user transport, the ADMIN queue pair
connection is associated with the socket connection, so when probing
the NVMe controller again, because there is no active ADMIN connection
for fabric register R/W commands, it will cause segment fault.
Here we set the association timeout value to 0 for vfio-user transport,
so that the ADMIN connection will not be disconnected when shutdown the
controller, the ADMIN queue pair will be disconnected when the socket
connection breaks.
Change-Id: I3613169229bae384405889653e50f581d30d7c07
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8557
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The NVMf library will set cdw0 based on specific command,
so we use it directly in vfio-user, otherwise, some NVMe
commands such as AER can't work.
Fix issue #2016.
Change-Id: Ie1a80a92c0856b61822ee51ce5d8faaaf1d463de
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8556
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Also fix one incorrect print log.
Change-Id: I3254baf4bbff4acfc0ef43f628d025931e8589ea
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8555
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
These macros are only valid for Fabric transports.
Change-Id: Ia456eebdcdab28e81226c1b3a7211fcb41b5e481
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8554
Community-CI: Mellanox Build Bot
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_bdev_register() and spdk_bdev_add_alias() had not held mutex when
adding bdev name or alias to global bdev name tree. This bug caused unexpected
error when traversing global bdev name tree.
The next patch will fix the bug. This patch is a preparation for the fix.
spdk_bdev_get_by_name() had not held mutex while traversing bdev
name tree. The major callers to spdk_bdev_get_by_name() had held mutex
when calling it. However, this was not clear.
Factor out the internal of spdk_bdev_get_by_name() into a helper
function bdev_get_by_name() and then change spdk_bdev_get_by_name()
to lock and unlock when calling bdev_get_by_name().
Then replace spdk_bdev_get_by_name() call in spdk_bdev_alias_add() and
bdev_register() by bdev_get_by_name() call.
spdk_bdev_get_by_name() call in spdk_bdev_examine() is not changed.
This is called only from JSON RPC and not related with the bug. So
we want to fix only unlocked access to global bdev name tree.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I25f07694e569eec10dba6c3c8543f6ce77412fe8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8523
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
It is better to not fail connect commands when a subsystem
is not ready. The host will not be expecting that and will
typically treat it as a catastrophic failure (i.e. it won't
retry the connect).
So instead when this situation occurs, start a poller for
the connect request. We will continue to retry processing
it until the subsystem is ready to handle it.
Fixes issue #1985.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id8835df8f0edf1e889fdd7e754e261c2a880cbb6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8571
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It is possible for a controller to get added to the
subsystem before its admin_qpair has been assigned.
We need to account for that when traversing the subsystem's
ctrlr list when determining ns and ana_changes that need
to be reported for the ctrlr.
Found while doing stress testing with connects and
subsystem ns add/remove.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie54dc6ac202faeaeace054e6599f2dea2f30211e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8570
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
After correct trstring initialization, it is
overwritten with trstring value of the current
probe ctx. That leads to a problem when initiator
connects to a sbusystem with listeners of different
transport types (e.g. TCP and RDMA). If probe_ctx has
TCP type, than discovery probe initialized probe trid
with trtype=RDMA and trstring=TCP. As results, SPDK
creates TCP controller with trtype=RDMA and we hit
assert in nvme_tcp_qpair function.
Change-Id: I9355450c40c58fa55b016220703f6f7ae36b2571
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8464
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Pollers are supposed to return SPDK_POLLER_{BUSY,IDLE}.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I92bd184aaba9e3efb730b68a6024ebc9757ffd8b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8559
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If we continuous setup and teardown cuse session, It will teardown
uninitialized cuse session and cause segment fault, New function
cuse_session_create will do the session create operation and under
g_cuse_mtx to avoid this issue.
Signed-off-by: Weifeng Su <suweifeng1@huawei.com>
Change-Id: I2b32e81c0990ede00eea6d4ed3a7e44d534d4df3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8231
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This update will allow us to use spdk_nvme_detach_async() and
spdk_nvme_detach_poll_async() easier to aggregate multiple detachments.
Previously, we could do:
spdk_nvme_detach_async()
spdk_nvme_detach_async()
spdk_nvme_detach_async()
and then started doing spdk_nvme_detach_poll_async().
Hence aggregating multiple detachments is already supported.
After this patch, the following sequence is possible:
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_async() = 0
spdk_nvme_detach_async() = 0
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = -EAGAIN
spdk_nvme_detach_poll_async() = 0
The actual changes is to remove the variable polling_started from
struct spdk_nvme_detach_ctx because it is not necessary anymore.
Clarify this change via updating the header file and CHANGELOG.
Verify this change by unit test.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iebdf6c27c5304a2097b7084c315ccc99634ffa1e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8468
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add a new function spdk_nvme_detach_poll() to simplify a common
use case to continue polling until all detachments complete.
Then use the function for the common use case throughout.
Besides, usage by simple_copy application was not correct, and
fix it in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic14711cd8478bf221c0fe375301e77b395b37f26
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8509
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The function comment was referring to a non-existent caller; instead, expand
with a little more detail on the path taken for new QPs.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I42478194f3cfc18a6ff6c434964630ac42866f1d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8534
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When the property is 8 bytes but the host only requested
4, we need to mask and only return the bytes requested
by the host. Wait to do the DEBUGLOG until after
that has happened.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8f476a47e9fd07bf652fd64f3b1c17d650374167
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8506
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Interpret bare --with-dpdk opt as user's request to find installed
(provided by the distro) DPDK's libs|include files and use them during
the build.
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: I9da99671b95af0121194b3a6d53636b0ded71f1b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8348
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
Reviewed-by: <tomasz.rochumski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Was using "dst" in some cases and "crc_dst" in others for crc32c
related calls. Update them to always use crc_dst
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Icf200f1734c64c29881f23b02b8d12bad81b3ca0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8186
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Recent work identified race conditions having to do with the
dynamic flow control mechanism for the idxd engine. In order
to both address the issue and simplify the code a new scheme
is now in place. Essentially every DSA device will be allowed
to accomodate 8 channels and each channel will get a fixed 1/8
the number of work queue entries regardless of how many
channels there are. Assignment of channels to devices is round
robin and if/when no more channels can be accommodated the get
channel request will fail.
The performance tests also revealed another issue that was
masked before, it's a one-line so is in this patch for convenience.
In the idxd poller we limit the number of completions allowed
during one run to avoid the poller thread from starving other
threads since as operations complete on this thread they are
immediately replaced up to the limit for the channel.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I913e809a934b562feb495815a9b9c605d622285c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8171
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In spdk_idxd_configure_chan(), if memory allocation fails in
TAILQ_FOREACH() {} code range, we will goto err_user_comp and
err_user_desc tag, in which we donot free chan->completions
and confused batch->user_completions with chan->completions.
Memleak problem and double free problem may occurs.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I0e588a35184d97cab0ea6b6c013ca8b3342f940a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8432
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
When a qpair is destroyed and the qpair is the last,
_nvmf_ctrlr_free_from_qpair() (in lib/nvmf/nvmf.c) sends two messages,
one is for _nvmf_ctrlr_destruct() and another is for
_nvmf_transport_qpair_fini().
We do not know which of two completes earlier.
_nvmf_ctrlr_destruct() frees the qpair->ctrlr in the end.
On the other hand, _nvmf_ctrlr_free_from_qpair() calls
spdk_nvmf_poll_group_remove() in the end, and spdk_nvmf_poll_group_remove()
accesses the qpair->ctrlr to free queued requests to the qpair.
Before one recent change, spdk_nvmf_poll_group_remove() had been called
before _nvmf_ctrlr_free_from_qpair() was called.
Hence extrace the operation to free queued requests from
spdk_nvmf_poll_group_remove() and inline it into _nvmf_qpair_destroy().
Fixes one showstopper error to investigate the issue reported in #1819.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I29c43ff7b289fc77a5de9c33e0266301c412e208
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Previously core load was only considered for main lcore.
Other cores were used based on cpumask only.
Once an active thread was placed on core it remained there
until idle. If _get_next_target_core() looped around,
the core might receive another active thread.
This patch makes the core load matter for placement of any thread.
As of this patch if no core can fit a thread it will remain there.
Later in the series least busy core will be used to balance
threads when every core is already busy.
Modified the functional test that depended on always selecting
consecutive core, even if 'current' one fit the bill.
Later in the series the round robin logic for core selection
is removed all together.
Fixed typo in test while here.
Note: _can_core_fit_thread() intentionally does not check
core->interrupt_mode and uses tsc. That flag is only updated
at the end of balancing right now. Meanwhile tsc is updated
one first thread moved to the core, so it is no longer
considered in interrupt mode.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I95f58c94e3f5ae8a468723d1dd6e53b0e417dcc3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8069
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Idle threads are always moved to main core, there are no
other considations. Doing it as separate first pass,
allows to have the core stats be up to date for second
pass for active threads.
Core load stats will be used later in the series to determine
optimal target core for an active thread.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6a9bc11b86e954e461f7badebf3a6e4d1718f63c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8067
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will be needed when doing multiple passes over
all threads. See next patch.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I4e9c749d69314fc268cbcb9334862392100b651e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8066
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When picking a path to go down with a thread,
conditions unnecessarily piled up.
Instead do it either of two ways:
- move idle threads to main core
- find best core for active threads and move them there
There is no need to worry about cpumask of the thread,
since _find_optimal_core() will always return a core
within the cpumask.
If the found core is the same one as the current,
_move_thread() won't perform any action.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0f4782766c15c86b5db0c970cfc9547058845b2a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8065
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Refactors logic for finding the optimal core for a thread
to single function.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ifc2b09acb6f698640ce9602fec4f567eb32b79fa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6732
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Refactor all thread moves and core stats updates to single function.
At this time in series only tsc of main core was modified and
only idle tsc of main core was used. Main core would be either
the destination core or the source core. In both cases, the idle
time for main core had to be updated.
This patch generalizes this logic to always move the execution
time from source core to destination core.
As a byproduct cores besides main core have the stats updated,
which will be useful later in the series. Once core load will
be the deciding factor for choosing a core.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I57564e8b2632f919869d74e8f10b01fb3dda3be9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6658
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Now that the trace library can handle multiple arguments, there's no
point in passing 0 for tracepoints that don't have any arguments. This
patch removes all such instances. It allows us to to verify that
`spdk_trace_record()` was issued with the exact number of arguments as
specified in the definition of the tracepoint.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Idbdb6f5111bd6175e145a12c1f0c095b62d744a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8125
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Replaced calls to `spdk_trace_record_tsc(spdk_get_ticks(), ...)` with
`spdk_trace_record(...)`, which does the same thing but is more consise.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib96e0bc0225490dadf857e1ddd2a3ecbf71e98c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Now that each tracepoint can have more than one argument, we cannot pad
the missing ones, as it would take too much space. Therefore, we put
them at the end of a line and simply skip the missing ones.
Additionally, since empty arguments are no longer padded, this patch
stops recording arguments with names consisting of an empty string
(containing just '\0').
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5199a3219a31d6afd3178324a4f48563b84e6149
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7958
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Now that `spdk_trace_record` receives variadic arguments, we no longer
have to pass strings as uint64_t, but can pass them directly as
pointers. That also means that the recorded strings can be longer than
8B (up to 40B).
This patch changes the blobfs code to pass the filenames as strings and
gets rid of the code that converted them to uint64_t.
Additionally, the maximum length of string arguments printed by
`app/trace/trace` has been extended to 16 and they're also padded to 16
characters, to better align with other argument types.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibe94452bf1b27eba2b15ca8608d0c3b55c2db360
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7957
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch allows tracepoint to record a variable number of arugments.
An additional function has been added,
`spdk_trace_register_description_ext()`, which allows the user to
register definitions for tracepoints specifying all the arugments that
they accept. Users can also call `spdk_trace_register_description()` to
register tpoints with a single argument (or none).
Currently, all of the tracepoint arguments need to be passed as
uint64_t.
The trace record functions use variable arguments and rely on tracepoint
description to know the order and the format of the arguments passed.
That means that the user needs to take care that they're always in sync.
Moreover, this patch extends the tracepoint entry size from 32B to 64B,
meaning that there are 40B that can be utilized for passing arguments,
which in turn means that there can be up to 5 arguments per tracepoint.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9993eabb2663078052439320e6d2f6ae607a47ff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7956
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Move the definition of structure spdk_io_channel into
lib/thread/thread_internal.h, so we don't have to update SO_VER for
other libraries in future when we need to change the internal details on
the structure.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I3d2ca7a8737972e0b33ce92e464da42c48f89dec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8189
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If we're not a DEBUG build, vfu_setup_log() was effectively forcing a
libvfio-user logging level of LOG_ERR. Instead, let the log handler decide what
to report, so we can respect the SPDK levels.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ib3ad62589f495a377885f7deabaf02b428e83d30
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8452
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Commit b7cc4dd added support multiple AERs, but didn't
remove the code that hardcodes aerl=0 for non-discovery
controllers. So even though the target now supports
multiple AERs, we never indicate that for non-discovery
controllers.
The spec also recommends that implementations support
a minimum of 4 AERs - so the current behavior is not
recommended.
It seems that at least on Windows (when testing with
vfio-user transport) we see the limit get exceeded
which results in ERRLOGs. Let's keep the ERRLOG there
for now, assuming that once we report we support 4
AERs that Windows won't try to send more than that.
Fixes issue #2000.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6a07a6f37aaa6e531ae2cf1e1c46da036b00785b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8488
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We cannot control what the host may send to the target.
For example, we have empirical evidence that Windows
will send vendor-specific IDs for features and log pages
(when testing with the vfio-user target transport).
So let's change the ERRLOGs in these cases to DEBUGLOGs.
Fixes issues #2004, #2007, #2008.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8d5b92fc5e33d698af246f2f1c34f7cf51e6488a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8487
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
1. Update with latest vfio-user specification changes.
2. The new libvfio-user will not expose dma_sg_t data structure
any more, SPDK should use pointer and allocate memory for it.
Change-Id: I619b0c0828cbe3b050c628bff4c4ce7ee840510f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8377
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Community-CI: Mellanox Build Bot
When creating queue pairs, the original code uses a stack
queue variable and copy it to queue pair in insert_queue
function, the coming changes in libvfio-user doesn't expose
dma_sg_t data structure any more, we need to change it to
a pointer and allocate memory for it, so here we eliminate
insert_queue function as a preparation.
Change-Id: Iee94029d24bc8882ec169665e229e6cbc11564c0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8376
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Thread is private data of spdk_io_channel, bdev should use
spdk_io_channel_get_thread() to access it. This prepares for the upcoming
change to make the definition of struct spdk_io_channel private.
Change-Id: I643c8d677e22f6d8dde2faf91bb2711d3f5d81b8
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8426
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
in-capsule data over 4KiB when using the Linux initiator.
This is fixed in the latest kernel. See
https://lists.infradead.org/pipermail/linux-nvme/2021-May/025641.htmlFixes#1823
Change-Id: Ie383ea774ee31ef8fe255119095b21603483c33f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8424
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In blob_load_cpl(), spdk_realloc() is called to realloc
memory of ctx->pages. If spdk_realloc() return NULL,
the ctx->pages is set to NULL without being freed,
and then a memleak problem occurs.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: Idf21b690e89beab0245ba57a5de66a4f506d54fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8308
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
If the transport returns error when polling for
completions, it gets to a uint32_t and we end up
trying to resubmit all of the requests that are
currently queued. But that's not correct - if
the transport returns an error we shouldn't be
trying to resubmit requests at all.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9198e3e2d71875cc1e46e0ac928338bb983487f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8395
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
nvmf_ctrlr_get_log_page used req->data to store the log page result.
While the req->data only contains the first iov, if req->iovcnt is
larger than 1, the req->data may not hold the complete log page; and
even worse, the log page result may be written to invalid address and
cause memory corruption.
Change-Id: Ie6415a6bd2327419fe4b32f21ac814fd827c9e95
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7970
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Currently, the SPDK "core_mask" environment option only supports setting either
"-l" or "-c". Allow applications to specify more complicated options by sniffing
for a leading "-", and passing that string through unchanged. This allows, for
example, --lcores to be used as described here:
https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I38cc54bfcd356f3176cde7848e592525f9231e3d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7933
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The common bdev layer will split large WRITE ZEROES ranges into
multiple children requests based on the backend device's setting,
it will try to split up to 8 children requests at a time to avoid
flood requests.
Also add UT to cover different cases.
Change-Id: Id9505fbe1c297412ef97b1f73587b22bc43f770e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7875
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This mutex is not used anywhere. After removing mutex from struct
spdk_scsi_globals, struct spdk_scsi_globals is empty. Hence then
remove struct spdk_scsi_globals. We can create struct spdk_scsi_globals
again if it becomes necessary.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I749ae43f7735a7c9383d090eae2093bb52607f17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8192
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Add three parameters, pdu_pool_size, immediate_data_pool_size, and
data_out_pool_size to the RPC iscsi_set_options to run iSCSI target
with little memory.
For some use cases, we want to keep the max number of connections,
but simultaneously we want to reduce the pool size and let I/Os wait
until resource is provided.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I74dc785310b1d985f3e338c1e13fba3a3840d113
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8191
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In nvmf_vfio_user_listen(), fd should be closed before
set it to endpoint->fd, otherwise, the fd leakage probem
occurs.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I3fabc65d2764926e5873475962e4362e46eb37e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8309
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In spdk_idxd_get_channel(), if chan->batch_base is allocated
faild, we should free chan before returning NULL.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: Ia652c334aead592429c1171da73d67160879686d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8301
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In ioat_channel_start(), if spdk_vtophys(ioat->comp_update) returns
SPDK_VTOPHYS_ERROR, spdk_free is called to free ioat->comp_update,
and ioat->comp_update is not set to NULL. However, the caller
ioat_attach() will also call ioat_channel_destruct() to free
ioat->comp_update, then double-free problem occurs.
Here, we will not free ioat->comp_update in ioat_channel_start(),
ioat_channel_destruct() will do that.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I3be19a3feec5c2188051ee67820bfd1e61de9b48
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8300
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
POSIX defines %z for printing size_t values in a portable way.
Replace a reference to %ld to remove the assumption about
the type of size_t.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: I2186aa5e7072f565ea75de935e22c2c23acf1a1a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8341
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
In blob_serialize_add_page(), *pages is set to spdk_realloc(*pages).
If spdk_realloc() returns NULL, the *pages pointer will be
overridden, whose memory will leak.
Here, we introduce a new var (tmp_pages) for checking the return
value of spdk_realloc(*pages).
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: Ib2ead3f3b5d5e44688d1f0568816f483aa9e101f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8307
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
In spdk_fs_create_file_async(), file->name is set to strdup(name).
We should check whether file->name is equal to NULL.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I2219cc353eb4711290aee2599505f57af9088bb2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8302
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If iscsi initialization fails (due to a memory allocation
failure for example), we may not even get to the point
where the g_iscsi global is registered as an io_device.
So then when we tear down the iscsi library using
spdk_iscsi_fini(), we need to make sure we don't
try to unregister g_iscsi if it wasn't registered.
For now, just use the g_init_thread global to make this
determination - it's set just after we register the
io_device.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic9443564ef67b9c0df0fce47a346f4608749c306
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8351
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Because we use spdk_dma_malloc, then it does not init
the the contents in the memory.
Fixes#1996
Change-Id: Ieef411f6ae5114de9f732df6096e0bb123efb7e0
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8374
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
In spdk_nvmf_subsystem_add_ns_ext(), ns->ptpl_file is set to strdup(),
which may return NULL. We should deal with it.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: If95102fe9d6d789b8ba9e846c4d7f4e22e48a93c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8305
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nbd will be closed in nbd poller function asychronously.
Unify the stop process of HARDDISC and SOFTDISC in same place.
Prepare for following patch.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: Ida33ff6d081e68290cfa393c0c47fe7af545958b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8036
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
For receving the pdu, we add the crc32c offloading by Accel framework.
Because the size of to caculate the header digest size is too small, so
we do not offload the header digest.
Change-Id: If2c827a3a4e9d19f0b6d5aa8d89b0823925bd860
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7734
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
siginfo_t is a GNU extension. SPDK (and DPDK) have
direct dependencies on GNU extensions, but it's a bit
nicer if external modules don't also need to define
_GNU_SOURCE. Currently siginfo_t parameter in the
spdk_pci_error_handler is the only thing that violates
this.
Note that DPDK also supports registering sigbus handlers,
but they take the failing address as a parameter instead
of the full siginfo_t structure. Let's adopt the same
for SPDK.
While here, remove an extra semicolon that was just after
the virtio sigbus handler function signature that was
updated in this patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07faf11a3ac3589c637cb2196581c102286b1e68
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8333
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
At this time only main lcore frequency is changed,
depending on its load either up or down.
Exception is when at least a single busy thread is present
on non-g_main_lcore. Then the main lcore frequency is set
to the maximum possible.
This patch moves when that is determined, from 'moving'
logic to one that sets reactors to interrupt mode.
If at least one thread is present on non-g_main_lcore,
it has to be busy. Otherwise it would be placed on main lcore.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I2900598afe53fb609e1f06a60d5245f74511e1c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8050
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This field was only used to keep track of number of threads
that will be present on a core after scheduler moves.
It was used only internally within scheduler_dynamic.
Event framework has no need to keep such field in core_info.
Instead added field in cores_stats internal to scheduler_dynamic.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3ce74d4a25eac81e58da8705a1c4553730fc1e57
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8049
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Added core_stats structure that will hold stats modified
during balancing.
Further patches will modify the values in this structure,
to for example judge how much execution time a core
has left.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib8e611e36642c4543b5cb43bc2695c613d38f0fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6657
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch expands spdk_scheduler_core_info with two new
fields that will contain core stats only from last scheduling
period.
This will make sure that schedulers do not have to keep track
and calculate this value on their own.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3aa7dfa6a60c1d14d95a0e684e84c2e83f0a4496
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8048
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
(a5ad0f80) lib/event: update reactor tsc_last going poll mode
Patch above updated the tsc_last at the very end of changing
interrupt mode of the reactor.
The flow for turning from interrupt mode to poll mode is
first to send an event to the target lcore, then to iterate
over all reactors updating notify_cpuset on each.
Previous patch updated the tsc_last after notify_cpuset was
updated, meanwhile the threads could already been put on it.
This patch moves it immidietly to the point of changing
the in_interrupt state.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6aea252016f4706369b8b597b765593bc6edca3b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8111
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Renamed core_busy_tsc and core_idle_tsc to better
describe that they contain particular core stats for
its whole lifetime.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6f16b2b0a162aad8fbaf18f549fc50a2372b920b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8047
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
There is no need to keep new_lcore field.
lcore value is enough to determine the target core.
Meanwhile _threads_reschedule() can see if the target
core matches the one from core_info.
Removed _spdk_lw_thread_set_core() since it did not
serve much purpose.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I82c7cfebf1107b4a55b2af9b891052084a788907
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8046
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
lw_thread->lcore was set during gather_metrics,
rather than just after the thread reschedule.
This patch just moves it to the right place.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0477830902f68102e4e4f0ffc9359bd004a8ad42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7961
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
So far the schedulers had to calculate the diff of
current_stats - last_stats on their own to get tsc
from last scheduling period.
Renamed the current_stats to total_stats, but kept the meaning
as stats describing tsc for lifetime of a thread.
Instead change the meaning of the last_stats to describe
the tsc of only last scheduling period and change its name
to current_stats.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1a165ff7c1afe659b432c3127a351a96878d1f3d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7843
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Using the CMB for SQs is not a standard use case.
Performance can vary widely when using CMB for SQs
and is typically not the configuration used for
benchmarking.
So let's change the default value here to 'false',
users can still opt-in by setting this option to
true in the spdk_nvme_ctrlr_opts structure prior
to attach.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iab746ba777b04152ffb92fea2a2bb923a0a0bf21
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8227
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
QEMU 6.0 by default uses a RedHat dev/vendor ID rather
than the Intel one that has always been used to date.
We need the NVME_QUIRK_MAXIMUM_PCI_ACCESS_WIDTH quirk
so that we do not use wide instructions to copy SQEs
to a virtualized CMB, since QEMU does not support
that.
The NVME_INTEL_QUIRK_NO_LOG_PAGES quirk is only needed
for devices with SPDK_PCI_VID_INTEL, so we do not need
to carry this one over to the new REDHAT entry.
Fixes issue #1986.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3d339b3525e7c6ceb792eb9d143e7a922c19344d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8226
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The NVMe driver layer will clear this log, so we don't
need to send another one in the aer callback.
Here we change the logic to compare with previous NS
state, if the NS state is same it will fail the test.
Change-Id: I6d80cb6a5f6d5eab92b8ccac601a23c19cea4003
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8175
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The arguments of a tracepoint are formatted when they're printed now, so
there's no need to append ":" or pad it with spaces.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I74f5568f1982dacc079e3b80bd19a9cd740b48ce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7955
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The traces record calls to spdk_(get|put)_io_channel() and saves the
reference count of the IO channel and its context. The context, instead
of an IO channel pointer, was selected because the same pointer is often
used in other traces (e.g. nvmf's poll group), so it makes it possible
to match these traces together.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I15fe982a89685d8f6e23d406d6d48f5c2d9d604b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7232
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Both values should provide similar information, while the qpair can also
be matched to the traces from lib/nvmf allowing the user to track the
qpairs across these modules.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iba9abdd3f41b93100c0403b1c90fc4549d39189e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7159
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The traces are tracking the lifecycle of a poll group: creating it,
adding and disconnecting qpairs, and finally destroying the group.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I075b7f24d14b8fbb42bb18ddd70a668a8bace118
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7158
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Upcoming patches will add accel_fw support for batching this cmd
and then vectored versions later along with accel_perf to exercise
them.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I59f577f4365fbf063d7419cc6052e10c998b58bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8143
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Was using reserved field. Similar fix to what was done earlier
for direct submission of crc32c operation.
fixes#1972
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ie9867e72f60c7f38aa1af0273a036f34580ed4c6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8145
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
To match regular sumission prep function and allow caller to
modify both descriptor and completion structures. Also allows
for more accurate error reporting. Needed for upcoming patch to
fix batch CRC submissions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I4b7b5b24e2f54149b513d4b23ba32f3802aff3e5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8144
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Upcoming patches will add support for the idxd engine and
the accel_perf tool. Also following will come vectored support and
batch versions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I7223517a844525ad52ed49d65627b04c3cd9fe7c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8141
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Upcoming patches will add support to the accel fw, the idxd engine and
the accel_perf tool. Also following will come vectored support and
batch versions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ie0fbd4b8da9f727426000898c0b511587adac65b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8139
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
_nbd_fini will make all NBDs into closing state.
remove _nbd_async, beasue it will call asynchronous error.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: Ifb873b7f079b735983bdf20c2df652be0a21919f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8035
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
And for some internal functions we need to pass controller
parameter so that we can do vtophys based on transport type.
Change-Id: I3ca4fa162ec9305f62b295ba21f7474c21edfe52
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8031
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We need to wait to process this quirk until after we
have a valid CAP register value. Before this fix,
controllers with this quirk would get their io_queue_size
always capped at 2 (min io queue size) because CAP hadn't
actually been read yet.
Fixes: f5ba8a5e (nvme: add NVME_CTRLR_STATE_READ_CAP)
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4df87b5dfb0faa21db5b4cf6fc667d80621d1691
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8211
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The inline function can also be used in the coming submit request
function.
Change-Id: If4a5511001e6586dbce0978298beddc537f54d8b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8173
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The PCIE and VFIOUSER both can use this function, the only difference
is VFIOUSER should use IOVA=VA to do the vtophys translation, so
here we will move the function to the common PCIe layer as the first
step.
Change-Id: I699edb67a00a2fa534072fc02ac2dd4a27aba8f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8030
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Each time the following file
"/sys/kernel/config/nvmet/subsystems/nqn_name/namespaces/ns_id/enable"
on the target side was changed, the SPDK initiator should receive an
async event (type: SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE, info:
SPDK_NVME_ASYNC_EVENT_NS_ATTR_CHANGED).
But actually not.
Since for SPDK, when target sent the non-first event, the condition
"nvmet_aen_bit_disabled(ctrl, NVME_AEN_BIT_NS_ATTR)" that prevents
target from sending event was matched.
This commit fix this issue by issuing a get_log_page cmd for each async
event received, just as the kernel initiator does.
Fixes#1825.
Signed-off-by: tyler.sun <tyler.sun@dell.com>
Change-Id: I2973470a81893456ca12e86ac390ea1de0eed62c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7107
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Also add scripts/bpf/nvmf.bt to enable and log these
probes.
This patch also adds a script that can generate
a bpftrace script snippet with string maps for
needed enumerations (currently nvmf_tgt_state and
spdk_nvmf_subsystem_state). This allows us to
dynamically generate this from the source code, and
can be extended for other enums we may want to
add in the future.
Thanks to Michal Berger for converting my original
gen_enums.py script into gen_enums.sh!
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: Iff34a6218aef40055ac14932eea5fc00e1c8bcf5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7194
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
This reverts commit 2246a93718.
We are seeing a lot of failure on io_device lookup in the test
pool. These only showed up after this patch was merged and sees
the most likely culprit. Reverting this patch for now while we
continue debug.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2ab098319dfae3a5356eb4fe0dbf9f4af2d2eea5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8199
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use the macros for red black tree provided by Free BSD to speed up
io_device lookup.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: Ib3bd382bbeb610503194e7d7bfd569f60a0d0121
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7894
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The original function will disconnect queue pairs first and then free
controller memory finally, so rename it to vfio_user_destroy_ctrlr().
Change-Id: Idc235e4186bd4164be712fc9d4cda4991efc6248
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7624
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
The coming destroy_ctrlr() function will disconnect
queue pairs and free controller at last, so here
rename the original destroy_ctrlr() to free_ctrlr().
Change-Id: I527b2742142d60b0383be5a12391c77dd50d47a7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7623
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
The original destroy_qp() only release the queue pair
related memory, and free_qp() will be called inside
destroy_ctrlr() function, so also remove one duplicated
line here.
Change-Id: I2a06a6704b514361685068acda4e65ed5d502f0d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7622
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
The controller concept in NVMf is like a session, for any
new connection in nvmf_vfio_user_accept() with the endpoint,
we treat it as a new controller, we don't need the `ready`
field in controller to indicate the connection state, we
need the connection state in endpoint, so here just use
endpoint->ctrlr point to indicate the socket connection
is valid or not.
Change-Id: I588dbba7973cb61a1d79d81324a43e052f7dafb0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7621
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Adding iov to the spdk_bdev_zcopy_start function enable spdk_bdev_zcopy_start to
be used by transport layers as the iov is owned by the transport command
Signed-off-by: matthewb <matthew.burbridge@hpe.com>
Change-Id: I6d2be7f49566048bf25b7711ada8d2fb49fea6ee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6816
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
ZCOPY emulation is not required. Modules can check if the bdev module
supports ZCOPY. If not supported the module uses the existing
READ and WRITE operations.
Signed-off-by: matthewb <matthew.burbridge@hpe.com>
Change-Id: Idac0a4d27a79a6c7e567c420e15637e826c347c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6815
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
If a reconnect fails, we restore the original
transport_failure_reason after we're done with
the failed reconnect. Save the original reason
in the qpair itself rather than a local variable,
to facilitate upcoming changes where connect will
be asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I20ff43fc687a379aa5c930e17cf3ff8d730320be
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8116
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Previously we were only checking trtype==PCIE to
determine whether a controller was fabrics. This
skipped the vfio-user case. So use the new
spdk_nvme_transport_id_is_fabrics() API instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I81f26853f44b1c47522ce6354e5aa4a905796bd0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8089
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Purpose: Use the new function in order to reduce duplicated code.
Change-Id: Ie848c7586575b3f0bb617d7e767cf459b43d4783
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8174
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Nvme-cli submits a RESCAN IOCTL after a format command to
update any information that may have changed during the
format, such as LBA Format. This patch adds support
for RESCAN by executing nvme_ctrlr_update_namespaces to
update the controller information.
Fixes: #1964
Change-Id: I9f03e00a7f39339947ff02390f69ce806e1cfa0e
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8146
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Rather than to rely on schedulers to access and modify
last_stats values over multiple scheduling periods, move that
operation to event framework.
Providing this to the schedulers in generic manner is better
than enforcing that each scheduler has to keep track of this
data on their own.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Icaf3b4af80d86fafaddf328fd230db9743d21ab5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7971
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
A Security Receive command with the Security Protocol field cleared to
00h shall return information about the security protocols supported by
the controller, so we can check the TCG security protocol is supported
or not before sending it.
Fix issue #1961.
Change-Id: Id061defe45db981b276e2794fd0b59f8db70b7f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8083
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Perf improvement, directs DSA to write to cache as opposed to
mem.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I0d6ba157af8f1b54f8aae3b8e54a6f7754e4a9de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8169
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We already keep a list of outstanding completion locations to
poll and were previously polling all of them. New ones are
added at the tail and we poll the oldest first from the head
so if we break when we find a slot that hasn't completed we
can get more work done while the HW finishes. This is a proven
performance improvement in limited testing.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Icc15041605586f9a31435d447d253c381c00b1f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8161
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
1. struct pxdcap.per is changed to struct pxdcap.rer
Which matches the name in the nvme spec.
2. use new API return value.
3. update specification changes.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Change-Id: Ida421c4cffd1c65d550e83011ab123b321ea9dff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8088
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
vfio-user transport `req_complete` callback will zero the internal
NVMe command and response fields, the common NVMf library should
not use them after the callback, so here we use stack variables
to save them before the `req_complete` callback.
Fix issue #1965.
Change-Id: Iff2342b6095d9496cdf112d657a0a99ce1fb5d12
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8129
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Initialize ANA state of each namespace to optimized regardless of
whether ANA is supported or not. This will simplify the code to get
the optimal I/O path because we do not have to care if the namespace
supports ANA.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I24dfe08674af398671de6528b884e9d82409eeae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7890
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The purpose is to reduce the duplicated code in nvmf and iscsi
layer.
Change-Id: I7e96f0d5bb1ba4b81378addca3cdd929056384e9
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8132
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
We map the SPDK_NVME_TRANSPORT_* values directly to
the NVMe-oF trtype values. Since PCIe isn't
Fabrics, we choose 256 which is outside of the
8-bit trtype range of values.
So we can just check if trtype >= 256 to determine
if the trid is for fabrics or not. This is
preferable to checking PCIE || VFIOUSER in case
additional non-fabrics transport types are added
in the future.
I considered taking a trid as the parameter instead,
but went this route since it is consistent with
the existing spdk_nvme_ctrlr_is_discovery().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib62ff4d30549b2324486c81f2dce67f0f1741e9b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8077
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Connect the adminq as part of controller initialization
instead of controller construction.
We never actually 'connected' the adminq for
PCIe or vfio-user transports, since its a nop.
But their connect_qpair transport ops function
is also a nop for the adminq, so it's fine to
generically connect the adminq across all transports.
Note that we cannot read registers (cc or csts)
during controller initialization now until after
the adminq has been connected since reading fabrics
registers depends on a connected adminq. This gets
special cased for now, but eventually reading
cc and csts will need to be part of the state machine
itself to make it asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia5566d7c549d78d24b94ea253df51e697da6237f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8079
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
According to NVMe-oF 1.1 spec, it is not a fatal error.
So according to Figure 126 in NVMe Base specification,
we should return "Transient Transport Error".
Change-Id: I601304ae2bb24508882fb1ec8c7e53ec587ab515
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7795
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This ensures the discovery ctrlr initialization is
done the same as normal ctrlrs. This will be
critical as we make the driver fully asynchronous.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I33c4fd7c82d241c30e7adb89abe79b8088c8776a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8090
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Read CAP (Capabilities) register as part of controller
initialization instead of controller construction.
For now, still read CAP in the pcie and vfio-user
controller construction, since they need the
drstd (doorbell stride) to construct the admin
queue.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I000fe880f2ec0d6de1d565c883d7ea0ae1ac2c81
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8078
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Read VS (Version) register as part of controller
initialization instead of controller construction.
This prepares for upcoming changes to make
controller attach fully asynchronous. Since reading
fabrics registers is an asynchronous operation, it
will be easier to read the VS register as part of
controller initialization which operates as an
asynchronous state machine.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I771386dbdf5902633e0d9f91b3b20be98f26fdc3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8076
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
We're going to be adding some new states (READ_CAP
and READ_VS) in future patches, that we want to
come before the current "INIT" state.
So we will simply make "INIT" have the same
value as this new NVME_CTRLR_STATE_CHECK_EN state
for now. That means existing code won't have to
change later once we add new states that come
before CHECK_EN.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07ca92e28ab1cd8d838cdef5c3ff36ba80a224bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8075
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
_reactor_schedule_thread() zeroes out the lw_thread on move.
To properly calculate thread stats since the move,
save them right after reschedule.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I44cc3b5907adda35b3117c2dd7268dc813d59853
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7919
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_thread keeps track of tsc from its whole lifetime,
those can be requested with spdk_thread_get_stats() at any time.
spdk_lw_thread uses stats from above and keeps track of two points in time:
- current_stats reflecting stats at the time of gather_metrics stage
- last_stats reflecting stats from previous gather_metrics stage
1)
Before this patch current_stats were duplicated in snapshot_stats.
There is no need for that so now they are removed.
2)
Removed _spdk_lw_thread_get_current_stats() since it would be copying
current_stats to current_stats, thus not perform any action.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5e5d4039cd0f7cc10ba150a3d915b90ec96589d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7842
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Whenever an event executes, it might change the currently
set thread or reset it to NULL.
To prevent it from affecting other events, set the current
thread each time an event executes.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6f1e7f8b7acab25353b4782058e87a9e01aab2c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8045
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
_get_thread_load() is function used to determine
the load of a thread based on relation of busy/idle tsc
from previous scheduling period.
In order to avoid division by 0 calculating the percentage,
we can simply exit early determining that thread was not
doing any work.
Having this check here will make sure that no matter
the changes in event framework, scheduler dynamic will work.
Removed the place that updated last_stats if they weren't
yet updated at least once (first scheduling period iteration).
In this case after change to _get_thread_load() will be the same,
as only the latest iteration will be used to calculate thread load.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I75f0f12f024675f2473a26e30596d6eb28093d46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7917
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
In struct spdk_nvmf_poll_group_stat, there are statistics of cumulative IO and
admin queue pair counts. But current qpair counts are not reflected. Use
this patch to add current admin and io qpair counts for a poll group.
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: I7d40aed8b3fb09f9d34e5b5232380d162b97882b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7969
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Eugene Kochetov <evgeniik@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Block device product name is same among same type
of the block devices, while Guest VM may use this
value to generate UUID, so here we change it to
block device name instead.
Change-Id: I58c5fb271a6a436c15520616c2065eee9c37300a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7996
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Now, vfu_setup_region() must specify the region fd offset (which is always zero
in our case).
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I10795d848a4c73ee9e1e78ea63776074401c4b17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8022
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Use the macros for red black tree provided by Free BSD to speed up bdev
name lookup in spdk_bdev_get_by_name().
In the bdev_multi_allocation test, we can get 3x ~ 5x speed up when
creating multiple bdevs for various bdev nums.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I49a2fbcccf06d4c36cbd445ce59e0b0dd4ada31d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7837
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Clean up spdk_bdev_register() to facilitate the upcoming patch which
uses RB tree to speed up bdev name lookup.
* move TAILQ_INSERT from bdev_start() into bdev_init();
* rename bdev_init() to bdev_register() and rename bdev_start_finished()
to bdev_register_finished();
* inline bdev_start() into spdk_bdev_register().
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: Idbfc800472bc8c6f9b615046e082772e9f6026e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8043
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
nvmf_get_ana_log_page used req->data to store the log page result.
While the req->data only contains the first iov, if req->iovcnt is
larger than 1, the req->data may not hold the complete log page; and
even worse, the log page result may be written to invalid address and
cause memory corruption.
The following patch will fix the same issue for other commands in
nvmf_ctrlr_get_log_page.
Fix#1946
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I495f3be05c82be5cd53609772c655c8924b9179f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7923
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Since we will reuse send_pdu for other purpose in the next
patch.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iee5166131b70a25bc13aaa847bfc9066231f31a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8028
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
For nvme/tcp connection, we use the synced manner
if the qpair is not fully connected. Thus without
the check, we will stuck here. And this patch
fixes this issue.
Change-Id: I72815bf5b4c0b31c4866bc1b9034b0e42b81d3f1
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8025
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Loading subsystems and restoring state from a JSON config file is useful
outside of the SPDK application framework, so move it to lib/init.
Change-Id: I7dd3ceace2e7b1b28eef83c91ce6a4eedc85740e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6645
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
I'm not sure whether this should go into lib/init or to lib/rpc
directly, but I've chosen lib/init for now.
This is to support applications that want to run the SPDK JSON
RPC server, but aren't using the SPDK application framework.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I79ca39aa0ca6e1a3a6905b0bf73e6cc99b086e55
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6644
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The functions to initialize the SPDK subsystems or tear them down
was previously an internal-only API. Make it public for use by
applications that aren't leverage SPDK's application framework.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I2ebfd020e6fa4c1947fa1c1a2ac509ce9b0242f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6643
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
There were some functions in the internal header that can be entirely
private to the init library. Move them over.
Also, remove the support for including the header from a C++ file
because these headers are internal to SPDK which is pure C.
Change-Id: Ic4323b2b8664e70106a57b3ca8acbc7c2efe621d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6642
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The header size is very small, which does not have too much value to
offload such calculation by hardware.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iaa82f39312df7eef3282325a33677ea41ab735ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8011
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
open_blobids holds bit array of currently open blobs,
this is a way for quicker determination than iterating
over all blobs. See patch introducing it:
(30ee8137)blob: Add a bitmask for quickly checking which blobs are open
That patch added resizes of this bit array to bs init
and bs recovery path (not shut down cleanly).
But that patch skipped over bs load from a clean shutdown.
This resulted in blob open having multiple blob pointers that
target the same blob id.
Fixes#1937
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3c42a63d168d1f5b013b449f010c5b207936045b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7998
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Community-CI: Mellanox Build Bot
Before this patch idle_tsc was sum of all idle tsc of all
threads running on a reactor.
There are cases when no threads are present on the reactor,
and _reactor_run() spins doing nothing.
To give more accurate representation of the reactors state,
the idle_tsc now adds time spent doing idle spinning.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If797b2a03507d17b07367d56d5f6c40cefbbbd49
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7900
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Disabling interrupt mode on reactor now updates the tsc_last
to current time. So any further tsc caulations will not account
for the time when reactor was in interrupt mode.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I56fb8a738eea60ee5de3b49d586f7cb228b54510
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7901
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
tsc_last value is used to update thread stats
during _reactor_run(). See:
spdk_thread_poll(thread, 0, reactor->tsc_last);
If no threads were present on the reactor,
this value got outdated and resulted in
adding time reactor spent with no threads to
stats of the first thread placed on that reactor.
This patch fixes thread stats by making sure
that argument to spdk_thread_poll() is up to date.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0c35fdba1b63b6ee19a5a2b34751090839cb2438
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7845
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
This is useful for applications even if they elect not to use the SPDK
event framework.
This doesn't shift everything in one go - just the subsystem
initialization logic. Configuration file loading also needs to move
in a separate patch later.
Change-Id: Id419df1045442d416650ed90e5ee78adfdd623d7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6641
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Get all of the hot stuff to the first cache line.
* Shrink the xfer enum to one byte (it only has 3 values).
* Pull out the dif enabled flag form the dif structure so it
can be access separately
* Rearrange the members
Change-Id: Id4a2fe90a49c055a4672642faac0028671ebfae9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7827
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
(cd83ea4a)thread: Add SPDK internal APIs spdk_thread_get_first/next_active/timed/paused_poller()
Patch above by mistake iterates over active_pollers list
for function that lists paused pollers.
Fixes#1947
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1b69d942675f34f5f046ec46feacc8d81d89f015
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7952
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
When blob persist starts, there can already be multiple
of such requests pending. It is possible to complete
a set of persists at once, if blob state after their
execution would be the same. This is the case when
persists are already pending when a particular persist
request is started.
This patch implements such mechanism by introducing
persists_to_complete queue, containing entries that
were previously queued up before starting the current
persist request. If there are any entries in this queue,
further requests are put into pending_persists.
When first request from persists_to_complete is persisted,
completions are issued for all requests on that queue at once.
If at that point there are any new entries on pending_persists,
all of them are put into persists_to_complete. Persist process is started
again with the first request from that queue.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I10063e55d6f821b1863de016d3148da6a719a422
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7643
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Was using reserved field in CRC to store the final address of where
to put the result, this not legal. Move to the completion record
and slightly re-arrange the struct to keep it at 96 bytes.
Refacorted the IO prep function so the caller can udpate both the
descriptor and completion records instead of continuing to add
parameters to the prep function for opcodes that need something
unique in the completion record. This also allowed for a minor
fix where the prep function was returning NULL when vtophys failed
which would have indicated busy as opposed to failire. Now we can
proprely fail that path.
fixes#1929
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic23bc7b68bdd5757c30b7963880677f423368e20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7735
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Update libvfio-user to the current version, updating the client for the relevant
changes.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ic64ace08ac0c7e9676f04f8d1f47a9c0388a2652
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7983
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Support ACRE (Advanced Command Retry Enable) feature. Currently set
all crdt to 0 and only select crdt[0] (crd=1) when the IO has any
error.
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: If7bc30f91f5b2d0839002dead17188a4b3a52d5d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7885
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
That prevents nvmf target from starting to destroy poll
groups prematurely
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I833f6198ef0e3083fdadf70dd3b62844c905aceb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7881
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch changes the order of IDENTIFY_ACTIVE_NS and CONSTRUCT_NS
controller states. It is required to further improve memory management
for namespaces by allocating memory only for active ones.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ie540442b1bd9e897afcbaa4319c139109dd0c515
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6503
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previous implementation allocated memory just once at the beginning of
active NS list retrieval procedure. It allocated memory for maximum
possible number of active namespaces, i.e. 'cdata.nn'.
This patch changes allocation logic. One page is allocated at the
beginning. If more is needed, reallocation is done with one more
page.
This patch also removes SPDK_MALLOC_DMA flag from allocation since we
don't do RDMA directly into this buffer.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Iaa80c4d70c54daaf71dcbf755c63a01a1d83b772
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6502
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Use the macros for red black tree provided by Free BSD to manage
timed pollers efficiently.
Allow RB_INSERT() to insert elements with duplicated keys by changing
the compare function to return 1 if two keys are equal.
Check the return code of RB_INSERT() because this is the first use case
for RB tree macros in SPDK. We did the same for RB_REMOVE() by
adding another temporary variable but we remove it from this patch
because it is not so important compared with RB_INSERT().
When a timed poller is inserted, update the cache for the closest (leftmost)
timed poller only if the tree was empty before or the closest (leftmost)
timed poller was actually changed. We do not have to use RB_MIN()
because all duplicated entries are inserted on the right side.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibe253ca8eecc10116548b5eedbcdba8fb961b88d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7722
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We already hold thehe next closest timed poller in tmp. Inlining
poller_remove_timer() into thread_poll() makes the cache update
more efficient.
After this patch, poller_remove_timer() is called only in a single case
and the case is compiled only on Linux. So add it inside of a temporary
block is much clearner. However it will be used by spdk_poller_reschedule()
in the end of this patch series. So keep
the current position.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2e6858223713eed84f5d70b160da6122edae6d03
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7910
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This enable us to optimize the cache update when RB tree is supported.
Call poller_remove_timer() after getting the next element because
as TAILQ_FOREACH_SAFE() and RB_FOREACH_SAFE() do, TAILQ_NEXT() may
not be valid after the current element is removed.
Previously, the patch had called poller_remove_timer() before getting
the next element. However, thanks to the nice testing, this bug was
found.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I18afb4412115dc1696cc568610cbe3dc618c2357
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7909
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This change is a preparation to first dequeue the closest timed poller
always when it is expired. Previously the poller_remove_timer() calls
were not consistent and difficult to follow.
spdk_poller_pause() sets poller to PAUSING even when it in RUNNING
and move it to PAUSED after returning from its context.
If spdk_poller_pause() and spdk_poller_resume() are called while poller
runs, it is moved to WAITING. Hence thread_execute_poller() and
thread_execute_timed_poller() ignore such cases.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I29340613a2ec0c3529d0886f4d81c0a0fdf8745d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7908
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This API was removed previously, so remove remaining
references in map file and unit tests.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iba2f6a5f5ba590d3996dc133c8181083a33d7405
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7963
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Currently we allocate buffers perf each SGL descriptor.
That can lead to a problem when we use NVME bdev with
PRP controller and length of the 1st SGL descriptor is
not multiple of block size, i.e. the initiator may send
PRP1 (which is SGL[0]) which end address is page aligned
while start address is not aligned. This is allowed by
the spec. But when we read such a data to a local buffer,
start of the buffer is page aligned when its end is not.
That violates PRP requirements and we can't handle such
request. However if we use contig buffer to write both
PRP1 and PRP2 (SGL[0] and SGL[1]) then we won't meet
this problem.
Some existing unit tests were updated, 1 new was added.
Fixes github issue #1853
Change-Id: Ib2d56112b7b25e235d17bbc6df8dce4dc556e12d
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7259
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
zipf is a power law probability distribution. When
applied to performance testing of block devices, it
will select blocks over the full range of LBAs, but
will more frequently select lower-numbered LBAs.
The theta parameter governs the distribution - higher
values of theta will concentrate the distribution on
a smaller number of LBAs.
Note that fio supports zipf, so adding it to SPDK
will enable our perf tools (bdevperf, nvme-perf) to
provide similar functionality.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7df129c9d61996a2070188c6cd9f1fde631ac208
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7779
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We cannot solely rely on the qpair_ctx->count reaching
0, because qpairs that are in process of being
disconnected will immediately invoke the qpair
disconnect cb.
Instead, we need to wait until the poll group
no longer has any qpairs remaining on the subsystem.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I977747d367d14a4bf60f66a1147b3d75679e5179
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7870
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
When we introduce RB tree, getting the closest timed poller is not
O(1) but O(log N). To mitigate such delay, cache the closest timed
poller into thread, and update the cache when its content is changed.
Add unit test cases for this change. They will also clarify the current
behavior of spdk_poller_unregister() and spdk_poller_pause() for
timed pollers.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibb98a54c261859a3210034038d3953e5c93ef8aa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7720
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Requests that still reside on retry queue should be
submitted to disk before shutdown.
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Change-Id: Id2d020fcaef6443d01cfd8628686e9b0f34a1cfa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6771
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch reduces admount of changes in the next patch,
no functional changes added.
The next patch will add usage of contig IO buffers for
multi SGL payload. To support it we need to pass an
offset to fill_wr_sgl function. Also in the current
version we assume that for 1 iteration we fill 1 IO
buffer, the next patch will change it and we'll need
to swtich to the next IO buffer in special case. That
can't be done easily if we use fill_wr_sge function
Change-Id: Iee8209634637697f700f8fa9fe61ead156b6d622
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7258
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The header is small enough that it likely won't ever make sense
to offload the digest computation.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib6baa201a76d769d978f498f5c65985d5ab06ffd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7766
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We should use `ns->registrants` to count the number of registered
controllers(REGCTL) for Reservation Report command, as `subsystem->ctrlrs`
only list current active sessions(controllers).
Also use the output data buffer directly so that we don't need to calloc/free
during the process of Reservation Report command.
Fix#1928
Change-Id: I650224b751a08416208b8a504b82debff31e92fd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7822
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: wanghailiang <hailiangx.e.wang@intel.com>
This is the same effort as spdk_poller.
The following patches will move the definition of struct spdk_thread and
enum spdk_thread_state from include/spdk_internal/thread.h to
lib/thread/thread.c.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7f7bdfdd7a7b1b834d16d79638a4fd2d63e9daf6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7800
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Most accesses to the struct spdk_poller outside lib/thread have been
done via functions but a few direct accesses remain.
Change these to indirect accesses by addinng a few helper functions
as SPDK internal APIs.
Add spdk_poller_get_name() to get the name of the poller.
Remove spdk_poller_state_str() and add spdk_poller_get_state_str().
Exposing enum spdk_poller_state outside lib/thread is not really
necessary.
This removal requires us to update major SO version.
Add spdk_poller_get_period_ticks() to get the period ticks of the
poller.
Add struct spdk_poller_stats and spdk_poller_get_stats() to get
the stats of the poller.
The next patch will move the definition of struct spdk_poller and
enum spdk_poller_state from include/spdk_internal/thread.h to
lib/thread/thread.c.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id597dae074a15fcd8af09fd9d416a22ce2f403c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7798
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It is better if the internal of poller or thread is not accessed
outside lib/thread.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I80d488a111fd9a67a0da32d1e63695ce5a6bcb4c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7776
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The following patches will update the cache to the closest timed
poller when removing it from the list. To do it easier, factor out
the operation to remove a timed poller from timed_pollers list into
a helper function.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I25016d86117b240a2651d1f06e23bea0342211f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7719
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When we introduce red black tree for timed pollers, we will not use
RB_FOREACH_SAFE() but cache the leftmost (smallest) node and iterate
from it via RB_NEXT() instead.
As another preparation, separate TAILQ_FOREACH_SAFE() into TAILQ_FIRST()
and TAILQ_NEXT().
The next patch will cache the first element to thread and refer it
first.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie03c387b5b3a055c668e7b439a5eb05ed77eaa81
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7718
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It will be sufficiently reasonable to check if the caller thread is valid
even if spdk_poller_pause() or spdk_poller_resume() does nothing.
Besides, let's write all possibles states explicitly in switch - cases.
This refactoring clarifies the logic and makes the following patches easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1163eff388fe741d6b6924f474a82b1aa7d18acb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7665
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Change if - else if - else blocks of thread_execute_poller() and
thread_execute_timed_poller() to switch - cases blocks and specify
possible states explicitly in these switch - cases blocks.
The code will be simpler and clarified, and then the following patches
will be easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2e283894f5d69e1bd67466ae070c5b8bb9014616
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7664
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There will be no issue even if time poller is unregistered or paused
after it is expired. The iteration is stopped anyway after the head poller
is found not to be expired.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2b394b8b517930a6630dd31f59fcaea12eb80572
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7662
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The following patches will introduce red black tree to manage
timed pollers efficiently but it will be based on macros available only
in lib/thread/thread.c. Hence then it will be difficult to expose the
internal of timed pollers tree outside the file. On the other hand,
we do not want to include JSON into the file.
Hence add a few SPDK internal APIs to iterate pollers list transparently.
For spdk_thread_get_next_active/timed/pause_poller(), we omit the parameter
thread and get it internally from poller->thread even if the names include
the term "thread". This will be slightly cleaner.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I000801a2e4dc42fa79801a2fd6f2b06e1b769c88
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7717
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add {min,max}_cntlid to spdk_nvmf_subsystem, defaulting to 1 and
0xFFEF, respectively, and add nvmf_subsystem_set_cntlid_range() to
allow the controller range to be configured in the range [min_cntlid,
max_cntlid].
Also add {min,max}_cntlid to the nvmf_create_subsystem RPC to allow
the controller ID range to be specified when creating an nvmf
subsystem.
Signed-off-by: Jonathan Teh <jonathan.teh@mayadata.io>
Change-Id: I936db3bb0c9a38569063a6fd3c11df262dfad776
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7322
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This update will stop using `struct vfio_device_info` from
<linux/vfio.h>.
Fix issue #1922.
Change-Id: Ia7ad745db8d7ed8f5248ca13e3188ebd540b0e40
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7831
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This helps in next patch in series where multiple
completions will be executing.
UT is adjusted since one additional poll is required.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id72377ddef91e40cdbc2bdea6f33c23309b0ca3d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
During snapshot creation the original blob becomes
a thin provisioned blob that will only the diff of data after
snapshot creation.
Despite the comment in the UT the number of polls before issuing
blob write was hitting blob BEFORE it swapped with new one.
Issuing I/O during this period shall check for io freeze
before checking cluster allocation.
Otherwise bs_io_unit_is_allocated() hits assert for thin
provisioned blob. This is because cluster map of blob is
empty, but properties have not been updated yet.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I742e1a50b14d456ae1e6de13b5111caec3e8322c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7641
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Also for the purpose to avoid a burst of children unmap request,
we will submit at most up to 8 children request at a time for
a big UNMAP command.
Change-Id: Iaf0f18b07517e0a8f84dc04e8c93b95691a1a43c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7518
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Both UNMAP and R/W will share the some logic to process the submission,
so combine them to a helper function first.
Change-Id: Ia4f234c6a58f078d3e9f88cacaf1510a17f07acc
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7606
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Addressing review comment from 6cebe9d0.
Minor optimisation to cache result of spdk_u32log2() into local variable
instead of calling it multiple times.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I2fd6afd1e3ee461662de3f9d278958664224e106
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7806
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
ctrlr_discovery.c doesn't need this #include.
Including it causes bdev_module.h types to be
emitted to the debug symbols at least with some
compilers, which can result in unwanted abidiff
errors.
The unit tests do need it, so just include it
there instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad28f9778ce08b11b52325658583ae9032295f3a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7813
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michal Berger <michalx.berger@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The library itself doesn't need it. The unit tests
do need it, so just include it there.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9aefd303ae12928d45141029436509f185105bd3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Michal Berger <michalx.berger@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Normally, unless RTE_MALLOC_DEBUG is set, DPDK zeroes memory in rte_free().
If RTE_MALLOC_DEBUG rte_free() fills memory with poison pattern, but then
(and only then) the memory is zeroed in rte_zmalloc_socket(). Relying on
this behavior allows to avoid unnecessary memset() in spdk_zmalloc() path.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Change-Id: If3efa4dd22f1568949c3fb529b604bd597ceb32f
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6975
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We did not find this issue, because we always use seed = 0 to test.
Definitely, we need to assign the seed value.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I6bec1b6d61480cdd7c9e27578dcaf5de2f65cf44
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7716
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This patch fixes missing free of qpair_mask when a listener error
occurs in ctrlr_create.
Signed-off-by: Jonas Pfefferle <pepperjo@japf.ch>
Change-Id: I09162b86d8ac73bf9fc2006a08dcc0a955f222b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7818
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Not just one with extra available info. Also remove the extra
read of the error register, not required.
fixes: #1927
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I28badb45d8cc8d16b72f7019bd2a2044998fc402
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7729
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Nvme-cli uses NVME_IOCTL_IO_CMDs for "io-passthru"
commands to cuse devices. This patch adds support
for that IOCTL.
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: I20e0ac91ba08fce91bc5da1f4a1e454058cdd1e7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7741
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The nvme cuse IOCTLs are actually creating passthru commands
that can be either IO passthru commands or admin commands.
Renaming the routines to correctly reflect that should limit
the confusion when reading the code. Passthru commands that
are admin commands will go to the spdk_nvme_ctrlr_cmd_admin_raw
interface and passthru commands that are IO will be sent to the
spdk_nvme_ctrlr_cmd_io_raw interface.
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: I8d427fe8b5f503fdb2d193236c77d410d5b13886
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7740
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Nvme-cli uses BLKSSZGET so support needs to be added for
nvme cuse devices.
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: Ic8316713b2d017c8ff32a225efff6bcb95842799
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7708
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It is useful to have debug log information in the nvme_cuse
path when debugging IOCTls and flows.
Signed-off-by: Curt Bruns <curt.e.bruns@gmail.com>
Change-Id: Ifef1bb82c96438e2fcbb9ad2fafe3f3eb66bed51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7707
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The purpose is to prepared for implement the async crc32 caculation
in the future patch.
Change-Id: Ia75f28154c49f08b527d48c63b9da79a6bdfede8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The purpose is to prepared for implement the async crc32 caculation
in the future patch.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I49f84ea1966f0acdd6f5aeb7192896f91fd16dee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7793
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Purpose: We will also support the kernel idxd driver, so we do not
need export this feature in the module file.
Change-Id: I965e031497920f527962ba187bccd81de6977b8f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7336
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6ad71a65244526e99a36920c630096cc9739d94d
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7809
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is prepared for using the hardware offloading
engine in accel framework. And some fields in nvme_tcp_pdu
needs to be DMA addressable.
Change-Id: I75325e2cd7ff25fe938bea0ac9489a5027e3e0e9
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7770
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This is used to prepare using the accel framework to calculate
the crc32 because some fields in this structure needs to be allocated
in DMA addressable memory.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ib8def5596e60f4702709da647145c4e2b6d6848f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7767
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Note: a lot of the TCP and FC trace registers were
specifying '1' which means the arg type is a pointer,
but in reality it is always passing 0 for arg1. So
this patch just changes them to SPDK_TRACE_ARG_TYPE_INT.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I18d3cedd21e516f16cb2cd0a7f8c16670b1895d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7763
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
We already pass the PDU as arg1, so by changing the
trace register descriptions, we can map the PDUs to
more readable IDs when running the spdk_trace app.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad7106eeb0f5fe738f81da5ee174515d1cf4b6ce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7757
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This was accidentally moved to the wrong place as
part of some earlier iSCSI refactoring. This trace
record should be executed when we have finished
reading all of the data for any PDU, not just those
with immediate data.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib1d17e5e79ff220e9e9b3dd55e247e745bd58019
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7756
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
It can be useful for testing or development purposes to force clients to write
to doorbells using vfio-user messages instead of directly into shared memory;
add a transport-specific option to disable the shared mapping.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I7ed062fbe211ba27c85d00b12d81a0f84a8322ed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7554
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This will help us to add unmap split function, also
remove bdev_io_type_can_split() because we changed
to use swith(io_type) ... case now.
Change-Id: I449d6a9f5bf2d0b43dd124bbfc9e1ca2afddc15a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7516
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Only the last bdev_io can be freed without this fix.
Change-Id: I0d05b5d89e38ef60872ebc0f23aaed0c622593c4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7571
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Purpose: This patch is used to prepare to add the kernel
idxd support later.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: If89665f95d622c7342ab75050664158ec6fc615a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7330
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
(Note: this patch was previously applied as b32cfc46 and then reverted
as 63642bef.)
Today the in-guest nvme device shows physical_block_size=512 even though
the backend iSCSI bdev supports physical_block_size=4K
iSCSI targets exposes physical block size using
logical_block_per_physical_block_exponent in READ_CAPACITY_16
NPWG is one of the way to let Linux nvme driver set
physical_block_size of the nvme block device.
This patch adds spdk_bdev.phys_blocklen which is updated if the iSCSI
backend exposes physical_block_size.
Later phys_blocklen is used in nvmf to set NPWG and NAWUPF to report
back during NS identity.
Linux driver uses min(nawupf, npwg) to set physical_block_size.
Similarly in scsi_bdev fill lbppbe in READ_CAP16 response
based on spdk_bdev.phys_blocklen.
Fixes#1884
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0b6c81f1937e346d448f49c927eda8c79d2d75c0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7739
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
1) use spdk_bdev_get_name() accessor
2) use __SPDK_BDEV_MODULE_ONLY #define
The latter allows nvmf to just get the spdk_bdev_module
definitions and APIs that it needs for claiming bdevs
for purposes of avoiding the same namespace used in
different subsystems.
This also ensures that future changes to structures
like spdk_bdev and spdk_bdev_io will not cause
lib/nvmf so version changes.
Note: we include bdev_module.h explicitly in the
nvmf/subsystem unit tests now, before including
subsystem.c, because the unit tests do depend on
knowing the internal structure of spdk_bdev.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2f499a741d19f4749eadb402641f28137245fd23
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7738
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
libvfio-user DMA APIs report all regions notified by the client, including those
that don't have a corresponding shared mapping. There are several of these for
a typical VM, so just ignore this case.
Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Change-Id: I37b06f4bc6d1818a03c8742616ed142f575d3f0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7532
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When payload_size is 0, we may get wrong cdw10 because of the calculate: 0 - 1,
add value check to fix value inversion bug.
Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: I3bcd38ba981c854ff917282341d32aac47d22b76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7443
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This allows us to remove separate fcntl() calls to
set O_NONBLOCK.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1a590cfb3b65b3174bb5ef33e060cdc9bb7ac86c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7598
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
This simplifies the code a bit, removing the need
for the separate fcntl() calls.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4fef8f01a055d1471df87bd979c21d6198e9868a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7596
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It is already set by nvmf_tcp_req_pdu_init
when we get the pdu. So we do not set it again.
Change-Id: I034bbc46e600afd802457c0b152e303f16bafba3
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7714
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This reverts commit b32cfc467b.
This commit fails the ABI checks and only got through because the checks
were disabled until 21.04 hit.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Id26b8f8ba551193d99b1ccbd31b35378b4095a20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7731
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Today the in-guest nvme device shows physical_block_size=512 even though
the backend iSCSI bdev supports physical_block_size=4K
iSCSI targets exposes physical block size using
logical_block_per_physical_block_exponent in READ_CAPACITY_16
NPWG is one of the way to let Linux nvme driver set
physical_block_size of the nvme block device.
This patch adds spdk_bdev.phys_blocklen which is updated if the iSCSI
backend exposes physical_block_size.
Later phys_blocklen is used in nvmf to set NPWG and NAWUPF to report
back during NS identity.
Linux driver uses min(nawupf, npwg) to set physical_block_size.
Similarly in scsi_bdev fill lbppbe in READ_CAP16 response
based on spdk_bdev.phys_blocklen.
Fixes#1884
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0b6c81f1937e346d448f49c927eda8c79d2d75cf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7310
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We have seen that dptr was not alligned to 4k using cuse. Added allignment of data in cuse ctx to 4k same as it is done in nvme_allocate_request_user_copy
Signed-off-by: jwyka <jakub.wyka@intel.com>
Change-Id: Ic5c2482eae20d64ba467016eb61f5255467f70a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7453
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
When performing snapshot creation the I/O is frozen
during the process. The blob persists for extent page
allocation is delayed until snapshot creation is finished.
This results in multiple blob persists executing one after
the other, with only intent of writing out updated extent table
pointing to new extent pages.
Since blob->state is marked DIRTY before issuing each persist,
but a single persist completion marks state CLEAR.
Blob serialize correctly expects each persist to contain
dirtied metadata, in order to avoid unnecessary md writes.
Since all other instances of marking blob DIRTY is explicit,
assert in blob serialize is left as is.
Instead when running the queued up blob persists, the blob
state is marked DIRTY.
Side effect is that it will write out same md in some cases.
Fixes#1909
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I39f37299f3f0ebfccbdd4063781b5ecce286e993
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7640
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
spdk_vtophys() takes a mapping_length parameter, so
it can return the length for which the returned
virtual address is valid.
But spdk_vtophys() will only return the max
between the valid length and the input mapping_length
parameter.
So the nvme SGL building code for contiguous buffers
was broken, since it would only set the mapping_length
once, before the loop started. Worst case, if a buffer
started just before (maybe 256 bytes) before a huge page
boundary, each time through the loop we would create
a new SGL for only 256 bytes at a time, very quickly
running out of SGL entries for a large buffer.
Fixes#1852.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib1000d8b130e8e4bfeacccd6e60f8109428dfc1e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7659
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The IDENTIFY_CNS quirk was applied as part of QEMU
OCSSD handling in commit 6442451b. But it was applied
not only to the OCSSD dev ID, but also the dev ID
for non-OCSSD NVMe controllers.
Starting with QEMU 5.2, QEMU will allocate a default
256 namespaces, but only some are active (associated
with the backing disks specified by the user). QEMU
supports IDENTIFY_CNS, but since this quirk was set,
we wouldn't send a real IDENTIFY_CNS and instead
would just populate a fake list where all namespaces
were considered active. This causes breakage in
a few places - mainly where we iterate through
the active namespaces, and then are surprised that
calling spdk_nvme_ns_is_active() returns false.
It was also breaking bdev_nvme_attach_controller RPC,
since by default we can only support returning 128
names, but since all of the namespaces were deemed
active, it was trying to return 256.
Fixes#1916.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4fdd27e0e36f0ac07a95f9f29aa83357e8505a45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7658
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When zcero copy send is enabled and used by initiator,
it could significantly increase latency in some payloads.
To enable more fine graing configuration of zero copy
send feature, add new parameters enable_zerocopy_send_server
and enable_zerocopy_send_client to spdk_sock_impl_opts to
enable/disable zcopy for specific type of sockets.
Exisiting enable_zerocopy_send parameter affects all types
of sockets.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I111c75608f8826980a56e210c076ab8ff16ddbdc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7457
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There are three modules implementing the bdev-zone API:
bdev_nvme, bdev_ocssd, and vbdev_zone_block.
For all three modules, the number of zones can be calculated using:
block_count / zone_size.
To avoid this calculation being performed everywhere, create a helper
function in bdev_zone.h, together with the other zone APIs, such that
a user can easily get the number of zones.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I2967b15a604ab8bf4420588e7510b9820762f925
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7451
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
tcp transport doesn't send a response capsule when
c2h_success is set even if cdw0 or cdw1 are non-0.
Signed-off-by: Ed rodriguez <edwinr@netapp.com>
Signed-off-by: John Meneghini johnm@netapp.com
Change-Id: Ieba81fcc50342a2009f7931526e6f8392e26b6a5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6808
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When do spdk_reactor_set_interrupt_mode, if reactor
already runs in the specific mode, directly call
callback function before return 0;
Change-Id: I1fd8b753e9881755aa128aabe6d1e2749e58b39b
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7549
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The net library isn't needed - everything these RPCs
do can be done externally to the SPDK application.
This library will be removed in the 21.07 release.
As part of the deprecation, mark the net RPCs as
private. This will prevent an upcoming patch from
complaining that these RPCs are not documented.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I61118820fd29e410dca763595c3d9fd01a57373d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7592
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
For pollers that don't natively support interrupts, using
a busy wait mechanism temporarily.
An interrupt falicity for busy wait will
be registered for non-periodic poller.
Internally, an eventfd is created to each busy wait
poller. Write the eventfd when set interrupt mode,
and only read the eventfd when set back to poll mode,
then the busy wait poller will be called repeatly
in interrupt mode.
Change-Id: Iaeae14d1ff69fd9ef7d606a0b0a70193764513e9
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6711
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
In next patch, if poller doesn't have a period, eventfd
will be created which's always busy automatically.
This eventfd can be combined with timerfd. So rename
timerfd to interruptfd.
Change-Id: Ibffa30ecfcaa73e55f47e97fac854641b74f2dfb
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7546
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Defined callback for spdk_poller to adapt itself to
set interrupt or poll mode. The callback can
be registered to spdk_poller by new function
`spdk_poller_register_interrupt`
Interrupt callback operations for period poller are implemented,
so period pollers now are interruptable.
Change-Id: I2aa6ebfdd75f76b85a70af7e42530be4131ddc8a
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5752
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
These refined functions are prepared to adapt period
poller to following poller switchable API.
Change-Id: I34d2a785fa0e757b97b0dac5ccf24819d75e0184
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7156
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The interrupt mode of spdk_thread can be operated
by reactor based on reactor's interrupt mode when
the spdk_thread is scheduled or the reactor is set
into interrupt mode.
Change-Id: Ibeef7ffb759589a7b372bd78e59e3410be061383
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6709
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_thread_set_interrupt_mode can get spdk_thread run
between intr and poll mode. It is only valid when thread
interrupt facility is enabled by
spdk_interrupt_mode_enable(). Currently, this function
is limited that no poller is registered to the spdk_thread.
Change-Id: Iba54accd5976beb6f6e155014903928ce2858e36
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6708
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
All the other RPCs use underscores in the names of their fields, so
`framework_get_scheduler` should also use them instead of the spaces.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I0e9edd7c59a4ab61643a7b558a2359e1805ed0b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7557
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
DPDK recently clarified some semantics on the rte_devargs 'data'
and 'args' fields. This actually breaks our use of the 'data'
field to store the 2 second timeout timestamp for delaying
attach to newly inserted devices. Investigating this further,
it does not seem our use of the 'data' field was valid - it just
happened to work until now.
We could use the 'args' field now. But knowing whether to use
'args' or 'data' would then be dependent on the DPDK version.
We cannot use RTE_VERSION_NUM to decide, because this is a
compile time decision, and it is possible in shared library
use cases that we could actually link and execute against a
different version of DPDK than we built against.
So instead we will create our own env_devargs structure that
will store these allowed_at timestamps. Currently it's just
a linked list (which is exactly how DPDK does it) - we could
make it more optimal with a hash table down the road, but this
code only executes when we are doing PCI enumeration so it is
not performance critical.
Fixes#1904.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ee5d65ba90635b5a96b97dd0f4ab72a093fe8f7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7506
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Before this patch blob persist wrote out all allocated extent pages.
Intended design was to write out extent pages in two cases:
1) Thin provisioned blobs to write out extent pages when necessary
during cluster allocation.
2) Thick provisioned blobs to write extent pages during blob persist
when the blob was resized
This patch implements 1) by inserting extent before issuing blob persist
in cluster allocation path.
See blob_persist_extent_page_cpl() and blob_insert_new_ep_cb().
Blob persist might have to rewrite the last extent page after blob resize.
See blob_persist_start().
Meanwhile 2) was incorrecly implemented since it always re-wrote all
extent pages starting from 0. This was addressed by limiting number
of extent pages written, only to ones that were resized.
Some considerations were needed:
a) blob resize happen on cluster granularity, it might be needed to re-write
last extent page if resize was not large enough to change number of extent pages
b) first extent page to write should be based on the num_extent_pages from
active or clean, depending on resize direction
See blob_persist_start().
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ibba9e3de3aadb64c1844a462eb0246e4ef65d37f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7202
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
When both clone and snapshot had already extent pages
corresponding to the same region in cluster map,
the clone extent page was replaced with one from snapshot.
This was incorrect and would result in loss of clusters
from clones extent page. It did not occur in practice
because all extent pages were rewritten anyway during
md sync. Cluster map was correct so updated extent pages
were too.
Cluster map correctness is verified in UT _blob_inflate_rw(true),
at the very end when checking data consistency of inflated blob.
This patch writes out the updated extent page explicitly.
So it would be possible to skip wirting out extent pages
during md sync later in the series.
Note 1)
At this point in series the extent page is written here,
and in blob persists. The later will be removed later in
series.
Note 2)
Errors during updating extent pages are not accounted for,
but neither does syncing them in blob persist.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I7deac3c64299f33f8df49e860af1a16295c074e6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
The blob_insert_extent() name was confusing, since the function
was actually responsible for writting out the extent page to disk.
Changed to a more fitting name.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ia312b0ef152100f30d5a1bfe123e55135c8afa6e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7561
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
This patch does not change functionality. It separates
three stages of updating clone during snapshot deletion:
- updating cluster map
- updating extent pages
- removing backing device from clone
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I44869f3be596d9d0f06db4acedfdd7e1500516ff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7437
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Modifies RPC "framework_get_reactors" to get core frequency for current
core and insert it into JSON response.
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Change-Id: Ibb9c25e6e1d28ddb4cde42baa20a7e9808652ae8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6582
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Currently _get_core_curr_freqs returns an index from the array
of available frequencies for given core. This change aims to
make this function execute what its name suggests.
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Change-Id: I1143f692e7bbbf2f8f9e1cd4943f8e3ecd70ddea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7452
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function finds a placement_id that does not have a group
associated with it.
Change-Id: I1306690e980fd4661f46dba9fb283f048a962eba
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7223
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Also use debug log when the memory region isn't 2MiB aligned,
The QEMU may only use one page for a memory region, we are sure
these memory regions will not be used as NVMe data buffers.
Previously libvfio-user will help us to round up these memory
regions to 2MiB alignment, and it doesn't do it anymore, this
isn't an error case so change it to debug log.
Change-Id: I6c397f50407d4f2a14f78d9f99fffc2e4054ff51
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7545
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The io_msg qpair is allocated and managed by the
primary process, so don't try polling it from
secondary processes.
This fixes a bug where an SPDK target has configured
cuse, and we try to run fio (for example) as a
secondary process.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I48e2b89597196ce2ba1fc02ea3a7c76c5a33281a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7482
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The P55XX serial SSDs can support dlfeat.read_value in the identify
namespace data structure, we don't need to add this quirk for it,
just remove it.
Change-Id: I165d89085e246a570e80dbaf05f41dc331b93f0c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7526
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When the format command is issued, the kioxia drives responds with "NS Attr change" notices.
In the callback function of the notice, the CQ Head Doorbell is updated twice with the same
value while issuing the Active NS list & identify NS commands.
Fixes: #1701
Signed-off-by: G.Balaji <gbalajieie@gmail.com>
Change-Id: I8cc80fba0a226c22753e605ef3129602a9313ce7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7149
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
private tag added to the experimental or internal rpcs
which might be removed in future or not documented.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Change-Id: I3e967252412f2491860eea5fa69750a7562b994a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7510
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Also update the UT.
Change-Id: I6086bf4cafca8a917a467490955d7df0ba8930d5
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7495
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously we can only remap NVMe command using PRP, now we add
the SGL support.
Change-Id: Iec352d858a07bdd3d5f261336d6fa1167ba7aa79
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7279
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The API `spdk_nvme_map_prps` is used in nvmf/vfio-user to
remap VM's NVMe command data buffer to local virtual address,
and for command using PRP, there maybe multiple pages, when
parsing the PRP list to local IOVs, we need a parameter to check
that the maximum number of vectors can't exceed the IOVs, this API
can't meet the requirement, while here, we add a new API `spdk_nvme_map_cmd`
and with a new parameter `max_iovcnt` to fix this case, and it can
also cover the command using SGL in the coming patches.
Change-Id: I71063524bed16ee3434103867a556d3741e55326
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7278
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Change spdk_nvme_map_prps to a internal fucntion with
a new parameter `max_iovcnt` to protect the IOVs. Also
for the purpose to keep API compatibility, we still leave
the API here.
Change-Id: I9a638beb87aab20bba5f8a4fa0a9396110d56aff
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7335
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is required by libvfio-user APIs.
Change-Id: I675a3be0a9650d146c8d37e42debf1191656903b
Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7472
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch is used to fix the chained crc32c computing when users pass
a vector. Since we use a union in spdk_accel_task structure to differentiate
the usage on "src" and "the vector info" (iovs and iovcnt). So we cannot
directly write the src field while users pass a vector.
And I verified it in the hardware platform.
Change-Id: I85d6e86fa689b261782f80a2f89d908a5d4db84f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7471
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There are cases that the valid vfio container doesn't contain
any IOMMU group, so for this case we should not return error.
Fix issue #1855.
Change-Id: I2057dc9a519a31ec16452b1e9d1c470eccfc4992
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7470
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I068e74f5b7d078ad37572eff47e772ad6967b827
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7436
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Unlike tcp/rdma transport, the vfio-user transport doesn't need to
wait for the data buffers, so here we add two request states for
now.
The request state will help us for coming request abort API.
Change-Id: Ibbb193fbbd358333f81aa29341493c19ab7bd108
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7435
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously we reset them when getting a new request, but it's more
reasonable in the completion path.
Change-Id: I3dab35ce471d2a5bbd37576540d30a09dcf93410
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7434
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Also rename transport request and controller variables
with "vu_" prefix.
The consolidated function will be used in coming patch.
Change-Id: I5219c13d7089dfdaea4a54e0b15cc5e6ecf2eb16
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7433
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Since the maps are unique to modules, they can store the group_impls
directly.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I7f11db558e38e940267fdf6eaacbe515334391c2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7222
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This allows for different policies per module, as well as overlapped
placement_id values.
Change-Id: I0a9c83e68d22733d81f005eb054a4c5f236f88d9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7221
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
If the qpair is part of a poll group, the socket will get
flushed as part of polling that group already. We only need
to explicitly flush to handle the case where the qpair is
not in a poll group.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib2a510b6d26d1622950437d81e0a40f6b15d6b54
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7049
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
There was a fix for this that went into the posix layer, but the
underlying problem is the logic in the nvme/tcp transport. Attempt to
fix that instead.
Change-Id: I04dd850bb201641d441c8c1f88c7bb8ba1d09e58
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6751
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Logs are all changed to DEBUGLOG. If you compiles non-debug mode.
Gcc reports error. Using #ifdef DEBUG to exclude them.
Fixes#1903
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: Idcaf083e430a77845fbd8443acade4b3f0e1efc9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7445
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Even if the operation is deferred, null it out if it reported success.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I3cc9eaa88bdd7a2e7d13790782f4a9b0966e5585
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6892
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Some iSCSI initiators send a Data-OUT PDU sequence whose PDUs do
not have block size multiples data.
SPDK iSCSI target had replied SCSI write error to such initiators
because previously we had sent a write subtask per Data-OUT PDU.
SPDK SCSI library had rejected the write subtask because its data
was not block size multiples.
This patch fixes the issue.
The idea is to aggregate multiple Data-OUT PDUs into a single write
subtask up to 64KB or until F bit is set. MaxRecvDataSegmentLength
is 64KB but MaxBurstLength is 1MB. Hence one Data-OUT PDU data may
be split into multiple data buffers, but the maximum number of split
is two.
When processing the data segment of the Data-OUT PDU, save the data
buffer of the current PDU to the current task if the data buffer is
not full and F bit is not set. In this case, write subtask is not
submitted.
When processing the header of the Data-OUT PDU, if the current task
saves the data buffer from the last Data-OUT PDU, it passes the data
buffer to the Data-OUT PDU.
When reading the data segment of the current PDU, attach the second
data buffer to the current PDU if the first data buffer becomes full.
These are enabled only if DIF is disabled.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib9cfb53fe8c0807a63e58c61bed3bb52f60f4830
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6439
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
It can divide to two parts:
1, UIO driver - sigbus error handling and uevent
process.
2, VFIO - request notify handling.
sigbus error process is in previous patch.
Change-Id: Idc09754b83ae9ddcaea1f2afcbc13e528ead9863
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5768
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
struct sockaddr_nl {
sa_family_t nl_family; /* AF_NETLINK */
unsigned short nl_pad; /* Zero */
pid_t nl_pid; /* Port ID */
__u32 nl_groups; /* Multicast groups mask */
};
nl_pid is the unicast address of netlink socket. It's always 0
if the destination is in the kernel. For a user-space process,
nl_pid is usually the PID of the process owning the destination
socket. However, nl_pid identifies a netlink socket, not a
process. If a process owns several netlink sockets, then nl_pid
can be equal to the process ID only for at most one socket.
There are two ways to assign nl_pid to a netlink socket. If the
application sets nl_pid before calling bind(), then it is up to
the application to make sure that nl_pid is unique. If the
application sets it to 0, the kernel takes care of assigning it.
The kernel assigns the process ID to the first netlink socket the
process opens and assigns a unique nl_pid to every netlink socket
that the process subsequently creates.
Change-Id: Ic0688228105ea6ba4ebae1d130b9271126c37b0e
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7367
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add the sigbus handler to virtio pci device
such as virtio_blk and virtio_scsi.
Change-Id: I07f2f175a585a425ef14050e2bf83bacb6e4c3bc
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5769
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This prepares for an upcoming patch to fix issue #1701 which
requires handling async events outside of the check
completions loop.
Fixes: #1701
Signed-off-by: G.Balaji <gbalajieie@gmail.com>
Change-Id: I4985d814903143511383172b1a443580db33a78f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7416
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In the case that reactor is needed to be valid, add an
explicit assert there.
Change-Id: Ic47030d50a6a940ddf87a3744bae38c94dd7252e
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7320
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
This makes it possible to traverse from the group_impl to
the group. It hasn't been necessary so far but will be in an
upcoming change.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I2bf119461bfd5ac5c8a63a3f1f4560d32e695c75
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7218
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Individual modules will need to mantain their own placement maps for
this to work correctly, especially if modules have different algorithms.
This is a step toward allowing them to do that.
Change-Id: Ie798baa50b94f1e99d6690adb606b936c7b30da0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7217
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is a step toward allowing for multiple maps. Each module may have a
different meaning for placement_id with different uniqueness rules. They
can't all be in the same map.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I608680a08b947a5d5c0818ff66505ed64e1b891e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7214
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Instead, move it down to the modules. This allows modules
to potentially change the value, if they are able.
Change-Id: I08f5fbadf5d1e96b489ddaaca72aa051ce2cb85c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7212
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There are a small, bounded set of placement_ids that the socket layer
will ever encounter, and they remain valid for the lifetime of the
program. The association between a poll group and a placement_id is now
correctly broken when the reference count drops to 0 (in response to
sock_map_release calls), so do not free the entry when the poll group is
destroyed so that it may be reused again.
Change-Id: Iad90e2da7d0860fa8c5cff24f9699bef30cd7bc2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7210
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Allow the map to have entries with a valid placement_id, but no group.
This will be useful later when the order of placement_id discovery and
group assignment may be reversed.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ia39adb3a030135940aeb9eeadf9df78056e59c0d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7209
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We can return error status when processing RELEASE2 without
a reservation, also add a UT to cover this case.
Fix issue #1898.
Change-Id: I56ffa8eabfc0409307500f8740cb627aab9d2f0b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7379
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This reverts commit aaac48880d.
This patch was showing issues with SPDK vhost mappings
when handling larger numbers of VMs.
Fixes issue #1901.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I81bd311d26037dcb9340d85abcb4ea45b20a5170
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7424
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Change-Id: I2e47b148209ce4c232dbdc5f20c90548be995e1a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7334
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will ensure that we can't exceed the iovcnt when parse
NVMe PRP list to req->iov.
Also comment that the iovcnt in vfio-user transport is used to track
each gpa_to_vva map, for NVMe PRP list command, the PRP2 itself also
will use one entry, so we need add one more entry for this case.
Fix issue #1864.
Change-Id: I06c7137e2c4637c9501f82a9eb1c8e4395d819cd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7264
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For NVMe backend device, we should use vtophys to calculate
physical address when doing DMA from/to VM to drives.
Fix#1822.
Change-Id: Ib8fbc371e19e77a20202d408340e7d65644b1eeb
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7261
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously we poll the MMIO callbacks in the context of ADMIN queue's
poll group, here we do some improvement to start a poller to do MMIO
poll, then the group poll will only process NVMe commands while the
MMIO poller will process MMIO access.
This is useful when doing live migration, because the migration region
defined by VFIO is a BAR region, we should stop polling queue pairs
but ack the MMIO accesses during the live migration.
Change-Id: I63bac44889cbe0c31d47599810aab8335dfd4ff5
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7251
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Just code movement for the coming patch.
Change-Id: I7e844bc27a037e086796f9659351f20cdbb517fb
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7333
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is used to update the field definition related with
work queue in the header file.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I68b81d9dfc2497db89e96f0730785be03dcb8add
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7225
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
NVME_MAX_PRP_LIST_ENTRIES has changed over time, so let's
just remove the reference to the exact value here. Also
explain a bit more why the max size isn't
(NUM_ENTRIES + 1) * page_size.
While here, do a small whitespace cleanup as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib75813788abdd3dbb43192f9fdc27f99b33aeadf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7328
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
nvmf_poll_group_remove_subsystem_msg() disconnects all
qpairs associated with controllers in the specified
subsystem. If it finds any controllers that need to
be disconnected, it sends a message to the running
thread to execute the same function again later.
But when it runs again later, the qpair may no longer
be in the poll group, but there could still be
outstanding messages being sent between threads. For
example, _nvmf_qpair_destroy() needs to send a message
to the ctrlr->thread to clear the qpair mask bit.
All of this could result in the nvmf target starting
to destroy poll groups prematurely. Destroy poll
groups results in the nvmf spdk_threads exiting. If
there are still messages being processed from
the STOP_SUBSYSTEMS target state, we can get
use-after-free errors since processing of those
messages could access freed memory associated with
the exited thread.
Fixes issue #1850.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1e63b9addb2956495a69b5108a41e029f6f9a85d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7275
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This include isn't needed in queue_extras.h itself.
There were a few places that were implicitly
depending on this include, so fix those to include
util.h explicitly.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia962ae5a4403ee8ae15f3106d0d5e7d7412a4535
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7172
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Check if qpair has a poll group during the connect process,
use poll group's statistics or allocate own structure per
qpair. That is done due to not all applications use poll
groups and we want to avoid "if (qpair->group)"
conditions in data path.
Admin qpair always allocates its own statistics
structure but the statistics are not reported
since this qpair is not attached to a poll group.
Statistics are reported by spdk_nvme_perf tool
if --transport-stats and in bdev_nvme_transport_statistics
RPC method.
Change-Id: I58765be161491fe394968ea65ea22db1478b219a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6304
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
These are interface functions that can be used by
an application e.g. spdk_nvme_perf or bdev_nvme
library. The next patches will add usage of these
functions.
Change-Id: I33b88e0e713c2ea5967f9241885e3257c5070577
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6300
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The new 2 API function allow to get and free stats
per poll group. New function to get transport name
have been added to report not only transport type but
also the name.
For now only RDMA transport reports statistics,
other transports will be added later.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I2824cb474fde5fa859cf8196dabac2c48c05709c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6299
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
New statistics include number of poller calls,
number of idle polls and total number of completions.
These statistics allow to estimate % of idle polls
and the number of completions per poll.
Since nvme_rdma_cq_process_completions function
returns number of completed NVMF requests and each
NVMF request consumes 2 RDMA completions (send+recv),
this function was extended to return the number of
RDMA completions.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ifdc1e2e467f645adb5d66d39ff2a379e161fbd77
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6298
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
These statistics allow to estimate WRs batching
efficiency. The number of send WRs equals the total
number of submitted NVME commands.
Change-Id: I96c9836cd6b9070cf5f62e43b4d2738506866e94
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6297
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Although currently acking msg_fd inside function
msg_queue_run_batch() will also ack critical_msg's
notification, it is easier to understand the code
if moving acking msg_fd code into
thread_interrupt_msg_process().
Change-Id: I98267c5c28358334a2c1133e3dbc125788de77ab
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7265
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
As a start of combining interrupt ability into poller,
it aims to get spdk_thread & spdk_poller runnable between
poll mode and interrupt mode with dynamic switching.
spdk_interrupt_mode_is_enabled() indicate whether interrupt
mode is enabled and dynamic switching is permitted. So
spdk_interrupt_mode_is_enabled==true leads to set up
interrupt mode related resources;
in_interrupt flag indicates whether one spdk_thread now
is running in intr mode.
It is possible that spdk_interrupt_mode_is_enabled==true
but in_interrupt==false. this means spdk_thread & spdk_poller
switched to poll mode from interrupt mode due to heavy
workload coming.
To align with spdk_reactor, use "in_interrupt" to
indicate whether one spdk_thread now runs in intr.
Change-Id: I2cd806bf4dec9969f3df88fac7f6b0c0b716d907
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6540
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Restrict spdk_interrupt_mode_enable must be called
once prior to initializing the threading library.
Change-Id: I833ff63fae19882e82154195d03dd7ce56ffb1de
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6707
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be helpful to simplify the upcoming change.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1f170fe48d2ec1b5ea05da6a8aa3589060c5c32d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6438
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use current_data_offset of task to track the current offset of
large write I/O by following the last patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iec3a371c6050fe11478b6f158259d8f4013f5238
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6424
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Custom transport may not provide the `qpair_abort_request`
callback function, so here for transport API we will just
call it when it's not empty. We will add the callback
support with vfio-user in another patch.
Fix#1883.
Change-Id: Icd82a26bde4ed90068bc85ee04cce9642cb6135d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7291
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We used the controller ready field to indicate ADMIN queue connection,
but the accept poller and ADMIN poll group may run in different
threads, this may lead vfu_attach_ctx() be called several times, so
change the 'ready' to true when a new socket connection is created.
Fix issue #1854.
Change-Id: Iab6ffd6dffb3fff5cf893e79774bc28fe0b2830c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7073
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will make the code easier to understand.
Change-Id: I7112d3fd5f0d6dce9b66d44375b68ce7d1e8951d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7072
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
unmap_q is only be called in unmap_qp, so remove this function to make the
code more clear to read.
Change-Id: I627c7a1efdcb85476cb618fced8b0bfc2d8f1f62
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6886
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When killing QEMU or remote client is terminated normally,
we can release current controller related data structure,
users may restart QEMU to connect the same socket file
again, for the new connection, vfio-user will create
a new controller data structure for it.
Here we add a lock in the endpoint data structure to protect
number of connected queue pairs variable, because controller
data structure is like a session, while endpoint is related
with the socket file, so it's safe here. Moreover, we can
use this lock to protect live migration related data
structures in future.
Change-Id: Ie7060041a253604e7a2242813ec284eae46fe4e8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6862
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The following patches will aggregate multiple Data-OUT PDUs into a
single write subtask and we will not be able to use reqh->buffer_offset
to track the current offset of large write I/O to submit write subtasks.
On the other hand, each iscsi_task or iscsi_subtask is only read or write
Hence rename current_datain_offset of iscsi_task by current_data_offset
in this patch.
The next patch will use it to track the current offset of large write I/O
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I922582c5b9474a3c512f81d0f0425158a38a9a8d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6423
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The following patches will aggregate multiple Data-OUT PDUs into a
single SCSI write up to 64KB. Any variable to accumulate data length
is necessary.
Hence add data_len to mobj and accumulate read data length into
mobj->data_len, and then refer mobj instead of pdu->data and
pdu->data_segment_len to submit write subtask.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6354534769e67c0fd995bbc3c2b4a80d21a23915
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6422
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Wrap an operation to get a data buffer from mempool into a helper
function iscsi_datapool_get() and wrap an operation to put a data
buffer to mempool into a helper function iscsi_data_pool_put().
Use inline for both functions.
Besides, as a minor fix, remove duplicated file inclusion between
iscsi.c and iscsi.h.
7
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia3005dffaa93a6bca16f19bb467fb5b64ae1aad2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6366
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add crc32c to struct spdk_iscsi_pdu and initialize it by SPDK_CRC32C_INITIAL,
and then use it as the initial value of _iscsi_pdu_calc_data_digest().
Separate finalization of crc32c into _iscsi_pdu_finalize_data_digest().
Move the definition of related macro constants from iscsi.c to iscsi.h.
iscsi_pdu_calc_data_digest() is used for read too. So setting
pdu->valid_data_bytes before calling iscsi_pdu_calc_data_digest()
for read.
Data split will be supported only if DIF is disabled, and hence
DIF case is not changed.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9d24f605fd0d452782e17695b613cd2f63d2e42f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6421
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The following patches will want to aggregate multiple Data-OUT PDUs
into the same data buffer, but it will be 64KB at most.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I72eabbeae0b027c2fbff2a5837d180b06b0a1b49
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6418
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The following patches will change the handler for Data-OUT PDU to
submit subtask only when 64KB data is read or F bit is set.
Previously, we had created a subtask when processing header and
before reading data segment. Creating a subtask beforehand is not
convenient for the following changes.
Hence create a subtask after reading data segment.
If LUN is removed while processing the Data-OUT PDU, the corresponding
primary task will be terminated by iscsi_clear_all_transfer_task(),
and any subtask completion is not sent to initiator. Hence we can
reject the received Data-OUT PDU safely.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifb6d6988676080b458b31d12fef065f3c1de0cb6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6415
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The hotplug lib can be used for pcie devices
such as nvme, virtio_blk and virtio scsi.
For the sigbus handler, there is only one in a
process and it should handle all the devices.
And align nvme to the hotplug lib
Add the ADD uevent support for allowing the
device hotplug.
Change-Id: I82cd3b4af38ca24cee8b041a215a85c4a69e60f7
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5653
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Introduce new utilities NVME_CTRLR_ERRLOG, NVME_CTRLR_WARNLOG and so
on to output the ctrlr's identification at different log levels.
For RDMA and TCP, the subnqn will be output and for PCIe and custom,
the traddr will be ouptput.
Change-Id: I81a112463bf752999aa1fe4e0c867d88e09a2f64
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7057
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This can be handled in a cleaner way by having the sock group
create/close operations take an extra reference.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Id701b6dd9a19b01cd40e0d95eb870aef977eea99
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7208
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
added to a group
The process of adding a socket to a group may, in some scenarios, change
the placement id.
Change-Id: I879d9641099d86978ede5d5e2be1a72eda65a79b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7207
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Also fix the comment. It's never going to make sense to add a socket
to a group twice.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Id4845b77114aef32bbe4ea0e53d2e1fde8e116f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7204
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Using nvmf_tcp_destroy() would destroy ttransport->lock, which hasn't
been initialized by that point yet.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie9ced97ef520236dddaa70453b6807e8382ce534
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7235
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will be useful as the same purpose as
spdk_io_channel_iter_get_io_device() and will be used in the
following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id45f5980c65543703b91df2afeb47448232fe503
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7237
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
When a request is submitted, it may have incorrect iov
alignment that doesn't fit PRP requirements. In the
current version an internal function fails such a request
and returns a NULL pointer. This is mapped to -ENOMEM
error which is returned to generic bdev layer where
such a request is queued in a "nomem_io" queue and
later can be resubmitted. That is incorrect and such
a request must be completed immediately. To fail the
request, we need to differentiate between -ENOMEM and
other cases, so we pass a pointer to a result to
local nvme functions
Change-Id: I7120d49114d801497a71fca5a23b172732d088de
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7036
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function will be used the next patch, current
behaviour remains unchanged
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie74c7395f67a08b0cac018eb5114f358a6b583cb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7092
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The max_io_size transport option should be a power of 2 and be >= 8KB.
Max data tranfer size is defined in NVMe-oF spec as 2^(mdts cmd field) * 4KB.
Mdts cmd field is calculated as spdk_u32log2(transport->opts.max_io_size / 4096),
so max_io_size < 8KB results in mdts=0, which means no size limit (according to spec).
User can set max_io_size = 0 explicitly to allow no size limit.
Signed-off-by: Maciej Szulik <maciej.szulik@intel.com>
Change-Id: Id88a77efce5f217e1fc7750f61c0bd330aaa3791
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6384
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The dbdf format is xxxx:xx:xx.x and with the wrong
format the rte_devargs_parse always fails.
Change-Id: Ia34bc5e68f6401bb25907d5d07c65636b4f491b5
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7140
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
We can send a message to repeat subsystem pause
and free a context that will be used later
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ia5e8b0ff43f5e38bd8e659a8a64d42926e1d3c6e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6661
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
With this change, each polling group will use one
accel_engine channel. This change will be more suitable
to utlize the underlying accelerated device.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ibab183a1f65baff7e58529ee05e96b1b04731285
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7055
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Actually we already do this when removeing a memory region, but
the check for it is too strict, we should unmap queue pairs when
the queue pair is in the memory region.
Change-Id: Ia646a0255e32ecdd0a70537a8011ce622eb59195
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6861
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The error response should be processed at the beginning of this
function.
Change-Id: Id583951c82981cf58984ab68b23ad6f7ea80cd3f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6859
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When starting VM, there are error logs such as:
vfio_user.c: 510:acq_map: *ERROR*: Map ACQ failed, ACQ 3ffde000, errno -1
vfio_user.c:1043:map_admin_queue: *ERROR*: /var/run/muser/domain/muser1/1: failed to map CQ0: -1
vfio_user.c:1103:memory_region_add_cb: *NOTICE*: Failed to map SQID 1 0x3ffd8000-0x3ffdc000, will try again in next poll
This isn't the error case, because when the Guest memory hot add/remove from QEMU, vfio-user
target will stop and unmap all queue pairs and remap them again, so let's use a more friendly
log instead.
Also use a notice log when adding listener.
Change-Id: Iaa4dc29e02523b5e85ec716d200ec355f8a575ed
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6650
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add support in bdev_zone.h for getting the maximum zone append data
transfer size.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I61203e64d51601232c6578a090fa52975364c1f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6910
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The specification for Maximum Data Transfer Size (MDTS) says this field
should include the length of metadata, if metadata is interleaved with the
logical block data. However, some drives can support MDTS without counting
the interleaved metadata, so for this case SPDK will only use data length
without interleaved metadata length.
Change-Id: I29920a25885699e2689be043b87122367be0e416
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6813
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Although the value stored to 'rc' is used in the enclosing
expression, the value is never actually read from 'rc'
Fixes#1860
Change-Id: Id1001552e635968e373cad0fd27d7bda41d887cd
Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7082
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When param len > 0, param data must not be NULL.
So we add a comment to make it clearer.
Change-Id: I053c3e45ddb8fa23fb67ce899d32dadd8e286946
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Remove the polling group check. Because at this moment,
the qpair is not added into a polling group. If we do
not remove it, we will never enable zcopy feature for
I/O qpair.
And in sock implementmentation, we already fixed the zero copy
handling if a socket is not in a polling group. See
posix_sock_flush function. So we can fix this issue if we directly
remove this check.
Reported by: Aleksey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I969936c4b6c7f13cbfa4d6eb479010c53f3e384a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7056
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Add print to confirm how groups/queues/engines are being
programmed based on the init RPC used.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic9462c19c6899478a803433f90d9db9249dd5ca1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
DPDK has added APIs for registering externally allocated
memory regions. Use them instead of doing our own thing.
We have to postpone spdk_mem_unregister call in
memory_hotplug_cb() because SPDK mutex (g_spdk_mem_map_mutex)
and DPDK mutex (memory_hotplug_lock) may overlap
and cause deadlock when one thread is calling spdk_free()
(locks memory_hotplug_lock first and then tries to lock
g_spdk_mem_map_mutex) and another one is calling
vhost_session_mem_unregister() (locks g_spdk_mem_map_mutex
first and then tries to lock memory_hotplug_lock).
Change-Id: I547b4ffc3987ef088a1b659addba1456ad760a71
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3560
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Purpose: The default value of placement_id in spdk_sock
should be set to -1 in spdk_sock_connect_ext. If we still let it to 0 and call
sock_get_placement_id for the spdk socket used in the initiator side,
we will never get the correct placement_id when enable_placement_id configuration
is configured, because we will always get placement_id = 0
instead. And the same comments in spdk_sock_accept function.
And this patch also change the judgement of placement_id in other related places.
PS: Why we need to explictly set default placement_id = -1, because when use
"enable_placement_id=2" for the socket, placment_id=0 is a valid value.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9fcc3a1c6a5007c22d11da5aeed0022577652a76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6955
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
nvmf_subsystem_remove_listener RPC handler may fail to remove
the listener (e.g. it doesn't exist) but in eror case we
spdk_nvmf_transport_stop_listen_async and send an error
response. In a completion callback passed to
spdk_nvmf_transport_stop_listen_async we try to send a
response again but the response handler had already been
released and we dereference a NULL pointer.
The fix is to skip spdk_nvmf_transport_stop_listen_async
in error case and continue with the subsystem resuming.
Fixes github issue #1821
Change-Id: I8d96b943cca25d9f95d19e8ea600242f019e6b21
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6699
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Updating thread stat directly in spdk_thread_poll()
will cover the time spend in msg process in interrupt
mode.
Change-Id: I9b71790281f10fb784ef4fd4059c41438bbaabac
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6722
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
thread_interrupt_msg_process is registered to thread's
fd_group, so it will be called inside spdk_thread_poll.
Since spdk_thread_poll will set/restore tls_thread,
there is no need to set or restore it again here.
Change-Id: Ida10c736ef904ff975eeb42fd0cccad9fd8317cf
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6720
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Move get_rusage directly into reactor_run(), then both
poll mode and interrupt mode can check rusage info.
Change-Id: Id5926752cfb19c13cb969fbfbb35f643e5d49d9a
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6718
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
In interrupt mode, reactor spends its valid cpu cycles
to process registered thread interrupt function. So we
can count idle_tsc and busy_tsc in it, and update
reactor's last_tsc in it.
Change-Id: I65f4ae7d3b1e5c7c5c06937d6855f5d1b5c0349f
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6716
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
When we iterate qpairs that belong to a subsystem
and try to disconnect them, there is a chance that
some qpair can be disconnected on transport level,
e.g. the initiator may receive a disconnect for
the first qpair and disconnect others. That may lead
to a dead loop when we call spdk_nvmf_qpair_disconnect
with a callback, the callback is called immediatelly
and tries to disconnect the qpair again.
To solve this problem, move part of nvmf_poll_group_remove_subsystem
function to another function nvmf_poll_group_remove_subsystem_msg
which disconnects all qpair at once without any callback
and calls itself via thread_send_msg untill all qpairs are
disconnected.
Fixes github issue #1780
Change-Id: I1000cda73e6164917fc13f7f374366af90571b99
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6597
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The NVMe Zoned Namespace Command Set Specification has, in addition to a
Max Open Resources limit, a Max Active Resources limit.
An active resource is defined as zone being in zone state implicit open,
explicit open, or closed.
Create a function spdk_bdev_get_max_active_zones() in the generic SPDK
zone layer, so that this limit can be exposed to the user.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I6f61fc45e1dc38689dc54d5649c35fa9b91dbdfc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6908
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This change refactors the way nvmf_get_stats RPC works.
The RPC layer passes JSON write context to custom dump function defined within transport ops.
The RPC layer no longer needs to know the structure of transport poll group statictics.
Functions and structures used in the previous flow have been deprecated and will be removed.
JSON returned for RDMA transport should be the same as before this change.
Signed-off-by: Maciej Szulik <maciej.szulik@intel.com>
Change-Id: I03308c45be120793d316bf79814a1295afd9fb95
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6681
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Current update_core_mode is started from the next
core of the master core.
For futher's new scheduler, starting from master
core is required. This change won't impact current
schedulers' behavior.
Change-Id: Ibffd2c93a4288b5e87945ae523ccba88091c4031
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6757
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_nvmf_tgt_listen() is deprecated, so moved
the remaining instance to spdk_nvmf_tgt_listen_ext().
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I32b54e99f83fa10f1074f80aad82bb0608c9ae11
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6630
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This API has been deprecated since SPDK 20.07,
see commit (b2947f52).
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Idb45906c81ea5682c6a67def0265910266d861b5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6629
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
This patch refactor the pdu sending logic with the async manner,
then if the group contains the accel engine, we can use it.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I2d669c0a3255d7a8898441e406906add2f3a3556
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6759
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Purpose: To setup an accelerated function callback
for created spdk_nvme_poll_group. In this patch,
we just create the interface. The real usage of this
call back will be provided in the other patch.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I0d936aa4eba4dbfcc0137942156b9f2919eb5b78
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6758
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It is invalid to try to delete a NULL qpair, so do
not check for it in nvme_tcp_ctrlr_delete_io_qpair and
return an error when NULL. Just change it to an
assert instead. This makes it consistent with pcie
and rdma.
While here, add an assert in rdma as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic2f76deecb21b78749dac85e33fb1fa0d14a1239
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6917
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
spdk_bdev_io_complete_nvme_status() had set the bdev_io status to
NVME_ERROR even if it is aborted, i.e, sc is ABORTED_BY_REQUEST.
Fix it to ABORTED, and verify the fix by unit tests.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6b22547105a6d7986747053f93875854336959b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6884
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These three goto cases are using device->fd,
so put them in cleanup, it has no impact on
vfio_user_dev_setup failed.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I28028dda2977cf8158e703afa5b8af38c48f3d85
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6922
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I191ad5e3b153fb563256eba1aa695716f66db788
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6377
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This is an internal API used in several places. The call can fail, so
make sure it can report that correctly.
Change-Id: Iac0ed2c8299c9dd3d2556070278a2224c3807b7b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6640
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
It is possible that nbd pthread is created but not executed,
then spdk_nbd_stop is call before nbd_pthread's execution,
but nbd pthread starts to execute while nbd is totally stopped.
This patch can get spdk_stop_nbd aligned with nbd pthread.
Change-Id: I57cc92b94d36cd706616c9058134f716f0812892
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6278
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
This is better naming to represent their usage.
retry_poller and count can also be used to do
async nbd_stop procedure in the following patch.
Change-Id: Ie5a74e4add3f1a6c7257df00aded8b5d52a09955
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6277
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
spdk_vbdev_register() was deprecated in SPDK 19.04.
config_text field in spdk_bdev_module was deprecated in SPDK 20.10.
spdk_bdev_part_base_construct() was deprecated in SPDK 20.10.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib795ccdf61154c168032ccf8b81ea77e5e663851
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6628
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This statistic is incremented when we don't reap
anything from the CQ. Together with the total number
of polls it can be useful to estimate idle percentage.
Change-Id: I61b51d049b0bc506fb8a896e225187e46e75a564
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6295
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Leverage SO_INCOMING_CPU to get the CPU affinity of connections
(sockets). And allocate the connections to specific poll groups,
which aims to utilize cache locality.
From our test:
6 P4600 NVMe on target,target uses 8 cores, NIC irqs are bound to
these 8 cores, and initiator side uses 24 and 32 cores,
we can get 11%~17% randwrite performance boost for posix, and 8%~12%
for uring.
Change-Id: I011e0a21502c85adcccd4a14fbe9838b43f54976
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5748
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The shadow registers need to be zero when the qpair is
created. This happens automatically when a given qid
is used for the first time, since the page is allocated
with zmalloc. But if a qid is reused, we need to make
sure its shadow registers are cleared *before* we create
the qpair again with the same qid.
So clear the registers in nvme_pcie_ctrlr_delete_io_qpair,
just after the cq is deleted.
Fixes issue #1795.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I08c30d1ea248559a01b802cd132dd57199b491b5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6752
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If the iov_len is 0, it is OK for the iov_base to be
NULL.
Reported-by: Yi Ren <yunye.ry@alibaba-inc.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I45c9be68fc2975bf2abd91a9d77935ce516c5210
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6706
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Previously we only process Read/Write/Flush IO commands, we should
not block the DSM command in vfio-user layer if the backend block
device can support it.
Change-Id: Ia6b90397adcc36015f331f011a5bdf3e3d6562d8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6525
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
max_delay_us was deprecated in SPDK 19.04.
config_file was deprecated in SPDK 20.10.
master_core/pci_blacklist/pci_whitelist were deprecated in SPDK 21.01.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie8be4c347de58044a7c3d5b1329d96e47ce084b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6594
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously the callback parameter for this function is NULL, this will
cause segment fault, so pass the correct parameter here.
Fix#1817
Change-Id: Ie768b7bf4a72862d16a44742ab3032803d0939a2
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6690
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Community-CI: Mellanox Build Bot
Can not remove device in the remove event
callback as we can not unregister the remove
callback. So use the alarm_set to fix this issue.
Fixes#1809
Change-Id: Ib86bc4eeecc0fe2bc51538e28684d015405e8835
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6553
Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
nvmf_create_transport rpc parameter to configure the CQ size helps
if the user is aware of CQ size needed as iWARP doesn't support CQ resize.
Fixes issue #1747
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Change-Id: Ia9ba2b5f612993be27ebfa3455fb4fefd80ae738
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6495
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
iSCSI library had used goto label to consolidate iscsi_reject()
calls but calling iscsi_reject() in return statements will be simpler
and easier to read. This patch series focuses on Data-OUT PDU processing,
and so change goto label to function call in return statements for
Data-OUT PDU first.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5f30aff764820aab87233ea8cf22263611591a96
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6533
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The spec does not disallow TEXT PDUs with no data. In that
case, just return immediately from iscsi_parse_params.
This avoids a NULL pointer dereference with a TEXT PDU that has
no data, but CONTINUE flag is set.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2605293daf171633a45132d7b5532fdfc9128aff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6319
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
instead appending to output file (which occurs on each make execution)
sed is used to modify `Requires` section of the *.pc file
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I1a8cb1ec35bf583293c7174a413302191bbbd735
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6460
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These statistics can help to estimate efficiency of
Work Requests batching and show busy/idle polls ratio.
Send: the doorbell updates statistics for verbs
provider are incremented per each ibv_post_send call,
for mlx5_dv per each ibv_wr_complete call.
Recv: the doorbell updates statistics for both
providers are updated when either ibv_post_recv
or ibv_post_srq_recv functions are called.
Each qpair on initialization accepts an optional
pointer to shared statistics (nvmf/nvme poll groups).
If the pointer to statistics is not provided then
qpair allocates its own structure. That is done
to support cases when NVME RDMA initiator doesn't
use poll groups, so we can avoid checks that qpair
has statistics in IO path
Change-Id: I07dea603cb870b85ea23c42e8e2c4520b1c66252
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6293
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This change follows the large read which submits only subtasks, and
simplifies large write cases.
Associate the PDU which sends a SCSI Write PDU with immediate data
with both the primary task and the first secondary task. Then stop
incrementing reference count of the primary task twice.
As same as the last patch, copy the failure status directly among
the primary task and the secondary tasks because the primary task
is not submitted now. Then remove related data from struct
spdk_iscsi_task and related helper functions from conn.c.
Finally simplify unit tests for process_non_read_task_completion().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I54aa38c9b9fb7d7352da040dcdd8bcc1b1756a83
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6344
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For the following nvme controller statemachine states:
NVME_CTRLR_STATE_IDENTIFY_NS
NVME_CTRLR_STATE_IDENTIFY_ID_DESCS
NVME_CTRLR_STATE_IDENTIFY_NS_IOCS_SPECIFIC
The statemachine can either:
- Jump to succeeding state
- If active ns list is empty, jump directly to NVME_CTRLR_STATE_CONFIGURE_AER
- In the unlikely case if NVMe completion error, jump to NVME_CTRLR_STATE_ERROR
Simply this such that we either:
- Jump to succeeding state
- In the unlikely case if NVMe completion error, jump to NVME_CTRLR_STATE_ERROR
This will help to reduce the complexity of the nvme controller statemachine,
especially considering that there are new additional states
(NVME_CTRLR_STATE_IDENTIFY_NS_DIRECTIVE and
NVME_CTRLR_STATE_CONFIGURE_NS_STREAMS) currently on review that would continue
with the bad habit of having three possible jump states instead of just two.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I3242052b1108afcd8adbe6d0378b1358fef58ec8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6521
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
bus_pci depends on pci so it shall be listed before, otherwise it can
result in linking issue e.g.
/usr/bin/ld: /home/jkalwas/spdk/dpdk/build/lib/librte_bus_pci.a(bus_pci_pci_common.c.o): in function `pci_parse':
pci_common.c:(.text+0x6e): undefined reference to `rte_pci_addr_parse'
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Idff446df82c37844edc122d5171e8ffa684b296f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6404
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Purpose: To get the optimal group, we need the socket information.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I17b048a402fbf002307dd225f64b20a9f876d642
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
This patch is used to leverage accelerated engine to compute
the data digest in the following case:
1 DIF is not used.
2 The data to compute is aligned with size 4, i.e, %4 = 0.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I51fb6e3ab04391062b244cba6e249c8e20d3180f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6014
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch added the chained crc32 support API for both batched
and non batched mode usage. And also update the accel_perf
program in order to use the revised accelerated crc32 function.
For example, you can use the following command:
./build/examples/accel_perf -C 4 -q 128 -o 4096 -t 5 -w crc32c -y
In this command, "-C 4" means that caculate the chained
crc32 for an iov array.
(even if you do not have the accelerated DSA hardware)
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ifede26f9040980b5791da8e5afef41177eede9f6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6457
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
For a SGL using PRPs, there is always an alignment check of the start
address in the beginning of the loop. This is stored in start_valid.
If the start is indeed valid, we might fetch a new SGE,
and then perform a second alignment check on this new SGE.
However, this second alignment check is done unconditionally,
meaning that for the last SGE in a request, we check if the
same start address is aligned twice.
Only perform the second alignment check if we actually fetched
a new SGE.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I9df8038c650b0879f838d1d9d895e8dd7172840d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6493
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The statement causes this issue is:
assert(group_impl->num_removed_socks < MAX_EVENTS_PER_POLL);
The call trace is:
The previous solution is:
commitid with: e71e81b631
But with this solution, it will always add the sock
into the removed_socks list even if it is not under polling
context by sock_group_impl_poll_count. So it will exceed the size of
removed_socks array if sock_group_impl_poll_count function will not be
called. And we should not use a large array, because it is just a workaround,
it just hides the bug.
So our current solution is:
1 Remove the code in sock layer, i.e., rollback the commit
e71e81b631. This patch is
not the right fix. The sock->cb_fn's NULL pointer case is
caused by the cb_fn of write operation (if the
spdk_sock_group_remove_sock is inside the cb_fn). And it is not
caused by the epoll related cache issue described in commit
"e7181.." commit, but caused by the following situation:
(1)The socket's cb_fn is set to NULL which is caused by
spdk_sock_group_remove_sock by the socket itself
inside a call back function from a write operation.
(2) And the socket is already in the pending_recv list. It is
not caused by the epoll event issue, e.g., socket A changes Socket B's
cb_fn. By the way, A socket A should never remove a socket B from a polling group.
If it really does it, it should use spdk_thread_sendmsg to make sure
it happens in the next round.
2 Add the code check in each posix, uring implementation module.
If sock->cb_fn is NULL, we will not return the socket to the active socks list.
And this is enough to address the issue.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I79187f2f1301c819c46a5c3bdd84372f75534f2f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6472
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add a function to get the number of max active zones for a zoned
namespace.
The value inside the identify namespace struct is a 0's based value,
where 0xffffffff means unlimited.
If unlimited, the addition will overflow and return 0,
which is the intended value to represent unlimited for this API.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia09e3db157ca0afadbd3ca4032eedd7bcd88248c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6443
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add a function to get the number of max open zones for a zoned
namespace.
The value inside the identify namespace struct is a 0's based value,
where 0xffffffff means unlimited.
If unlimited, the addition will overflow and return 0,
which is the intended value to represent unlimited for this API.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I4223146bc1ddf90486892a0af5fe5ce006dc5fd3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6442
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch provides two new accelerated crc32c function interface.
And the next patch will be used to add the real support of chained crc32c feature.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I3f8dd55c3da636e29e5fb02fc229b51f05653cd6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6456
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When read is split, only secondary tasks are submitted. Hence we can
copy the failure status directly among secondary tasks and primary
task now.
Additionally, improve the comment in the source code to make us easier
to understand.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I857711dfaf90515231048f8c31c9273eac854d28
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6343
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This will make the current code simpler and make the following changes easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5a06f7e876fee03ed05d880525b594f92cadcdca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6410
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
data_buf was duplicated with data and was not necessary. Hence
remove it and use data instead in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I207047ce73d938f83e39f1454d44a9e4bba6b2f7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6407
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This patch follows the last cleanup.
Factor out reading PDU payload operation from iscsi_read_pdu() into a
helper function iscsi_pdu_payload_read(). This reduces the nesting
level, improves the readability, and make the following patches easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie5f51eedefe00f3b43a7b45dcf84be79f8df4e27
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6414
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
For the logic in ISCSI_PDU_RECV_STATE_AWAIT_PDU_PAYLOAD case,
this change will make it easier to read.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iccc582dd5c749c60b3d22b2b9b73fb8407e59b0d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6360
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
SPDK block devices can only be resized up when
it is open. So there is no need to pause the
associated namespace itself when resized - just
pausing the subsystem is enough.
Also modify the ns_hotplug_test to do null bdev
resizing - this will help test this resize code path.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3cb7b9de0892c296f2abf2280bed434d18ebe6b5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6467
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Provide a default stub definition for spdk_pci_device_claim/unclaim
for non-linux platforms, rather than just for FreeBSD.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: Ica45d967878582d9a58e37b088eba4bf0d94104e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6464
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
In _set_thread_name, use pthread_setname_np as the default for
platforms that are not Linux or FreeBSD; it's the most common
'non-portable' pthread extension used to set the thread name.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: Ia841166f0537cd1303eded15bc7ef1a9f03e3b6e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6465
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
bdev channel is used in nbd fini process, so it should
be released in the latter part of nbd_stop
Change-Id: I87edea63d2d91954cc41cdb71261485ae24c0d9f
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6280
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Similar issue was fixed in
813869d823
nvmf: Fix possible race condition when adding IO qpair
This patch fixes the same issue which occurs a bit later,
when a message is delivered to another thread. This issue
occurred on CI, callstack is the following:
00:11:46.296 #6 0x00007f2705199f05 in __ubsan_handle_type_mismatch_v1 () from /lib64/libubsan.so.1
00:11:46.296 No symbol table info available.
00:11:46.296 #7 0x00007f27067ace6f in ctrlr_add_qpair_and_update_rsp (qpair=0x221edc0, ctrlr=0x1dc4ea0, rsp=0x2242918) at ctrlr.c:230
00:11:46.296 __PRETTY_FUNCTION__ = "ctrlr_add_qpair_and_update_rsp"
00:11:46.296 __func__ = "ctrlr_add_qpair_and_update_rsp"
00:11:46.296 #8 0x00007f27067b1d0b in nvmf_ctrlr_add_io_qpair (ctx=0x2242540) at ctrlr.c:534
00:11:46.296 req = 0x2242540
00:11:46.296 rsp = 0x2242918
00:11:46.296 qpair = 0x221edc0
00:11:46.296 ctrlr = 0x1dc4ea0
00:11:46.296 __func__ = "nvmf_ctrlr_add_io_qpair"
00:11:46.296 #9 0x00007f27062553ce in msg_queue_run_batch (thread=0x1cff540, max_msgs=8) at thread.c:553
where line 230 in ctrlr.c was
assert(ctrlr->admin_qpair->group->thread == spdk_get_thread());
That means that admin qpair was disconnected from the poll
group and controller is in the process of destruction
Change-Id: I818ba56adda5ed3488a8df78483c0b6839758192
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6364
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We already have support for spdk_nvme_zns_zone_append(),
add support for spdk_nvme_zns_zone_appendv() (zone append with
NVME_PAYLOAD_TYPE_SGL).
_nvme_ns_cmd_rw() currently performs verification of the SGL,
if the parameter check_sgl is set. This parameter is set for all
calls with payload of type NVME_PAYLOAD_TYPE_SGL.
In order to be able to perform the same check_sgl verfication on
zone append vectors, we need to refactor _nvme_ns_cmd_rw() a bit.
Setting check_sgl ensures that _nvme_ns_cmd_split_request_sgl() or
_nvme_ns_cmd_split_request_prp() gets called.
These functions will split an oversized I/O into several different
requests. However, they also iterate the SGE entries, verifies that
the total payload size, total SGE entries is not too many, and that
buffers are properly aligned. A proper request will not get split.
For zone append, splitting a request into several is not allowed,
however, we still want the verification part to be done, such that
(e.g.) a non first/last SGE which is not page aligned, will cause
the whole request to be rejected.
(In the case of spdk_nvme_ns_cmd_write(), a non first/last SGE which
is not page aligned will instead cause the request to be split.)
An alternative would be to try to rip out the verification part from
_nvme_ns_cmd_split_request_sgl() and _nvme_ns_cmd_split_request_prp().
However, that is non-trivial, and would most likely end up with a lot
of duplicated code, which would easily get out of sync.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I2728acdcadeb70b1f0ed628704df19e75d14dcca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6248
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Use the new function spdk_nvme_zns_ns_get_zone_size_sectors() where
it is appropriate (in comparison to the existing
spdk_nvme_zns_ns_get_zone_size() variant).
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ic929ffbc5a1f4a16ba6719a985c05ae625caed46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6417
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Sometimes it is more optimal to get the zone size in number
of sectors, instead of in number of bytes.
Therefore, add a new spdk_nvme_zns_ns_get_zone_size_sectors()
function to get zone size in number of sectors.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I0fe67e00a3d74dd27acfc895ae97448d995b89a3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6416
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In order to make sure we do always response to the kernel module if
there are valid commands in the socket. If we do not see this,
we will see stuck request kernel info in nbd module. And the kernel
will print the timeout message of nbd module again and again.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I2ecc3e9c948231a712778f0126e2ecc6220e1d3c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6276
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
The current implementation treats HPDA/CPDA as the absolute offset
to the beginning of the PDU where the payload data starts. This is
incorrect. The HPDA/CPDA actually specify where the payload data
should start such that the starting location is a multiple of HPDA
(for C2H PDU) or CPDA (for H2C PDU or CapsuleCmd PDU).
The other issue fixed is that the current implementation calculates
padding only when header digest is enabled. This is also incorrect.
Signed-off-by: Wenhua Liu <liuw@vmware.com>
Change-Id: If7a3896a4c1d73f6d062bd3dbe6a912d31771180
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6256
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For correct behaviour, pthread_mutex should not be locked after it has
been destroyed.
g_bdev_mgr.mutex is statically initialized. It is destroyed in
bdev_mgr_unregister_cb, but not re-initialized in spdk_bdev_initialize.
Repeated calls to initialize/unregister occur during unit tests.
Remove the destroy from bdev_mgr_unregister_cb, which seems
the simplest way of resolving the issue.
The sequence: spdk_put_io_channel(), spdk_bdev_close(),
spdk_bdev_unregister() occurs during unit tests.
spdk_bdev_unregister() destroys internal.mutex which is then
locked by a call to bdev_channel_destroy() resulting from the
earlier spdk_put_io_channel(). Move the destroy and the free of
internal.qos into bdev_destroy_cb so that they don't occur until
all of the channels have been released. Remove the no longer
required bdev_fini.
Repeat calls to spdk_bdev_unregister that occur after an unregister has
completed will lock internal.mutex which has been destroyed by the
previous unregister. This occurs during unit tests. Defer locking
internal.mutex until after the internal.status has been checked for
SPDK_BDEV_STATUS_REMOVING. This is the only place where
internal.status is set to removing and g_bdev_mgr.mutex alone is
sufficient to ensure atomicity here.
Tested with a pthreads library that contains debugging code to
check the mutex state and a modified version of bdev_io_types_test
to call get_io_channel on a different thread.
Suggested-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: I81cc46a1b8a766700253829b19cc86c7f0eb79f2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6217
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Vhost is enabled by default, so rte_net was always included.
When disabled, rte_power failed as it depends on rte_ethdev and rte_net.
rte_vhost was only possible to enable on Linux, so there
is no conflict with adding it next to rte_power under this condition.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I2e183004d6457e404471740a0540dcb08aa738d8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6398
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Nick Connolly <nick.connolly@mayadata.io>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Identify application prints the PMR details if it is supported
Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
Change-Id: Iaba4c15e18e1402035b11a34b2defe8078855751
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6209
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In dpdk 19.11 version, RTE_VHOST_USER_ASYNC_COPY is not define.
After dpdk 20.08, we can use RTE_VHOST_USER_ASYNC_COPY.
Use version check to avoid this problem.
Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: Iaf9914e8380f3d54cded1e2f16af6a7dc3504f95
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6274
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
There is a special case when using 8-byte metadata + PI + PRACT
where no metadata is transferred to/from controller.
Since _nvme_ns_cmd_rw() already calculates the proper sector size
using _nvme_get_host_buffer_sector_size(), which takes PRACT into
account, change the sectors_per_max_io calculation to also take
PRACT into account.
This will avoid certain requests that don't need splitting getting
split.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I8d450d37c2458453701189f0e0eca4b8fe71173b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6247
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
After setting io timeout, host can avoid nbd io
stuck or kernel hang occasionally caused by nbd
stop or underlying bdev removal.
Change-Id: I4ba2a0af7ff7bed369cdaf86121f082136dc1a0b
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6191
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Aspects of bit fields are 'implementation defined'. On some platforms
alignment will occur if two adjacent fields are of different types. This
occurs in spdk_nvme_feat_async_event_configutation after the crit_warn
member which is effectively an int8_t, followed by an int16_t. There
isn't a generic way of changing the compiler's behaviour, so the best
options are:
- Change crit_warn to a uint32_t bit field and copy the value to/from
a spdk_nvme_critical_warning_state variable to use it. This requires
changes to code using the field.
- Adjust the structure definition to use smaller types to avoid the
problem. This preserves existing semantics, but the field order will
need to be reviewed if big-endian support is ever added (other places
in nvme_spec.h will need similar attention). A second reserved field
is required.
Use smaller types which seems the most straightforward option. Adjust
the use of the spdk_nvme_feat_async_event_configuration reserved fields
in lib/nvmf/ctrlr.c.
The new structure is binary compatible and the fields behave in the same
way, with the exception of an additional reserved field, so updating
CHANGELOG.md probably isn't necessary.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: I7d8163c84b4f410fc95a5b7064506ad7b4b62c6c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6340
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
__builtin_clzl takes an unsigned long argument which may be smaller
than uint64_t on some platforms. GCC silently ignores the mismatch,
returning the wrong answer at runtime. Use __builtin_clzll instead and
add static assertions to detect the issue.
Attribute 'target_clones' requires 'ifunc' support which only applies to
ELF targets. Add check for defined(__ELF__).
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: Iff76640b34223649de531250ad40471d829512c7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6317
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
insert_queue() will copy it to internal data structure, so that
before successful map we don't need to consider the error path.
Change-Id: Id7ea2ef73da7914ea430ea568e7981657016d3f7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6310
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The CQ is created first, so it's more reasonable to connect
the IO queue pair after creating the SQ.
Change-Id: I196c19a54a015310a3777d9bfca7db8735a4d5b2
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6309
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
add_qp() function is only called when creating NVMe SQ/CQ, so unpack
it into the caller to make the code more clear.
Change-Id: Id5cc1152b1684df980909b2f7d73ed2788c0efb2
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6308
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The clients(QEMU and NVMe vfio-user driver) use shared memory with
NVMf vfio-user target for zero-copied IO processing, when memory
region hotplug happens from clients, the backend target can get
notification via callbacks. Here we rename them to reflect the
action. For now NVMe vfio-user example applications use static
memory model and single memory segment.
Change-Id: Icecbe13883668dd8267019b5fe57d0fef1c68b81
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6307
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We can use the NVMf library ABORT implementation directly, so remove
it in vfio-user.
Change-Id: I0f204a869c53c6a6ce67ad900a64d5bb59ac2aab
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6306
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The group poll context is for queue pair state, so we don't need to
check controller state here, and for the disconnect case below, the ADMIN
queue pair will be removed from group poll.
Also add spdk_unlikely in the poll context.
Change-Id: I5ef32ef3cf41ad757a7cb167e1e1fa32c52a84d6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6227
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Zone append commands cannot be split.
_nvme_ns_cmd_rw() should never cause a NVME_PAYLOAD_TYPE_CONTIG
zone append request to be split.
This is currently true, but add an assert to make sure that
any refactoring to _nvme_ns_cmd_rw() does not break this promise.
Also add error handling, since release builds are built with
asserts disabled.
Follow-up patches will refactor _nvme_ns_cmd_rw().
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I5fd2440c4c9d6bd8d56f30354b208a9047b64729
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6246
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
This function allows applications to specify whether
they wish to allow probing a newly attached NVMe
PCIe SSD.
The env layer will only even probe devices that have
been allowed. By default, this is all devices, but
if the user has specified some list of
allowed PCI addresses (via spdk_env_opts pci_allowed)
then newly attached PCIe devices are implicitly not
allowed. This API allows applications to add
device addresses to the allowed list after the
application has started.
This API will be useful for use cases where multiple
SPDK processes are running on one server, and assignment
of PCIe SSDs to those processes are based on some function
of the SSD's PCIe address.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I919bc267f2ad9130ab5c875ff760a301028b047e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6184
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
The env layer has a pci_allowed list, which specifies
that only a subset of PCI devices may be attached
by the associated process.
But that doesn't cover PCI devices that are hot-inserted
after the application starts, which is common for
storage/NVMe.
So add a new spdk_pci_device_allow() API which allows
an application to add new devices to the allowed list.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7bd5ff428d84480d46bc236698daadd019b20b8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6183
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Fixes#1777.
When a qpair cannot be allocated because the transport connection fails,
the qpair was freed without unlinking it from the other structures.
This was leading to a segfault when attempting to create and free other
qpairs.
Also added a unit test to cover this case.
Change-Id: I74b78d1847f90117248b07203b43a11ff5cfa5d6
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6272
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
To allow SO_MINOR updates on LTS for the whole year it is supported,
the major version for all components needs to be increased.
This is to prevent scenario where two versions exists with matching
versions, but conflicting ABI.
Ex. Next SPDK release adds an API call increasing the minor version,
then LTS needs just a subset of those additions.
Increasing major so version after LTS, allows the quarterly releases
to update versions as needed. Yet allowing LTS to increase minor
version separately.
Disabled test for increasing SO version without ABI change, as
that is goal of this patch. This check shall be removed with SPDK 21.04
release.
This patch:
- increases SO_VER by 1 for all components
- resets SO_MINOR to 0 for all components
- removes suppressions for ABI tests
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I44d01154430a074103bd21c7084f44932e81fe72
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6167
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In dpdk 19.11, rte_kernel_driver is the old version, add version check before use the members.
Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: Ic1db37cc0760c7d03692fd2cdcbb6ff1e41f872d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6252
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Developer convenience - make this based on a specific version.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I228b8aff6e8957cad5e8c1fae5615b113e16cfb5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5950
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot