rdma_reg_msgs() was replaced by ibv_reg_mr() recently to support
persistent PD per RDMA device. The difference between rdma_dereg_mr()
and ibv_dereg_mr() is only return value and errno. For consistency,
replace rdma_dereg_mr() by ibv_dereg_mr().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I55e0743690e74f9510863bfa122a75d0632dce4e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13949
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Get a PD for the device from the PD pool managed by the RDMA provider
when creating a QP, and put the PD when destroying the PD.
By this change, PD is managed completely by the RDMA provider or the hooks.
nvme_rdma_ctrlr::pd was added long time ago but is not referenced
anywhere. Remove nvme_rdma_ctrlr::pd for cleanup and clarification.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: If8dc8ad011eed70149012128bd1b33f1a8b7b90b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13770
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
SPDK NVMe RDMA initiator used the default PD per RDMA device. Default PD
may be changed when all QPs for the RDMA device are destroyed and created
again.
For multipath, the RDMA zero copy feature require the PD per RDMA device
to be persistent when all QPs for the RDMA device are destroyed and
created again.
Maintain such persistent PDs in this patch.
Add two APIs, spdk_rdma_get_pd() and spdk_rdma_put_pd().
In each call of two APIs, synchronize RDMA device list with
rdma_get_devices().
Context may be deleted anytime by rdma-core. To avoid such deletion,
hold the returned array by rdma_get_devices().
RDMA device has PD, context, ref. count, and removed flag. If context
is missing in rdma_get_devices(), set the removed flag to true. Then,
if the ref count becomes zero, free the PD and the RDMA device.
The ref. count of a RDMA device is incremented when spdk_rdma_get_pd()
is called and decremented when spdk_rdma_put_pd() is called.
To simplify synchronization, sort the returned array by
rdma_get_devices().
To avoid resource leakage, add destructor function and free all PDs
and related data at termination.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I093cb4ec2c7d8432642edfbffa270797ccf3e715
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13769
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
An earlier commit added ctrlr_ready into struct
spdk_nvme_transport_ops. However, the major SO
version was not increased.
Fixes: 3dd0bc9e (nvme: Add transport controller ready step)
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id903634f9aaf5bdaa62fd30e92a4fb39a985b86f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13981
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Instead of passing each parameter to create a module, just have the user
make one and pass it in. This makes it easier to change the module
definition later.
Change-Id: I3a29f59432a6f0773129d7b210fbc011175b2252
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13909
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Masked by how accel_perf was doing decomp verificiation which is
changed in the next few patches and verifies these fixes.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Icb03fc169bf8d2f05396addaf1db56d6de1827d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13038
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Docs explaining how to use the RPC are in the next patch in the
series.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I7dab8fdbeb90cdfde8b3e916ed6d19930ad36e66
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12848
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
submit_inflight_desc() actually do some meaningful work, so when it really process tasks, the poller should return BUSY status.
Signed-off-by: YafeiWangAlice <yafei.wang@samsung.com>
Change-Id: I2103cea6d28e8b355dad4ddd603d917f10e44c08
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13486
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There may be for_each operations outstanding on an
io_device when it is unregistered. Currently we just
return when this happens, not unregistering the
device but also not notifying the caller that this
happened (since it returns void, and the callback
function doesn't have a status parameter either).
We could just push this responsibility to the caller,
to never unregister an io_device if it knows it has
outstanding for_each calls waiting to complete. But
I think we can simplify this a lot by just handling
this inside of the thread library. Mark that the
device is pending registration, and unregister it
(on the original requesting thread!) when the
for_each count gets back to zero. Also don't
allow any new for_each operations either.
Note this requires a bit of refactoring on the
thread unit tests, since it is now possible to
unregister a device with outstanding for_each
operations.
Fixes issue #2631.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I978f2d99a25e65d2b7d71ce9b1926a79a6c94263
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13890
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
If spdk_for_each_channel is called on a device that doesn't
exist, we need to set a non-zero status (-ENODEV in this
case) to the completion function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I898ad5ea499fb6087338b621b2befcadd6a05414
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13889
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This has been implicitly required before, and all
in-tree apps (except accel_perf) set it, so let's
explicitly require it. This name gets used for
things like the shm name for spdk trace event file.
While here, add the name for accel_perf.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I47a22466550d4b31bacafee58d30339b4f22f4b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13876
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
When decoupling a snapshot from its parent, we need to clear its parent.
So we should remove the xattr BLOB_SNAPSHOT. Modifying the xattrs of a blob
only works if its metadata are not in read-only mode.
By default, a snapshot is in read-only mode so this operation fails. When we
later want to delete the snapshot, we will see that it has a parent, so we will
try to remove the snapshot from its parent's clones list. This will cause a
crash.
The fix is to remove the BLOB_SNAPSHOT xattr only after setting the snapshot's
metadata in rw mode.
Signed-off-by: Alex Michon <amichon@kalrayinc.com>
Change-Id: I80efa6dd3dcb38b4c738ce2e97aa2ffc281cefa5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13723
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
DPDK may use this NULL pointer to access its member,
And then got segmentation fault. But we only need it
exit or report normal error.
To minimize the impact, and to prevent these going on,
we add check the error return for creating NULL mempool
in spdk_thread_lib_init_ext in spdk_reactors_init.
when error returning from spdk_thread_lib_init_ext in spdk_reactors_init.
It contains thread_lib_init which reports error for failed mempool.
Thus, codes will return and will not cause segmentation fault.
Fixes issue #2620.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I63369fdaeb231196e8f8daa826eb5b057ed829b8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13842
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Here should return -ENOMEM, and other places are
changed.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: Id81cd7485733e66d996b1501061a45f774f2b51a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13863
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Parameter `num_queues` for virtio_scsi PCI device means
maximum number of queues, it SHOULD include the `eventq`
and `controlq`, while for `vhost_user` RPC call, it means
the number of IO queues, so here we use it as `max_queues`
in lib/virtio and add the fixed number queues for `vhost_user`
SCSI device.
Also fix `vhost_fuzz` to get `num_queues` earlier than
negotiate the feature bits.
Change-Id: I41b3da5e4b4dc37127befd414226ea6eafcd9ad0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13791
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The `vhost_user` socket transport APIs are already in the
same source file, so just call the function directly.
No code logic changes in this commit.
Change-Id: If471b9b0166d43591fb8614e95a17473c964e87c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13789
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Similar with NVMe device driver, here `virtio` is a specification
abstraction library, `pci` and `vhost_user` are transports layer,
here we merge vhost_user.c and virtio_user.c into one new source
file `virtio_vhost_user.c` so that to make code more clear.
No logic change, just code movement in this commit.
Change-Id: I8e3e5c477e7c45e6eeebad240b8cc3c9476b86d1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13788
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
While running into this function, even the subsystem can't be
destroyed due to error subsystem state, it's better to continue
the execution.
Continue to fix#2590, QEMU is stuck for the failure case, and
nvmf target should process such error because it may support other
normal subsystems at the same time.
Change-Id: Ib05e24996378b52070d2b760519f476f9b2d7e76
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13839
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The RPC provides a list of initialized engine names along with
that engine's supported operations.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I59f9e5cb7aa51a6193f0bd2ec31e543a56c12f17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13745
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
In prep for upcoming patch that will provide an RPC to override
and automatic assignment of an op code to an engine.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I17d4b962fb376a77f97ce051a513679d0fba698e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12829
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
fix: If the first six characters of two scsi lun's name are the same,
such as aaaaaa0 and aaaaaa1, so do theirs naa identifier
Signed-off-by: Bin Yang <bin.yang@jaguarmicro.com>
Change-Id: I4e0541b372a0e20e95e0a24d62dd3d85b7abe230
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13824
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is another preparation to create and use ibv_context and pd.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Id594fa1ccb2daf535b1aaaef0a397bda2ec98578
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13710
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The following patches will create and use ibv_context and pd
explicitly instead of using default ibv_context and pd created
by rdmacm.
As a preparation, pass pd instead of cm_id to nvme_rdma_reg_mr().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: Denis Nagorny <denisn@nvidia.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ifdcd18ed363b8ba4a23a920bf3559237e38821c6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13599
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk in interrupt, reactor dosen't correctly handle exited threads,
causing vhost threads still in reactor's lw_threads list. The fix
will do cleanup thread when it's state becomes EXITED. Though it's
exposed in v22.05.x, but the master branch also has the problem.
We will do this as below:
(1) When thread's state becomes SPDK_THREAD_STATE_EXITED, reactor
process thread exits first.
(2) Then reactor do remove lw_thread and destroy it.
Fix issue: #2574
Signed-off-by: Apokleos <oliverliyn@gmail.com>
Change-Id: I3ac2681d70480563db3a0aee4aff61c2f272b140
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13706
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If Controller Fatal Status (CFS) bit is set, there's no point in waiting
for CSTS.RDY and the only way to move forward with the initialization is
to perform a controller reset.
This fixes issues with test/nvme/sw_hotplug.sh when running under qemu.
It seems that during that test, qemu marks the emulated NVMe drives as
fatal, so if we didn't check CSTS.CFS, the initialization would time
out.
Fixes#2201.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I97712debc80c3dd6199545d393c0f340f29d33b2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13820
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michal Berger <michal.berger@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Sometimes VM may get a kernel panic when starting, and SPDK CI will kill
`nvmf_tgt` after 60 seconds, and for this exception, SPDK will raise an
assertion when destroying the subsystem, while here, we remove this
assertion and print the error information.
CI will still mark this case as a failed case, then we can use this error
information to understand error subsystem state in vfio-user.
Fix issue #2590.
Change-Id: I20b16f9e96a566730eca2dd9ea165645bd9160bd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13773
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
this is a prework for further changes - with lock on generic layer
lock on specific transport (e.g. tcp, rdma) layer becomes optional
possibly it won't be required if some contract introduced on public
interfaces (to be considered)
- spdk_nvmf_poll_group_[create|destroy]
- spdk_nvmf_tgt_listen_ext, spdk_nvmf_tgt_stop_listen
- spdk_nvmf_get_optimal_poll_group
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ib132babf9e7022342129fe795991cdad834e7f53
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13665
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When there are not enought transport buffers for
multi SGL request in state NEED_BUFFER, WRs
received from the data_wr_pool are returned back
to the pool. However rdma_req->data.wr.next pointer
still points to the first WR from the pool. Usually
it doesn't cause any problems since rdma_req will
try to fill buffers again, but when qpair is being
destroyed, all requests are completed forcefully.
When the request is completed and data.wr.next
pointer is not NULL, we'll try to put already
released WRs into the pool one more time.
That corrupts the pool and leads to undefined
behavior.
Fixes#2541
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I238b92eec132d8d845330362af6f335421177454
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13760
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The function `nvme_ctrlr_init_ana_log_page` is exactly
same with `nvme_ctrlr_update_ana_log_page`, so remove it.
Change-Id: I1ad51635f47cf95cfa6de217e3b9144885c3b74e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13652
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
allow DSA device to async offload crc32 calculation in nvmf-TCP
This patch can use DSA to accelerate crc32 computation, making
the io performance of TCP paths using crc32 approach the io
performance of TCP paths that do not use crc32.
Using SLIST to minimize the performance drop. SLIST has less
operation compared to TAILQ.
Thinking about memory thrashing, we should use the same memory as
possible to receive new PDUs. So, insert newly freed PDU in to head
is better.
The performance drop is within 1% compared to the TCP path without
crc32.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I480eb8db25f0e730cb198ca5ec19dbe3b4d38440
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11708
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We have to process whole QoS queue on each QoS poll. It may contain
IOs that still have quota or not affected by QoS rules at all. If we
stop on the first queued IO, all IOs will be limited by the minimum
QoS rule even if they're not affected by this rule.
Here is an example and simple test. We have a NVMf target with Null
bdev and QoS configured with read bandwidth limited to 10 MB/s and
write bandwidth limited to 100 MB/s. First we start nvme_perf with
only write IOs and we see that reported bandwidth is 100 MB/s. Then we
start another instance of nvme_perf with only read IOs. We see that
reported read bandwidth is 10 MB/s but we also see that write
bandwidth also drops to 10 MB/s.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I1edf09d038e65f873deef19ecb0f4bf9725a5ca5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13767
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Original implementation creates pollers and CQs for all discovered
devices at poll group creation. Device (ibv_context) that has no
references, i.e. has no QPs, may be removed from the system and
ibv_context may be closed by rdma_cm. In this case we will have a CQ
that refers to closed ibv_context and it may crash in ibv_poll_cq.
With this patch pollers are created on demand when we create the first
QP for a device. When there are no more QPs on the poller, we destroy
the poller. This also helps to avoid polling CQs that don't have any
QPs attached.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I46dd2c8b9b2902168dba24e139c904f51bd1b101
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13692
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Both PCIE and VFIO-USER can use the same APIs to get IO queue
pair statistic data, so merge them here.
Change-Id: Iadf9ead2bd5abaf11d2ef5d1884acb67369f85bb
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13538
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
When a bdev is registered, it is examined by the bdev modules before the
bdev register even is notified.
Examination may be asychronous, e.g. when the bdev module has to perform
I/O on the new bdev.
This causes a race condition where the bdev might be destroyed while
examination is not finished. Then, once all modules have signaled that
examination is done, `bdev_register_finished` makes an invalid access to
the freed bdev pointer.
To fix this, defer the unregistration until the examine is completed by
opening a descriptor on the bdev.
Change-Id: I79a2faa96c1c893fc1cee645fbe31f689b03ea4a
Signed-off-by: Nathan Claudel <nclaudel@kalray.eu>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13630
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When a secondary process exit without deleting allocated IO
queue pair, then a new secondary process will do cleanup for
previous allocated queue pair, then segment fault will happen
due to `stat` inside IO queue pair data strucutre can't be
accessed in this cleanup process.
Fix issue #2565.
Change-Id: I01a037642683901941b5268ac20d17b78b6c6350
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13537
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This avoids conflict with public vhost_user.h header
file which can cause problems with abidiff.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia258b4621eda9f6855d46bbf67d8369a053a7116
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13732
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This avoids conflict with public vmd.h header which
can cause problems with abidiff.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2f00c07226dec273516868f5fa9d7aa384378308
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13731
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Avoids conflict with public tree.h that can cause
problems with abidiff.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ccf4c0198f7975d8ebbee57f50c52f9f2e96fc0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13730
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This avoids confusion with the public idxd.h
header file which causes problems with abidiff.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7910c93d9d95b99c82f4dfdba845e6804e1b6568
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13729
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When doing DIF insert and strip, we will reserve extra
buffer in block device layer to save DIF information,
so when attaching one device to Namespace, we will
check the value first so that the reserved buffer
size isn't smaller than metadata size.
Change-Id: Id9272886ce8a7c01271279686730af4e5b24f35a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12188
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
When `dif_insert_or_strip` is enabled, NVMf library will do
DIF insert and strip automatically, client isn't aware of
it, when `dif_insert_or_strip` is disabled, we will report
Namespace E2E Protection Capabilities to client, but we
don't process PRACT and PRCHK flags in NVMf library, so
here we don't report the capabilities to client and leave
the use of extended LBA buffer to users.
Change-Id: Ic610dc65fef210a7799c6ab693d89138b99e1193
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12165
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Some of the options in sock_impl_opts could be different for different
sockets (even if they're using the same impl). However, outside of a
few selected options (recv_buf_size, send_buf_size), there was no
interface to change them.
This change will allow users to change impl_opts on a per-socket basis
when creating a socket. Sockets created through accept() inherit
impl_opts from the listening socket.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I7628ae19def25cef6ffa62aa54bd34e446632579
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13661
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Now that spdk_sock has impl_opts, we no longer need to store a copy of
impl_opts.zerocopy_threshold in spdk_sock.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I96377e330351b1afb57811578acfadf05d53f49c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13660
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Also remove ftl_mngt_get_status() because it won't be necessary now.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Change-Id: I335831cb1c506379e9afeb0bf87f1f873033073d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13668
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
it is impl and used only in nvmf.c source file
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I1236f9ede28c5da313d118ce73e1da64381379c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13664
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
That is possible to get/set registers from any thread,
during regs processing we are polling admin qpair to
get a completion. At the same time, another thread
can also poll admin qpair and that can lead to
undefined behavior.
This patch fixes an issue when bdev_nvme is configured
with io_timeout. If remote target becomes unresponsive
(e.g. due to link down), IO timeout occurs and bdev_nvme
tries to get csts registers in timeout_cb. At the same
time another thread can process adminq, so we may have
2 simultaneous adminq polls. If admin qpair is disconnecting
at that time (RDMA transport) we may destroy resources
twice from different threads.
We don't see a problem with set_regs function but it
won't be redundant to lock mutex in set_regs as well.
Signed-off-by: Alexey Marchuk <alexeymar@nvidia.com>
Change-Id: I7ec3984d25d0249061005533d13b22315b44ddf2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13687
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We cannot count AERs as outstanding IO for purposes
of subsystem pause, because we cannot expect them
to be completed. Previously we would account for this
in nvmf_ctrlr_async_event_request() by decrementing
the counter, but this did not consider cases in the
calling function (nvmf_ctrlr_process_admin_cmd) where
an AER might complete with error before this function,
resulting in the counter getting stuck indefinitely
with a >0 value.
Rather than adding a decrement in all of those
error cases, do a single check at the beginning
of nvmf_ctrlr_process_admin_cmd, and remove the
one from nvmf_ctrlr_async_event_request.
Fixes issue #2215.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ica969f116d80dfba0168369ff2fba9a4a42fc076
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13678
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
This patch defines a new function, spdk_sock_readv_async(), which allows
the user to send a readv request and receive a callback once the
supplied buffer is filled with data from the socket. It works simiarly
to asynchronous writes, but there can only be a single outstanding read
request at a time.
For now, the interface isn't implemented and any calls will return
-ENOTSUP. Subsequent patches will add support for it in the uring
module and as well as emulation in the posix module.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I924e2cdade49ffa18be6390109dc7e65c2728087
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12170
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In NVMe/TCP target, the socket low water mark is set to
sizeof(struct spdk_nvme_tcp_common_pdu_hdr), which is 8 bytes.
In corner test, there might be 4 bytes data packet sent to
NVMe/TCP target, after that, if there is no more data sent to
the same socket, the 4 bytes won't be read by NVMe/TCP target
qpair thread. Because of this, there is a IO request didn't
complete in initiator. Then, if manual call the readv function to
read the 4 bytes for the pdu in target, the io request complete
normally in initiator. It seems like the pdu might be split,
and in the situation, the IO request will not complete until
new IO request reach.
After set low water mark in NVMe/TCP target to 1 byte, just
like iscsi target done, the issue disappear immediately.
Signed-off-by: BinYang0 <bin.yang@jaguarmicro.com>
Change-Id: I59d3d900f0b25632d786ef25ab096eabe43476bd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13633
Reviewed-by: <chuanwei.ji@jaguarmicro.com>
Reviewed-by: Qingmin Liu <qingmin.liu@jaguarmicro.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It is a nicer API to allow users to use an
md-related API such as spdk_bdev_read_blocks_with_md
passing md_buf as NULL to mean "don't read metadata".
This avoids the need for an if-statement in the users
code to check if the md buffer is NULL before deciding
which API needs to be called.
This basically requires two changes:
1) only check if the metadata is separate for the bdev
if the md_buf != NULL
2) do not fail if the buffer is specified but the
md buffer is not (we only need to fail the case where
the md buffer is specified but the data buffer is not)
Note that spdk_bdev_readv/writev_blocks_ext was already
allowing the metadata buffer to be NULL, but change
those functions too to match the others on how we check
if the data buffer isn't allocated.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I764cf49b9f573fccb19e73876a376fd231cc3580
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13612
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will allow the code calling this RPC to interpret the error and
check whether the transport already exists (-EEXIST) or some other error
occurred.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I8c4af84763ddba908c59ff881b09834a439186a8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13577
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
In the multi-process case, a process may call `spdk_nvme_ctrlr_free_io_qpair` on
a foreign I/O qpair (i.e. one that this process did not create) when that qpairs
process exits unexpectedly.
The variable `qpair->poll_group` isn't multi-process safe, we can't use it
in `spdk_nvme_ctrlr_free_io_qpair` and related transport poll group APIs.
Change-Id: Ic13a6a2c7d760477be5be5a56a45caa2b5518717
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13573
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
It will not find the h2c related reqs in the tailq now.
We can get it from tqpair->reqs directly.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I25f0900e875b054d7617450477e9719e7a59aa18
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12861
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Currently we initialize pending_bytes only in pre-copy state. This is
pointless since we don't generate any migration data at this state, so
if the vfio-user client reads migration data it will be garbage. Even
worse, we don't re-initialize pending_bytes in stop-and-copy state, so
if the vfio-user client reads the entire migration data in pre-copy state
then there will be nothing left to read in the stop-and-copy state,
which is where we actually produce the migration data. This results in
corruption of the controller's state (e.g. queues).
This patch ensures that migration data are available in the
stop-and-copy state, by setting pending_bytes accordingly only in that
state.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Change-Id: I0b215e64cd1f58f254e1079f06402d196f984099
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11718
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The read_data, write_data, and data_written migration callbacks assume
that the migration data are accessed in one go. Until this is fixed,
with this patch we ensure we don't ignore unsupported ranges.
Change-Id: I640415858b8c374ffc9e487cd20f5130e0be9305
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11717
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
All of the callers immediately put the req right
after the nvme_rdma_req_complete call, so just move
the put into that function instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic370cf689850924e0c902a6071af8b3a7ed58c0b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13527
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
This follows similar logic in the pcie and tcp
completion paths, including omitting error
messages when aborting aers by adding a print_on_error
parameter to the completion function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id558d0af2cdd705dfb60abb842bd567a0949ccce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13525
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
By default, the SPDK nvmf target reports vid==INTEL,
which results in the SPDK nvme driver trying to enable
Intel vendor-specific log page. Fix this by trying to
enable those log pages only for PCIE transport
controllers.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I78ebf365d4fa6295d1f610697266c3ead765988d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13524
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
We can only get to this code path if the controller
has vid==INTEL, so make that more clear by changing
the check to an assert.
Remove unit test that calls
nvme_ctrlr_construct_intel_support_log_page_list()
for a controller that is not VID==INTEL - this is
no longer valid.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3b58451bc95992bf641e7452f0ac4c2bac9fe31c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13523
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
This follows similar logic in the pcie completion
path, including omitting error messages when aborting
aers by adding a print_on_error parameter to the
completion function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I96df72280bb8fcbee3847fdc27f38e14a1bf3251
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13522
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
nvme_tcp_req_complete_safe caches values on
the request, so that we can free the request *before*
completing it. This allows the recently completed
req to get reused in full queue depth workloads, if
the callback function submits a new I/O.
So do this nvme_tcp_req_complete as well, to make
all of the completion paths identical. The paths
that were calling nvme_tcp_req_complete previously
are all non-fast-path, so the extra overhead is
not important.
This allows us to call nvme_tcp_req_complete from
nvme_tcp_req_complete_safe to reduce code duplication,
so do that in this patch as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I876cea5ea20aba8ccc57d179e63546a463a87b35
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13521
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
All callers of nvme_tcp_req_complete call
nvme_tcp_req_put immediately afterwards, so move
this call into nvme_tcp_req_complete.
This will help enable some improvements in later
patches.
Note that nvme_tcp_req_complete_safe has this same
functionality open coded right now, but that will
get changed in the next patch. It calls
nvme_tcp_req_put immediately after the TAILQ_REMOVE,
so do that in nvme_tcp_req_complete as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I368122bc49a7f0772e3011e5427e3c43618380eb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13520
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
nvme_qpair_abort_all_queued_reqs() aborts error injections, queued
requests, aborting queued requests, and outstanding requests. (Aborting
outstanding requests depends on transports.) However, it did not abort
queued aborts.
Include nvme_ctrlr_abort_queued_aborts() into
nvme_qpair_abort_all_queued_reqs() to do really the name of the
function indicates.
nvme_ctrlr_abort_queued_aborts() has been called in a few cases, but
we do not care duplication.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I19102cc6603a72ce5c398a7947cb4d606b692991
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12849
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
QEMU may exit due to some exceptions which mean the socket
connection may be disconnected at any time, so for asynchronous
callbacks especially the subsystem pause/resume callbacks, they
all run in asynchronous way, the controller pointer may become
invalid before the callbacks are called.
Fix#2530.
Change-Id: I6d73597d75761e28844e83bfee7f8a446d85fa49
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12831
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Fix issue: #2561
The issue here is that in the bdev_set_qd_sampling_period RPC
command, the QD sampling period has been set. Then later the
related Desc is closed and in the bdev_close() function the
QD sampling period is reset to 0.
A new QD desc is added as the QD sampling period update could
be handled properly.
Meanwhile, a new QD Poll In Progress flag is also added so as
to indicate there are ongoing events of QD sampling and the
Bdev unregister will be handled in the proper way.
Related test case and unit test also updated for this change.
Change-Id: Iac86c2c6447fe338c7480cf468897fc8f41f8741
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13016
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
The bdev_lvol_grow_lvstore will grow the lvstore size if the undering
bdev size is increased. It invokes spdk_bs_grow internally. The
spdk_bs_grow will extend the used_clusters bitmap. If there is no
enough space resereved for the used_clusters bitmap, the api will
fail. The reserved space was calculated according to the num_md_pages
at blobstore creating time.
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: If6e8c0794dbe4eaa7042acf5031de58138ce7bca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9730
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reserve space for used_cluster bitmap. The reserved space is calculated
according to the num_md_pages. The reserved space would be used when
the blobstore is extended in the future.
Add the num_md_pages_per_cluster_ratio parameter to the
bdev_lvol_create_lvstore API. Then calculate the num_md_pages
according to the num_md_pages_per_cluster_ratio and bdev total size, then
pass the num_md_pages to the blobstore.
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: I61a28a3c931227e0fd3e1ef6b145fc18a3657751
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9517
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
SPDK was previously incorrectly requesting log levels such as
LOG_NOTICE. Update libvfio-user so it is in fact supported, and check
that setting up the callback actually worked.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I41c2a8cf683868c3c2e40470f78e1af3dba29de4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12839
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Update libvfio-user such that the SGL access APIs can be used
concurrently. We are guaranteed that the guest memory remains mappable
now that the vfio-user transport has implemented quiescence.
This is currently only really useful (for a single controller) in poll
mode, but shouldn't break interrupt mode, as we still ensure all a
controller's queues are on the same poll group in that case.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I0988e731558e9bf63992026afc53abc66ec2a706
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12349
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In SPDK, declarations have the return type on the same line. Definitions
have the return type on a separate line. Astyle has an option for
enforcing this. Unfortunately, it seems to have two bugs:
1) It doesn't work correctly at all on C++ files.
2) It often fails on functions that return enums, or long type names
Deal with 1) by adjusting the check_format.sh script to only tell astyle
to fix return type line breaks for C files and not C++. Deal with 2) by
adding a few typedefs to work around the problem.
Change-Id: Idf28281466cab8411ce252d5f02ab384166790c6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13437
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
the underlying spdk_cpuset_copy() takes `const spdk_cpuset*` as the
`src` parameter. there is no need to take non-const spdk_cpuset*.
hence, in this change, let's relax the requirement of the pointer type.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Change-Id: I1f626c7fea45cf7250bf56b891bcba4a0f2a8917
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13443
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
We had not incremented ctrlr->outstanding_aborts when aborting a
request in the ctrlr->queued_aborts, and ctrlr->outstanding_aborts
became negative. Fix the bug in this patch. Additionally add assert
to check if ctrlr->outstanding_aborts is not negative.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I58090286f070ba854bdea87f0f8ecb7810890338
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13452
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
While we are quiesced, we're not allowed to access guest memory via the
SGL APIs. Refuse to process any commands unless we're in RUNNING state.
We need to synchronize with each poll group via a message before we can
call vfu_device_quiesced(), otherwise we could still be processing
commands via nvmf_vfio_user_sq_poll().
For interrupt mode, we then might miss processing commands in a
corresponding interrupt callback, so make sure we process them when we
return to RUNNING state.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ieae5a9ae8d9de722e0bdf4bb8d61e7e678159f1f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12912
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
An oversight meant that quiesce was in fact only pausing the admin
queue, and not ensuring no I/O was ongoing. Fix this by passing the
right flag to spdk_nvmf_subsystem_pause().
Change-Id: I930c616d1170ac0299339b04928da57f6a7489ab
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13441
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If the broadcast NSID is supplied, every namespace is paused.
Change-Id: I40cc3e04b5a75b731ab0c8946ed8146275cc8ee4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13394
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It's useful to add these APIs.
spdk_copy_iovs_to_buf and spdk_copy_buf_to_iovs.
It prepares that other ones can call these.
We don't need to define them in static state
repeatedly.
And add corresponding unit tests.
Change-Id: Ife40fec8d047a48af67b04e6c055e4932282abfb
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12075
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Profiling data showed the deference of the CQ head in cq_is_full() was a
significant contributor to the CPU cost of post_completion(). Use the
cached ->last_head value instead of a doorbell read every time.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ib8c92ce4fa79683950555d7b0c235449e457b844
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11848
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Fully asynchronous ctrlr detach (b6ecc3729) introduce a register
operation state machine that waits for operation to complete. When
controller failed to initialize, `nvme_ctrlr_fail` set qpair state to
`DISCONNECTED` immediately, causing qpair process completions to
never complete register operations therefore prevent async detach exit.
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Change-Id: I205c5157b8ea7b4535f98ff4052414310e421446
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12858
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
reactor_run() decides whether to start gather_metrics
based on non-zero scheduler period.
The default of 1 sec was set during initialization,
in scheduler_subsystem_init().
This resulted in unessecary operations each second,
even if only 'static' scheduler is used.
This patch moves setting default scheduling period to
respective schedulers.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I953aee271a959b6314c8e83434c922dba9638de4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9492
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
When calculating the bar_addr which is used to access SPARSE MMAP area, we should use the
(offset - region->mmaps[i].offset) as the increment to get the valid access address.
Signed-off-by: Jun Zeng <jun1.zeng@intel.com>
Change-Id: Ie5d0c63cf572847d15dc92f0995fddecf35f1cdc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13021
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
rte_power was added to DPDK long time ago,
but some of the DPDK packages do not include it.
For those cases just skip building components that depend on in.
This change still allows to use dynamic scheduler, since
the dpdk_governor usage is optional.
Fixes#2534
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ied88edc8d58aae07d1384c1c40203fc80b919d80
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12993
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
dpdk_governor and gscheduler use rte_power,
which is only available on Linux and when
DPDK env is used.
Rather than repeat those checks in each mk or Makefile,
added DPDK_POWER flag directly to DPDK env.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I438caad8d333a4df697a79aa45de2930cce71d23
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12992
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
rte_net is a dependency for both rte_vhost and rte_power.
Next patch will simplify the checks to include rte_power,
and keeping this depenency next to component that directly
depends on it will make it easier to understand.
Since DPDK_LIB_LIST is sorted by the end of the env.mk,
it shouldn't be a problem to include the rte_net twice.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If2bb2aa5d972148ca8143023657b0aec45306a08
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12991
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When running perf test, sometimes after CONNECT req's resp was
received and processed, the qpair still failed to change from state
CONNECTING to CONNECTED. For when it goes to nvme_fabric_qpair_connect_poll
-> nvme_wait_for_completion_robust_lock_timeout_poll to process the
CONNECT req's resp, the req may have not been finished in sock_check_zcopy,
although its resp has been received and processed, which means the
tcp_req->ordering.bits.send_ack is still 0 and the status->done still
is false. And after the req is completed in sock_check_zcopy, we need
to poll this qpair again to make the state enter CONNECTED.
And if icreq's resp received and processed before nvme_tcp_send_icreq_complete
is called by _sock_check_zcopy, the qpair will be stuck in CONNECTING
and it never proceed to send the CONNECT req. We also need to put it
in pgroup->needs_poll to fix it.
I can reproduce this bug with the following configuration.
target: 16NVMe SSD, running on 20 cores;
initiator: randread test using nvme perf with 32 cpu cores and
zerocopy enabled.
The error doesn't always occur. CONNECT failure is about 1 failure in
ten with the following log. And icreq failure is less frequent with
only target side's "keep alive timeout" log.
Error reported in initiator side:
Initialization complete. Launching workers.
[2022-05-23 14:51:07.286794] nvme_qpair.c: 760:spdk_nvme_qpair_process_completions:
*ERROR*: CQ transport error -6 (No such device or address) on qpair id 2
ERROR: unable to connect I/O qpair.
ERROR: init_ns_worker_ctx() failed
And target side shows:
Disconnecting host from subsystem nqn.2016-06.io.spdk:cnode2 due to keep alive timeout
Change-Id: Id72c2ffd615ab73c5fc67d36c3ff8b730cebcef7
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12975
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Patch below changed the struct spdk_nvmf_ctrlr_data by inserting
spdk_nvme_cdata_fuses. This affects large number of nvmf interfaces.
(cbfd581) nvmf: Add NVMe fused operations to spdk_nvmf_ctrlr_data
Unfortunately was missed due to lack of rebase after ABI update on
CI machines.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ifd06d0ddbefe9ea6c9715adae9881d4606e34b44
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/13013
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Kamil Godzwon <kamilx.godzwon@intel.com>
Reviewed-by: Michal Berger <michallinuxstuff@gmail.com>
Community-CI: Mellanox Build Bot
We already list the libraries with their explicit
pathnames, so the -rpath-link serves no purpose.
Our Makefile was actually specifying this option
without an = sign - i.e:
-Wl,-rpath-link /path/to/lib
On the submitter's system, this resulted in an error:
cc: Missing argument for -Wl,-rpath-link
I have no idea why no one has ever run into this
error, except for this one submitter. But removing
the -rpath-link is the right thing to do here, since it
is not needed - so do that rather than adding the =
sign and continuing to figure out differences in
-Wl option processing on these different systems..
Fixes issue #2540.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4f6176e55701a5dea5b10bba1ad621250cb5cb51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12984
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Add an option to stop nvmf transport advertising support for both the
compare command and the fused compare_and_write operation in vfio_user
transport.
Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com>
Change-Id: I3900218c0e9884f86a5c8698a030f8106b64f2f7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12919
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Compare command, when not supported natively by the underlying bdev
is emulated by the bdev layer.
Change nvmf ctrlr data to advertise compare command by default.
Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com>
Change-Id: I88646e6c1a7d7a2829be813ff0241661724bd127
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12918
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Fused compare_and_write operation is always advertised by the nvmf
transport.
Add the fuses structure to spdk_nvmf_ctrlr_data to make advertising
fused operation configurable.
Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com>
Change-Id: I73ee03dc8948f1d250cc0a8f0b8a3bde042a45e7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12917
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There are a few places we can replace existing license
text with SPDX license identifiers, that did not match
the auto-replacement script in the previous patch.
Make those replacements manually in this patch instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I258720c03bc2153d1c56a8adf6357f224b911c0b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12913
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Many open source projects have moved to using SPDX identifiers
to specify license information, reducing the amount of
boilerplate code in every source file. This patch replaces
the bulk of SPDK .c, .cpp and Makefiles with the BSD-3-Clause
identifier.
Almost all of these files share the exact same license text,
and this patch only modifies the files that contain the
most common license text. There can be slight variations
because the third clause contains company names - most say
"Intel Corporation", but there are instances for Nvidia,
Samsung, Eideticom and even "the copyright holder".
Used a bash script to automate replacement of the license text
with SPDX identifier which is checked into scripts/spdx.sh.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaa88ab5e92ea471691dc298cfe41ebfb5d169780
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12904
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: <qun.wan@intel.com>
Reflect that we are kicking the entire controller.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: If5723a5f485745ef0a2456942b6df1d54133815b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12665
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
This function is really about re-arming all SQs for a poll group;
refactor to reflect this.
This is necessary ground-work before we can support multiple reactors in
vfio_user.c in interrupt mode.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I170fae2076fc80e742926cf448973671ac9e3bd9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12664
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Play it safe and add the same memory barrier in
nvme_pcie_qpair_process_completions() as for ppc64.
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Change-Id: I7079b4769d30106387ef4549495a72b7fea6a77a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12879
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Prepare for the later patch, and make the later patch code clean
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I12b175c86a5245f38dc76fe2d3918ec4b30a475a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12830
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
If accel_tasks are used up, we should not directly return but give
an another chance to calc it directly.
Signed-off-by: MengjinWu <mengjin.wu@intel.com>
Change-Id: I983b65d7dfff0fea3974682e886d2dcf309cd2c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12841
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
For a C2HTermReq PDU, there's no associated tcp_req, so we need to check
it for NULL before dereferencing it.
Also, while here, moved some of the assignments to the declarations to
reduce the number of boilerplate lines.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Iac05ef0ba605e2f40d0026ad1b131c28d29f7314
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12845
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The timeout poller might still be registered when a qpair is destroyed
if we send C2HTermReq and then destroy the qpair before host terminates
the connection.
Fixes#2527
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I21acc147fdba3aaac66b0c6ed54e155195fe9816
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12844
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
We need to check that the given SQ is active (i.e. is currently mapped
into the process), so make the check the same as that in
poll_group_poll().
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ibd3babd7520f611f596f3bab15765fa13b4d6b99
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12663
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is better represented under the name vfio_user_ctrlr_intr().
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ic3fa0fe238fd8ce4930bfd3e34b9dbc1b935aa6e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12662
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There's a non-zero cost to looking up the CQ; only call this function in
the poll path if we need to.
While here, we'll streamline the ctrlr-level check.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I6bf123f759fcd856196f6613cb6c7d0219550136
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12660
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Rui Chang <rui.chang@arm.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This SGL type was missed in the original commit
that added the pretty printing.
Fixes: 4d9ab1e9a1 ("nvme: pretty print dptr")
Reported-by: Ramanjaneya Burugula <burugula@gmail.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibc655db4e65009071f39f55f691c94a094cea0bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12705
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Use the conventional huge-pages based spdk allocation scheme for the initiator
data-structures unconditionally.
Change-Id: I5baee7614e3ac9b5497b3d771dfddfbaa7fdf65b
Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12687
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: I7d3804a84851753992af4a3a37b60dc6de0d22cb
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12780
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: Ie50c7421f991ad0474edba0e0f339180f7afee00
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Note that without ISAL or IAA a call to compress/decompress
will fail.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id20a08f6e61b9a51fa4a1634a5314e6ca18fa504
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12310
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously an error would have been completed twice.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ief645fc30754433398531c50357876e92804e4b5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12789
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Provide an interface to allow the caller to provide a proprely
formatted descriptor.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I5c397761f556361040ec962d61169459150b6494
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12703
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: Ia09368e426a83274d9c7fc90ed8b0391f4d0b67c
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12774
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch adds virtio_blk abstraction for custom transports,
with the 'vhost_user_blk' first one being used.
Added spdk_virtio_blk_transport_ops describing the nessecary
callbacks to be implemented by each transport.
Please use SPDK_VIRTIO_BLK_TRANSPORT_REGISTER to register the transport.
Transports can use virtio_blk_process_request() to process the
incoming I/O from their queues.
virtio_blk_create_transport RPC was added to create one of the
registered transports, possibly with custom JSON arguments.
Added 'transport' argument to vhost_create_blk_controller RPC,
to specify which transport should create the controller.
By default the vhost_user_blk transport is used.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ic9d93a6e0f483796eb56b7174a678e41a6ea4808
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9540
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: I56dbaef56ff793e48441219e07dc6b02dda0b470
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12777
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: I33a497fb134320f13606b66ad55fc7b068d011d9
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12716
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: I477da05a42ca607fbad4d178aa541726197d7c83
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12775
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
And associated RPC to enable.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I06785bcd8b8957293ad41d13bab556fe62f29fd5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12765
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Accel module coming in next patch...
Add support for compress and decompress. The low level IDXD
library supports both DSA and IAA hardware. There are separate
modules for DSA and IAA.
accel_perf patch follows.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I55014122f6555f80985c11d49a54eddc5d51c337
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12292
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In prep for upcoming IAA additions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id89124a3c3d5b1bcfd4d805ff4ee84a2f64f8a4a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12767
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Misc internal IDXD changes needed to support the upcoming addition
of IAA.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Idb180088af545b174ed33a4f8ee113e58640477f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12764
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Intel Analytics Accelerator, this is the start of the patches to
add this support to accel_fw.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I7410710697d2947355181616b35cc8ab78bbddfe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11985
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In prep for upcoming addition of IAA.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I47c5880aac37da9a38d6af6e52a51cefbfec91b9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12762
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In prep for adding IAA support
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I7eed173f9f907aa1c010d12db87b8dc27cd7495b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12760
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Generic vhost-blk layer is responsible for opening the bdev
attached to the vhost controller.
This patch adds vhost_user_bdev_event_cb() that is called
for vhost_user backend. This function will be replaced with
a callback to particular virtio-blk transport.
Having this piped through to the transports, allows
to adjust their behavior upon bdev events.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id73f5131b6e57f0354e970d0bce92716ec69985b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12132
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There are configuration details that are needed to configure
the virtio device based on spdk_bdev properties.
Please see vhost_blk_get_config() for an example
of vhost_user retrieving properties of bdev such as size
or supported I/O type.
Rather than trying to anticipate every such property,
add vhost_blk_get_bdev() to allow usage of bdev API directly.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I757f96e2fb0861c97b07ce279a7c04c77a2ad11f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12373
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This requires handling vtophys entries that cross page boundaries.
Fixes#2316
Change-Id: I9e9aafc1612bc89375c783bcf91bd04ab523ab9e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12217
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
If compress driver doesn't support SGL input of output
then we need to copy user's buffers into reduce internal
buffers
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0c07243a5b668d0e0adcc153e5b573f59c26ab64
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12281
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reduce library allocates one big chunk of memory and
then splits it between requests. The problem is that
a chunk of memory assigned to a request may cross huge
page boundary and if compress driver doesn't support
SGL input of output, operation will be failed.
To avoid this problem, align buffer start on 2MiB
and check each chunk of memory if it crosses huge page
boundary.
Fixes issue #2454
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie730b8ba928f27a43bde1222b6c18d29b797575a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12249
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
ext_io_opts uses the size member to allow backwards
compatibility however currently we only check if it is
below or equal the current size of the opts struct and
that it is not 0. size is only used when we copy opts
because of split or push/pull.
This patch introduces size checks to allow safe access
to e.g. metadata and memory domain pointers of the user
provided opts pointer. The minimum size of the struct
passed is now the size of the initial version of
spdk_bdev_ext_io_opts. To not introduce additional
checks when opts are consumed by a bdev module we
now always copy if the size is smaller than the
current opts struct size.
When introducing new members to opts additional
checks might be needed if those are directly accessed
through the passed pointer or bdev_io->internal.ext_opts.
Change-Id: Ibd181a5840a3d5022018a9f61403df961ffd6e1d
Signed-off-by: Jonas Pfefferle <pepperjo@japf.ch>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12550
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Separate out SCSI and BLK vhost subsystems to later add
virtio_blk transport abstraction.
This allows for further changes to the vhost_blk, not
affecting vhost_scsi.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id1ecfeafeb936809a479a43c321e13f75cb3d5ad
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9539
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
A iterator function nvme_request_add_abort() covers not only a small
I/O request but also children of a large I/O.
However nvme_qpair_abort_queued_reqs_with_cbarg() did not check the
latter. check if cmd_cb_arg matches not only req->cb_arg but also
req->parent_cb_arg.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I015e29b0a8f58920b9a13081330a94f9dd976a45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12557
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Only 4 bytes or 8 bytes are valid numbers when to access NVMe
registers, add the check here.
Fix issue #2495.
Change-Id: I63b6e16a156f6eba17f397ec9d1a447e6a80b4da
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12643
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
For PCIe transport, we need to stop any activity of the controller
before deleting I/O qpair resource in a controller reset sequence.
However, we set I/O qpairs to failed before disabling a controller.
In the NVMe bdev module, this caused disconnected qpair callback to
delete I/O qpairs before disabling the controller.
Hence, change the code slightly to set I/O qpairs to failed only if
reset is synchronous to keep backward compatibility.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ica71aad0a1dabce45616dfdfff5f11b07131bbd1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12736
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The CSTS.SHN is changed only in shutting down the controller,
nvmf library already ensure that all the outstanding IOs will
be flushed before that, so we can remove this check here.
Change-Id: Ib93a256e986b7b2ec1da0fc7992feb3a02c1d657
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11674
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
After finishing migration in source VM, the subsystem is in
PAUSED state, the controller is dead for the source VM, we will
destroy the controller when disconnecting socket, but after that,
we should RESUME the subsystem so that it can be ready for the
next new client.
Fix issue #2363.
Change-Id: Icf0999b9085cebe8be4c8783e1a43bb13d4f7987
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11422
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The completion callback of `spdk_nvmf_subsystem_resume`
and `spdk_nvmf_subsystem_pause` can run in different
core other than the `vfu_ctx` core, this may lead to
race condition when changing controller's state. Here
we use a thread message to change it in the same thread
context.
Change-Id: I53d139adcca6ff72a3b91a2a931f1239f3271fa9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12558
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The following patches swaps the ordering of destrloying I/O qpairs
and disconnecting a controller for PCIe transport.
prepare_for_reset is a flag for PCIe transport.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3009de9fea089fc93ecf87adba42e85c9a77e715
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12582
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
As described in the previous patches, we need to delete all I/O
SQ/CQs before aborting trackers when disconnecting a controller.
The following patches reorder the operations. This patch changes
adminq disconnection to initiate a Controller Level Reset and
adminq completion processes it if ctrlr->is_disconnecting is true.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I64f06bae2ce8a9127124029fd042db0028198e3c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12560
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Make this a transport-level decision instead. TCP and RDMA do want to
abort, but PCIe cannot because these commands may still be receiving DMA
operations from the device.
Change-Id: I305acddc3819c903eb3217e8f710d4216d0b3931
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11509
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Previously, we did not do any Controller Level Reset when disconnecting
the admin qpair.
However, for PCIe transport, we need to stop any activity of the
controller, i.e., delete all I/O SQ and CQs before
nvme_transport_ctrlr_disconnect_qpair_done() calls
nvme_transport_qpair_abort_reqs() (i.e., nvme_pcie_qpair_abort_trackers()).
Otherwise, some corruption may occur because completed I/Os may still be
in progress on the NVMe device.
Not to change any public API, nvme_pcie_ctrlr_disconnect_qpair() is a
convenient place to initiate a Controller Level Reset because it is
called from spdk_nvme_ctrlr_disconnect(). Then
nvme_pcie_qpair_process_completions() can process it until completion.
However, necessary functions are not accessible from PCIe transport.
This patch adds two helper functions and guards us from some undesirable
behaviors because it was not assumed that nvme_ctrlr_process_init() is
called from the completion context and ends in the middle of transition.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3d986e94ba71b83beeff7e75cf92033b5fa6f075
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12559
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When a new cluster is added to a thin provisioned blob,
md_page is allocated to update extents in base dev
This memory allocation reduces perfromance, it can
take 250usec - 1 msec on ARM platform.
Since we may have only 1 outstainding cluster
allocation per io_channel, we can preallcoate md_page
on each channel and remove dynamic memory allocation.
With this change blob_write_extent_page() expects
that md_page is given by the caller. Sicne this function
is also used during snapshot deletion, this patch also
updates this process. Now we allocate a single page
and reuse it for each extent in the snapshot.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I815a4c8c69bd38d8eff4f45c088e5d05215b9e57
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12129
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
To fix issue: #2484
When unregistering the bdev, will send out the message
to each thread to abort all the IOs including IOs from
nomem_io queue, need_buf_small queue and need_buf_large queue.
The new SPDK_BDEV_STATUS_UNREGISTERING state is newly
added to indicate this unregister operation.
In this case, the bdev unregister operation becomes the
async operation as each thread will be sent the message
to abort the IOs and as the last step, it will unregister
the required bdev and associted io device.
On the other hand, the queued_resets will be handled
separately and not aborted in the bdev unregister.
New unit test cases are also added:
enomem_multi_bdev_unregister: to abort the IO from
nomem_io queue during the unregister operation
bdev_open_ext_unregister: to handle the events and
async operations from the unregister operation
Change-Id: Ib1663c0f71ffe87144869cb3a684e18eb956046b
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12573
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Instead of releasing the batch memory when the batch generates a
completion, instead do it via refcnt. This will allow us to later hold
onto batch memory longer if vectored transactions end up spanning a
batch.
Change-Id: I942d6aa5052029eb0951e51a046dd98943108b94
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12259
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If nbytes is not set, then the desination iovec sent to the underlying
driver has a length of 0.
Change-Id: Ia55f5ece942bd70f32bfdb3bcf02134ba98fca96
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12612
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
It simplifies code and removes cast of nvme_qpair
to rdma_qpair
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I363246cf9d8c9cbafd48b26facdb5cc37fdd8e67
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12701
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When qpair is attached to a poll group, disconnect
process is async - we are waiting for the DISCONNECTED
event from rdmacm to destroy rdma resources. However
the user (nvme_perf) can destroy qpair immediatelly,
so memory allocated for qpair is freed but rdma
resouces are still allocated. That means that we may
receive rdmacm event (DISCONNECTED) for the destroyed qpair,
that leads to use-after-free.
To fix this problem, add a check for internal qpair state
when qpair is destroyed, if disconnect is not finished, then
we forcefully destroy rdma resources.
Fixes issue #2515
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reported-by: Or Gerlitz <ogerlitz@nvidia.com>
Change-Id: I7bfa53c9f6fe6ed787323a8941f1f2db17ea0c20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12700
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Each spdk_vhost_dev_backend is local to either
SCSI or BLK backends, so its not possible to gauge which
backend is used by the vdev on generic vhost layer.
Added a `type` field with matching enums to differentiate
between the two. Later patches will check that field
in vhost.c.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I2a95961b9f9b5f070db7b22d44cf5114a24b1067
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12675
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Without this, it does not think there is a driver available.
Change-Id: I0b8b42374e0ed82abb22bf27e0b8907bb03c61f6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12641
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Currently, when you run vhost user target, no matter if the reactor is
busy or not, spdk_top always shows 100% busy. Fix this by real load
status.
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: I610a8c2f4e74f46bd56955d31284372c775507ed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12647
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
These function accept optional spdk_blob_ext_io_opts
structure. If this structure is provided by the user
then readv/writev_ext ops of base dev will be used
in data path
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I370dd43f8c56f5752f7a52d0780bcfe3e3ae2d9e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11371
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
When snapshot is created, the new blob is loaded and
examined for BLOB_SNAPSHOT xattr in blob_load_backing_dev
function. At this step there is no such xattr, so zeroes
back_bs_dev is created. Later snapshot inherits back_bs_dev
from original blob, so previously created back_bs_dev can
be lost.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I90cc9b02f56598d8c5c7fe00409f571fba0aa91a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11384
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Introduce spdk_blob_ext_io_opts structure which
is used in the new *_ext functions.
Zeroes dev is updated with implementation of
readv_ext which uses memory domains memzero
or regular memset().
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Id94542196eff999827bf00591fd43804256fccb4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11369
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
In the following patches, nvme_ctrlr_process_init() will be used to
disable the controller when disconnecting the admin qpair for PCIe
transport. In this case, we will have to exit nvme_ctrlr_process_init()
after CSTS.RDY is 0. However, spdk_nvme_ctrlr_reset() and
spdk_nvme_ctrlr_reconnect_poll_async() have to continue
nvme_ctrlr_process_init() until the controller becomes ready.
To differentiate stop and continue clearly, add a new state
NVME_CTRLR_STATE_DISABLED to enum nvme_ctrlr_state.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic0a5fb7114d4eeb1cefec28bc404184768fb0a96
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12613
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: I3b75eea83bd7d700d20a6189e8fb6d1f066dc9b4
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12603
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: I32dd9960bc397244d8e3d0a384fc8b67e907bf68
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12601
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Move check if opts needs copy to its own function.
Move check if opts is valid into its own function.
Signed-off-by: Jonas Pfefferle <pepperjo@japf.ch>
Change-Id: Ie006ddd0b642eb97aaa3ab13890800322dee7a42
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12571
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
vfio_user/host is the PCI abstraction over vfio-user transport, it's
client library. We will add a target library to emulate PCI devices
in next patch, so new lib/vfio_user contains two libraries, one
is for host, the other one is for target.
Change-Id: I9bb40043105525654360691d6db62e4958384e7f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12314
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Clarify via a variable name that we're dealing with the admin CQ
specifically.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I032f6b27e2d75bffb9d95481f177ce0c3655550c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12556
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
These were deprecated in 2019, it's time to remove
support for them now.
Change-Id: I6931e80c836b568dec8989dad2a7be4e112c42b4
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12577
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Fix ocf test script that was still using the
deprecated get_bdevs RPC name - change it to
bdev_get_bdevs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7f8caedc250b80503671a0236694181613f63860
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12553
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
These were deprecated in 2019, it's time to remove
support for them now.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2c9918ed0296f644b0728c5106c47d93e3c7ec30
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12552
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Constantly polling the socket degrades performance significantly.
Polling the socket at a much lower frequency, every 1ms, is good enough
for now.
fixes#2494
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Co-authored-by: John Levon <john.levon@nutanix.com>
Change-Id: I4a7d35c45ece863b9df756324c23f41736df49f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12494
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When using the SPDK nvme driver in multi-process mode,
and multiple processes detach from their devices
(i.e. at application exit) very close to each other,
there are cases where a process can have started its
own detach, and then get two remove notifications - one
from its own detach, and another from another process'
detach.
So we need to remove the assertion in pci_device_fini(),
if the removed flag has already been set.
Fixes#2456.
Tested using a modified version of the
nvme_multi_secondary() function in test/nvme/nvme.sh
that starts several additional perf applications.
The test consistently failed without this patch,
and passes every time with this patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I91e16985cdc4a463aaac2c45096bb967aab85560
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12454
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Piatek <pawelx.piatek@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We enable multiple engines by:
* getting rid of the globals that point to the one available HW
and one available SW engine
* adding a submit_tasks() entry point for the SW engine so that
it is treated like any other engine allowing us to just call
submit_tasks() to the assigned engine for the opcode instead of
checking what is supported
* changing the definition of engine capabilities from
"HW accelerated" to simply "supported"
* during init, use a global (g_engines_opc) that contains engines
and is indexed by opcode so we know what the best engine is for each
op code
* future patches will add RPC's to override engine priorities or
specifically assign an opcode(s) to an engine.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I9b9f3d5a2e499124aa7ccf71f0da83c8ee3dd9f9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11870
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
If we don't set the cid before failing the misordered
command, we use some other random cid, causing the
initiator to think the wrong command was completed.
Fixes#2481.
For this issue, the target was completing a
previously submitted AER, not the fuzzed fused
command. The initiator would then submit another
AER to replace the completed one, but the target
complained that the initiator sent too many AERs
since the target didn't really know it had completed
an AER so hadn't adjusted its num_aer count.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4bd66f147086b262d0e48b8399d237e5ed3c2651
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12452
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We see reports that Huawei SSDs can't handle hardware
SGL properly, it requires additional alignment, so add
a quirk here to force Huawei SSDs use PRP instead.
Fix#2489.
Change-Id: I20a57e754bc6ff8666d681191994818f2192decc
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12405
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: wanghailiang <hailiangx.e.wang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_bdev_get_acwu() is a 1-based number, so we need
to subtract 1 from it before assigning the value to
nsdata->nacwu.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I32708b28a35670cba6013a48b79389fa48226285
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12399
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This does not actually need a doubly linked list. Single is enough. That
frees up 4 more bytes in the op for other uses.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I8c2a30de175b42815afd0a3ba3c694aef2f35882
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12258
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
If we don't find a completion, break out of the loop at the top. This
removes a level of indentation but most importantly makes it very clear
that we only remove elements from ops_outstanding from the front of the
list.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I5e8784a5af5449c14ff7015bc8e6062e6aee6b4e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12257
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The partition bdev checks if the underlying device supports
the io type and sends the bdev_io directly down to the bdev.
This patch adds missing compare and compare&write io types
to the partition bdev.
Signed-off-by: Jonas Pfefferle <pepperjo@japf.ch>
Change-Id: Ice7e5c0332ce7e564bad2bb8d7f4bb1d535388c1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12390
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
The bdevio test app has some test cases verifying
that write zeroes commands are handled correctly,
but using knowledge of the ZERO_BUFFER_SIZE that
the bdev library uses for splitting larger write
zeroes commands. Instead of hardcoding that 1MB
value in bdevio.c, have bdevio.c use ZERO_BUFFER_SIZE
directly instead. But this requires moving
ZERO_BUFFER_SIZE into bdev_internal.h and having
bdevio.c include that file.
We do this instead of putting ZERO_BUFFER_SIZE in
the public API because we don't want users to
make any kind of dependencies on this value.
While here, also rename the tests that are using this
value, so that the test names don't include any reference
to the specific size of this bdev-internal zero buffer
size.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia29d92a706cb1f86b4c29374dc2a9beccf679208
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12383
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This patch adds process_virtio_blk_request() that will be called
by virtio_blk transports to process incoming requests.
Meanwhile vhost_user_blk_request_finish() will be replaced with
a callback to the virtio_blk transport to notify of the result.
blk_request_finish() should only be called as direct result of
process_virtio_blk_request(), usually from it.
Some error paths now call vhost_user_blk_request_finish() directly,
if vhost_user_process_blk_request() was not called yet.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0cce22f15b922fe45f30fb659c384b6e836def4c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
struct spdk_vhost_blk_task shall contain only fields
that are relevant to the generic virtio blk layer.
Moved to vhost_internal.h header to allow for broader
use.
In contrast to spdk_vhost_user_blk_task that is
specific to vhost_user transport.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I8438d3e6dbca816855f55ee998a632f50acde045
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12282
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Later in the series, the spdk_vhost_blk_task will not contain
vhost_user specific fields. As such the generic part of processing
I/O will not be able to refer to those.
References to req_idx are replaced with pointer to the task,
meanwhile bvsession reference when queueing I/O is removed.
Note that vhost_user fields are accessible in the final callback
for the I/O, and will be printed. See blk_request_finish().
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I100d60968146da778bd6bf4fbcf2a2694d3be6e6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12335
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously the blk_request_queue_io() and blk_request_resubmit()
relied on vdev and channel contained in task or bvsession structures.
In an effor to make the I/O processing for virtio blk not reliant
on vhost_user, this patch caches the vbdev and io channel submited
in process_blk_request().
Later in the series, vhost_user structures will be separated out from
the spdk_vhost_blk_task.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If1ea38a77af8fcfee12054f5857a6db2db2093c6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12334
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
ACWU is a 0's based value, and our intent is to
report that our target's ACWU is 1 block. This means
we should report ACWU as 0, not 1.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6ad0606be07fd38bc6c2e3a8e4bb78225b3dfadc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12385
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
The spdk_rmb() in nvmf_vfio_user_poll_group_poll() is unnecessary: we
already have a read barrier for SQ tail updates at the per-SQ level, so
this doesn't add anything.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I88cddd968f4a949640754526e19cb869d9fb31af
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12381
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There's no need to spdk_rmb() in nvmf_vfio_user_sq_poll() unless we
actually found the tail has advanced.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I778835c527409764c3db78459b2aa76420cc0105
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12378
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
When sending the first part of a fuse command, we set the
first_fused_submitted flag so that we don't ring the doorbell
immediately. When the second part is sent, we ring the doorbell for
both commands.
However, this doesn't work well when we use the option to delay ringing
the doorbell. We send both parts, then later when we try to ring the
doorbell, we don't because of the first_fused_submitted flag from the
first command.
Replace this mechanism by keeping track of the last submitted fuse.
Change-Id: Ia4ac9b3ce9c319ee4c7e42f86eadda93dac85fca
Signed-off-by: Alex Michon <amichon@kalrayinc.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12182
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
We need to keep track of the shadow doorbell buffer locations, and make
sure to re-initialize on resume.
Co-authored-by: Thanos Makatos <thanos.makatos@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: If3ba456fb35f6f6199e4ff14cec1aad96775f71a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12237
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
process_blk_request() accepted spdk_vhost_blk_session argument
which is specific for vhost_user. In preparation to making
this function usable outside of vhost_user, replace this field
with struct spdk_io_channel and spdk_vhost_dev.
For this purpose vhost_user_process_blk_request() was created to
translate from spdk_vhost_blk_task to the generic arguments.
to_blk_dev() was moved further up so it can be more commonly used
throughout the file.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie61a1ae2a615c4f1a95601e533b9eec51998cd07
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12333
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In the qemu virtualization environment, when the virtio driver of the
windows image is upgraded to virtio-win 0.1.185, and virtio uses
legacy mode. In the negotiation stage of vhost, qemu's host_features
will enable VIRTIO_F_ANY_LAYOUT, and guest_features will also enable
this feature, because qemu's feature is a superset, so the feature
will be passed to the spdk side through the set feature process,
which leads to "VHOST_CONFIG: Processing VHOST_USER_SET_FEATURES
failed. VHOST_CONFIG: vhost message handling failed.", so we enable
this bit for spdk vhost.
Signed-off-by: suhua <suhua1@kingsoft.com>
Signed-off-by: lizhaoxin <lizhaoxin1@kingsoft.com>
Change-Id: I27323bf5a03dce774c8a74cfb070ddd43be05534
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12300
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
There were a couple of places not using the standard formatting for qid
still.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: If96c3f6d762128b0f274e2c4e9eebf4e80e35139
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12346
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This patch adds an extra spdk_thread_send_msg() call to destroy a qpair
to make sure that it isn't freed from the context of a socket write
callback. Otherwise, spdk_sock_close() won't abort pending requests,
causing their completions to be exected after the qpair is freed.
Fixes#2471
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia510d5d754baccca1e444afdb10696ab9b58e28b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There is a race condition when a bdev is unregistered while reset is
submitted from the upper layer very frequently.
spdk_io_device_unregister() may fail because it is called while
spdk_for_each_channel() is processed.
spdk_io_device_unregister io_device bdev_Nvme0n1 (0x7f4be8053aa1)
has 1 for_each calls outstanding
To avoid this failure, defer calling spdk_io_device_unregister() until
reset completes if reset is in progress when unregistration is ready
to do, and then reset completion calls spdk_io_device_unregister()
later.
A bdev cannot be opened if it is already deleting. So we do not need
to hold mutex.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ida1681ba9f3096670ff62274b35bb3e4fd69398a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12222
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Now controller initialization with RDMA
transport is fully async
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I26e857740d3137d0b0e987facc81fc5f6ef81f2b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10756
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Transport specific qpair_abort_reqs() set SC to SC_ABORTED_SQ_DELETION.
However, nvme_qpair_abort_queued_reqs() set SC to SC_ABORTED_BY_REQUEST
even if its call is not requested by the upper layer.
Change nvme_qpair_abort_queued_reqs() to set SC to SC_ABORTED_SQ_DELETION
for consistency.
nvme_qpair_abort_queued_reqs() is used to abort queued requests that
were sent while adminq was connecting. SC_ABORTED_SQ_DELETION will not
be so bad even for the case.
This change is required for the NVMe bdev module to be resilient for I/O
error. The NVMe bdev module does not retry I/O if SC is
SC_ABORTED_BY_REQUEST.
SC is set to SC_INTERNAL_DEVICE_ERROR if a request is failed to submit
to qpair by a generic qpair layer. We can change it to
SC_ABORTED_SQ_DELETION as well but we keep this for now.
SC_INTERNAL_DEVICE_ERROR is also retriable for the NVMe bdev module.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I7d8d5e97b222fe9275afc4fed024c1654c9579a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12121
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
As per the NVMe specification, a host can identify two areas of guest
memory: one of which is used for the host-written doorbells, and one of
which contains event indexes. The host writes to the shadow doorbell
area, but also writes to the controller's BAR0 doorbell area if the
corresponding event index is crossed by the update. This avoids many
mmio exits in interrupt mode, where BAR0 doorbells are not directly
mapped into the guest VM, with greatly improved performance.
This isn't a useful feature in BAR0 doorbells are mapped into the VM, so
we explicitly disable support in that case.
NB: the Windows NVMe driver doesn't yet support this feature.
Although the specification says that the admin queues should also engage
in this behaviour, in practice, no VM does, so have to include some
hacks to account for this.
Co-authored-by: John Levon <john.levon@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I0646b234d31fbbf9a6b85572042c6cdaf8366659
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11492
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
probed
This can cause a mismatch of kernel vs user driver and isn't allowed.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I9c572ea1fa1da89d7b41e31ab4719eec719fb50a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10588
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The only place a batch can be created is by assigning it to the channel
now, so this isn't a mistake that can be made and the checks can all be
removed.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I915edb4f212c0751396554655ffe95ae3bb20cd6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11538
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This effectively means there is only ever a single batch being build at
a time, which simplifies a lot of the APIs.
Change-Id: Ifd66cd1ce6f6f0abe2011528dd862c5324213658
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11223
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This lets us use it more widely.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I9c67be19020677fab3eafe05c1e0f91c3d04611d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12307
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The value of ack_timeout is calculated according to
the formula 2^(transport_ack_timeout) msec.
Signed-off-by: zhangduan <zhangd28@chinatelecom.cn>
Change-Id: I5a938635d70693ddd405fa5907555bb745b4df0f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12215
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This reduces a lot of casting.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ibc04f422858642d0e20c9b020bb6c5d1b70256fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11534
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
When a qpair is disconnected, any outstanding zero-copy requests are
freed to release their buffers before the qpair gets destroyed.
However, if there is a PDU being sent to the host as part of this
request (e.g. C2HData/R2T), we need to wait until that write is done
before freeing the request to avoid freeing it twice.
Fixes#2445
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I2a6e82f26a4f011dfd18c55c821e9039de7e584a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12255
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This makes the flag indicate whether there's an outstanding PDU write
for a given request. Additionally, it reduces the number of places we
need to update this flag.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Id7e587f84955b096c46bfbf88d4dd222214d4a6a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12254
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This will make it possible to have some common handling in request's PDU
write completion.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Icaff38da0e47dd93327e3d8f09edd9fdba8f532e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12253
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When an request using zcopy is completed, it might have an unreleased
zcopy_bdev_io attached in three cases:
1) the request was a read,
2) the request was a failed write,
3) the qpair is being disconnected.
The last case was missing from the assertion.
Fixes#2425
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5cbeaa198a1fd878c98caf148a0bc47060e35bca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12263
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Now that IO qpairs can be created asynchronously, we need to make sure
that all the create IO CQ/SQ commands can be executed simultaneously.
It is pretty common to create multiple IO qpairs at the same time, e.g.
adding an NVMe bdev to an nvmf subsystem will create an IO qpair on each
poll group. In that case, if the number of cores exceed the size of the
admin queue (actually it can be even lower due to outstanding AERs), we
might run out nvme_requests on the admin queue.
The chosen minimum value for the admin queue size, 256, should be enough
to cover most cases.
Fixes#2465
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I55c59aef64f3fdb33f7b4824d3e9beb403602633
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12270
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The RDMA transport can disconnect qpair asynchronously now.
Previously, we tried to release the resource of the qpair after disconnected.
However it did not work because it was done when deleting the qpair.
The admin qpair was not deleted in a ctrlr reset sequence.
This patch tries to satisfy the same aim again but by a different way.
Previously, we released the resource of the qpair before starting
actual disconnection process. This patch release the resource of the
qpair after the qpair is actually disconnected.
The related patches are:
b9518a5540eb09178a59
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Id6a814895a35b1589b781a91744ef872b42aaa69
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11783
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The code to handle the lingering qpair when deleting it was really
complicated.
The RDMA transport can connect or disconnect qpair asynchronously.
Then we can include the code to handle the lingering qpair into the
code to disconnect qpair now.
If the disconnected qpair is still busy, defer completion of the
disconnection until qpair becomes idle.
If poll group is not used, we can complete disconnection immediately
because cq is already destroyed.
The related data and unit test cases are not necessary anymore.
So delete them in this patch.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ic8f81143fcad0714ac9b7db862313aa8094eeefb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Include delayed disconnect/connect retries with finite times into
the state machine of asynchronous qpair connnection.
We do not need to call back to the common transport layer but
we need to do the following, clear rqpair->cq before starting disconnection
if qpair uses poll group, and clear qpair->transport_failure_reason after
disconnected.
Additionally locate the new state STALE_CONN before INITIALIZING
because cq is not ready to use for admin qpair when the state is
STALE_CONN.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibc779a2b772be9506ffd8226d5f64d6d12102ff2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11690
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
TCP transport already does it but was not documented clearly.
RDMA and PCIe transports follow it and document it clearly.
Then we can check each qpair's state if
spdk_nvme_poll_group_process_completions() returns -ENXIO before
disconnected_qpair_cb() is called.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I2afe920cfd06c374251fccc1c205948fb498dd33
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11328
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add three states, INITIALIZING, EXITING, and EXITED to the rqpair
state.
Add async parameter to nvme_rdma_ctrlr_create_qpair() and set it
to opts->async_mode for I/O qpair and true for admin qpair.
Replace all nvme_rdma_process_event() calls by
nvme_rdma_process_event_start() calls.
nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to INITIALIZING
when starting to process CM events.
nvme_rdma_ctrlr_connect_qpair_poll() calls
nvme_rdma_process_event_poll() with ctrlr->ctrlr_lock if qpair is
not admin qpair.
nvme_rdma_ctrlr_disconnect_qpair() returns if qpair->async is true
or qpair->poll_group is not NULL before polling CM events, or polls
CM events until completion otherwise. Add comments to clarify why
we do like this.
nvme_rdma_poll_group_process_completions() does not process submission
for any qpair which is still connecting.
Change-Id: Ie04c3408785124f2919eaaba7b2bd68f8da452c9
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11442
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This was added in patch 07526d85, back in March 2018.
This was before DPDK supported dynamic hugepage allocations.
Presumably this flag was added to reduce the amount of
memory lost due to mempool buffers that would otherwise
span an IOVA boundary (mostly typical with IOMMU off and
we are relying on physical addresses).
Removing it simplifies any code in SPDK that uses
mempool buffers for DMA operations, since it doesn't have
to worry about splitting buffers that span an IOVA
boundary - DPDK has already done it for us.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I49f6c1407fad02acae7e07c9dd00cb0449bd3554
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12277
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Both blk_request_finish() and invalid_blk_request()
acomplished the same thing, with variation on handled
statuses and debug logs.
Consolidating those two into single function will help
later on when replacing completion of request processing
to single callback.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Iae7b93db01bfd98819b2bb8fad9e11afcdb3a459
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12196
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This patch adds vhost_blk_[get/put]_io_channel() to be used
by virtio_blk transports.
Functions related to vhost_user sessions were modified to
use it.
dummy_io_channel reference is managed at the vhost_blk
layer and as such continues to use the spdk_[get/put]_io_channel()
APIs. The description is updated to reflect its not specific to
vhost_user transport.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6644198da83bfa0210c167e203d3875e96f1e7ea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11101
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The vhost_user_config_json() will be replaced with callback
to virtio_blk transport.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6ea0ea38f505f0d354cd34ee5ab9cd3a858bd82e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9538
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_vhost_dev structure should only contain generic fields
that are to be used by either vhost, vhost_blk or vhost_scsi
layer.
The vhost_user backend can hold its properties in
spdk_vhost_user_dev, which is maintained within rte_vhost.
Both structures contain references back to each other.
The reference in spdk_vhost_dev is a void pointer to
allow future transports to keep the reference
to their own structures.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I68640c524426d885c20242146365ba242fa9df8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11813
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In a similar manner for what we do for other per IO data-structures of cmds,
cpls and bufs, use the conventional huge-pages based spdk allocation scheme
for RDMA requests and receives.
Change-Id: I4c2e86e928106e78c053f24915e2a9ce1a200c78
Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12273
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
To maximize cache locality, use lifo and not fifo when managing objects
which are used per IO such as the RDMA receive elements queue.
Change-Id: Id8917558acc1bec29943fcbae6afe6b072bde6ac
Reported-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12272
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Due to the same reason as transport_ack_timeout for
RDMA transport, TCP transport also needs ack timeout.
This timeout in msec will make TCP socket to wait for
ack util closes connection.
Signed-off-by: zhangduan <zhangd28@chinatelecom.cn>
Change-Id: I81c0089ac0d4afe4afdd2f2c7e5bff1790f59199
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12214
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In support of upcoming patches and to greatly simplify things,
the capabilites enum which held bit positions for each opcode
has been removed. Only the opcodes enum remains and thus only
opcodes are used throughout. For the capabiltiies bitmap a helper
function is added to convert from opcode to bit position. Right
now it is used in the IO path but in upcoming patches that goes away
and the conversion is only done at init time.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic4ad15b9f24ad3675a7bba4831f4e81de9b7bc70
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11949
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_for_each_bdev() can be applied simply to rpc_bdev_get_bdevs()
because the callback function rpc_dump_bdev_info() is synchronous.
spdk_for_each_bdev() cannot be applied simply to rpc_bdev_get_iostat().
The factored-out callback function _bdev_get_device_stat() is
asynchronous. Add desc pointer to struct spdk_bdev_io_stat and open
before and close after executing spdk_bdev_get_device_stat().
Replace spdk_bdev_get_by_name() by spdk_bdev_open_ext() when a bdev name
is specified.
spdk_bdev_register() checks if the name of each bdev is set and each bdev
is opened while collecting its stats now. Hence it is not possible that
spdk_bdev_get_name() returns NULL. Simplify rpc_bdev_get_iostat_cb()
based on this fact.
Furthermore, we want to fail the RPC for all failures. The callback
function to spdk_bdev_get_device_stat() is executed after stack
unwinding if successful. Defer starting RPC response until it is ensured
that all spdk_bdev_get_device_stat() calls will succeed or there is no bdev.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I7b036d6d707c49d19c8922a159b12b5b5ce7ca41
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12089
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
OpenSSL 3.0 deprecated the MD5_xxx APIs, so switch
the md5 code in the iscsi library to use the EVP
APIs recommended by OpenSSL instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic5e3cd6e30ebc8b027f0715434cc3be045f1b770
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12240
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Some SeaBIOS versions are not aligning virtio_blk_outhdr
on 8-byte boundary, causing ubsan failures. To be safe,
let's just make sure on our end that we only access a
properly aligned structure by copying this small (16-byte)
data structure to a local structure variable.
Fixes issue #2452.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iacad72c3a1759fb8dc5ba411272a34d93ef2a6fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12238
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Only FDs are used for passing them to another process,
we can unlink them after creation.
Here we only unlink the files created in vfio-user,
and there is still one file created via libvfio-user,
it will be fixed via
https://github.com/nutanix/libvfio-user/issues/660.
Partly fix issue #2449.
Change-Id: Ie27640e0cb85f44596e9d0ad5a2b67adf0419f5c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12195
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We can use lib/vfio-user API to setup BAR0 doorbells,
existing implementation is redundant.
Change-Id: Ib880d167c84c6b8482bf1a35559a34c939f6a02d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12211
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The config_bus_number is an offset within the config space reserved for
the devices behind the VMD, while bus_number refers to the actual bus
number assigned by VMD that depend on the VMCAP and VMCONFIG registers.
So, to access the mapped config space we have to use config_bus_number.
We didn't do that when resetting root ports', which could lead to
segfaults if these values were different, as we'd access unmapped
memory.
Fixes#2451
Change-Id: I4e7bbb81400462284014565099bec98f6171c8c9
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12208
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously spdk_vhost_dev_backend held callbacks
for vhost_blk and vhost_scsi functionality, along
with ones that are called by the vhost_user backend.
This patch separates out those callbacks into two
structures:
- spdk_vhost_dev_backend - to be implemented by vhost_blk
and vhost_scsi
- spdk_vhost_user_dev_backend - is only implemented by
vhost_user backend, callbacks for session managment
specific to that transport
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I348090df5dddeb2b1945b082b85aec53d03c781b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Previously g_packed_ring_recovery was set globally.
Setting that during controller creation, would affect
all previously created controllers.
This is now set on per-controller basis and only
enabled if packed ring feature is used.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Idcc7231471446c805154648ab835a6af78f6543c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12040
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Separate parsing generic rpc vhost params form device specific,
this solution allow to create various device which share
common parameters.
Change-Id: I50b1a89a8260fb1394880a750591e95539995288
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
None of the current DEBUGLOGs are in any kind of
performance critical code path. Making them INFOLOG
means that we can enable them even in release builds
to get additional information if needed.
(Note: planning to do this with other libraries and
modules as well.)
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I61765fab843a06c36ac1979151589e8f57fea76e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12209
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
No functional change; this just makes the poll code a little easier to
read.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: If6d1dcd940ed5b461856b535b1bf01c4efa8612a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12076
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
RDMA READs can cause cmds to be submitted to the target
layer in a different order than they were received
from the host. Normally this is fine, but not for
fused commands.
So track fused commands as they reach
nvmf_rdma_request_process(). If we find a pair of
sequential commands that don't have valid FUSED settings
(i.e. NONE/SECOND, FIRST/NONE, FIRST/FIRST), we mark
the requests as "fused_failed" and will later fail them
just before they would be normally sent to the target
layer.
When we do find a pair of valid fused commands (FIRST
followed by SECOND), we will wait until both are
READY_TO_EXECUTE, and then submit them to the target
layer consecutively.
This fixes issue #2428 for RDMA transport.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I01ebf90e17761499fb6601456811f442dc2a2950
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12018
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
To execute a callback function for each registered bdev or unclaimed
bdev, add new public APIs, spdk_for_each_bdev() and
spdk_for_each_bdev_leaf().
These functions are safe for race conditions by opening before and
closing after executing the provided callback function.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I59b702ffec7b4fc5e9779de5a3a75d44922b829b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12088
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Bdev open/close will be done for each bdev when traversing the bdev
list. This patch is a preparation.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I2486bd823953fe020ed6106844877e1cf49d8a0d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12126
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_io_device_unregister() send message to call its callback. So to
make the following patches easier, consolidate g_bdev_mgr.mutex unlocks
to the end of spdk_bdev_close().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ib3b5c72be06e764918da30d7aa9fbc2ccd33956e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12125
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Bdev open/close will be done for each bdev when traversing the bdev
list. This patch is a preparation.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I4e4fe6f1248176631a74c09585c931b21eb49d2b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12124
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
That will allow to pass ext opts to bdev layer
Since in ext API metadata is passed as part of ext IO opts
structure and ext opts can be NULL (e.g. upper layed used
regular API), in that case we use *blocks_with_md API
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I1bfb3fcb11bf42e100ecc7e4058087f12086db3a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11048
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
memory domains
If bdev doesn't support any memory domain then allocate
internal bounce buffer, pull data for write operation before
IO submission, push data to memory domain once IO completes
for read operation.
Update test tool, add simple pull/push functions
implementation.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie9b94463e6a818bcd606fbb898fb0d6e0b5d5027
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10069
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
To unregister a bdev more correctly, we had to call
spdk_bdev_open_ext(), spdk_bdev_desc_get_bdev(), spdk_bdev_unregister(),
and then spdk_bdev_close(). This was correct but complicated.
Hence add a new public API spdk_bdev_unregister_by_name() which does
the whole correct sequence of bdev unregistration.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I9068d4ac49dca944436e0ba587308fd356dfef75
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12065
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The process of matching qpair to poll group is split into
two distinct parts that occur on different threads.
See spdk_nvmf_tgt_new_qpair().
This results in a race condition for TCP between spdk_sock_map_lookup()
and spdk_sock_map_insert(), which are called in spdk_nvmf_get_optimal_poll_group()
and spdk_nvmf_poll_group_add() respectively.
Fixes#2113
This patch picks a hint from nvmf_tcp for next poll group,
which is then passed down to spdk_sock_map_lookup().
When matching placement_id exists, but does not have
a poll group assigned - the hint will be used.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I4abde2bc9c39225c9f5dd7c3654fa2639bb0a27f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10271
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The rdma buffer for stripping DIF metadata is added. CPU strips the DIF
metadata and copies it to the rdma buffer, improving the rdma write
bandwith. The network bandwidth during 4KB random read test is increased
from 79 Gbps to 99 Gbps, the IOPS is increased from 2075K to 2637K.
Fixes issue #2418
Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Change-Id: If1c31256f0390f31d396812fa33cd650bf52b336
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11861
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently the idxd driver requires VFIO so avoid unexpected errors
if someone tries without it (with UIO).
Temp workaround for issue #2316
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I430cd2193bc8dbd6939af7d0ca799832e7a73213
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11816
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
In the current behavior the iovcnt is lowered before sending it to
the next BDEV in the stack - however if the returned value is ENOMEM
(due to eg. not enough bdev requests in the pool), the request needs
to be returned to its original state, as it would be resubmitted with
skipped iov entries.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
Change-Id: I7240510a2ec04594b248f7347e86ac11ecfd26a0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11976
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When iovs are copied from bounce or to bounce, the bounce is usually
alloced from data_buf_pool for better performance, and is multi iovs
instead of a single buffer. Therefore, block-aligned bounce are
supported.
Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Change-Id: If56b21d9e46c73d4c956c227bec33ddd0ab9745b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11860
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Extract the code for DIF from nvmf_rdma_fill_wr_sgl() into
nvmf_rdma_fill_wr_sgl_with_dif().
Then clean up nvmf_rdma_request_fill_iovs() and
nvmf_rdma_request_fill_iovs_multi_sgl().
Additionally, this patch has a bug fix. nvmf_rdma_fill_wr_sgl_with_dif()
returned false if spdk_rdma_get_translation() failed. However, the
type of return value of nvmf_rdma_fill_wr_sgl_with_dif() is not bool
but int. The boolean false is 0 in integer. Hence in this case,
nvmf_rdma_fill_wr_sgl_with_dif() returned 0 even if it failed.
Change nvmf_rdma_fill_wr_sgl_with_dif() to return rc as is if
spdk_rdma_get_translation() returns non-zero rc.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I71cc186458bfe8863964ab68e2d014c495312cd3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11965
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: fengchunsong <fengchunsong@huawei.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
To help grep, use a standard sqid:%d style format for identifying queue
IDs.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ib82c81939f85f9beb333a4db10d006524522a1d9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11822
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
As a general utility function, move it up with the others.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I32881c01afd9819c889730d7c09163c95fbb827e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11790
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For each queue, track its doorbell location individually, rather than
needlessly recalculating it every time we look up the doorbell value.
This will also greatly simplify shadow doorbell support.
Co-authored-by: Andreas Economides <andreas.economides@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I6882d2f92ee2f2b2b90c54ee14e5f6b41ecca85d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11789
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Separate nvme_rdma_process_event() into nvme_rdma_process_event_start()
and nvme_rdma_process_event_poll().
Use nvme_rdma_process_event_start() and nvme_rdma_process_event_poll()
in nvme_rdma_process_event() to ensure compatibility.
Change-Id: Idc960fab2540efec612dcf22f156acabd2e2874e
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10594
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It will be convenient for the following patches to return
negated errno directly.
Change-Id: Ic80181b2ee449946dd60ad0c97a325fd48b92231
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10990
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
nvme_rdma_poll_events() gets the cm_channel pointer itself.
Before calling nvme_rdma_process_event(), we checks the
rctrlr is valid.
Hence we do not have to pass the cm_channel pointer to
nvme_rdma_process_event() via a parameter.
This simplifies the code and makes the following patches a little easier.
Change-Id: I03f095833469c5b64592264d63a592106d49e13b
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11167
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Replace nvme_fabric_qpair_connect() by nvme_fabric_qpair_connect_async()
and nvme_fabric_qpair_connect_poll().
The following is a detail.
Define state of the nvme_rdma_qpair and each rqpair holds it.
Initialize rqpair->state by INVALID at nvme_rdma_ctrlr_create_qpair().
_nvme_rdma_ctrlr_connect_qpair() sets rqpair->state to
FABRIC_CONNECT_SEND instead of calling nvme_fabric_qpair_connect().
Then the new function nvme_rdma_ctrlr_connect_qpair_poll() calls
nvme_fabric_qpair_connect_async() at FABRIC_CONNECT_SEND and
nvme_fabric_qpair_connect_poll() until it returns 0 at FABRIC_CONNECT_POLL.
nvme_rdma_qpair_process_completions() or
nvme_rdma_poll_group_process_completions() calls
nvme_rdma_ctrlr_connect_qpair_poll() if qpair->state is CONECTING.
This patter follows the TCP transport.
Change-Id: I411f4fa8071cb5ea27581f3820eba9b02c731e4c
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11334
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
At this time only spdk_sock_map_insert() allocates
entries for the sock_map. Next patch will introduce it
to lookup too. So now this part is refactored out
to separate function.
This patch should not introduce any functional change.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I46ac88aedebffe0cbc1f4616dc1fcfaf7f950b05
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10726
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_sock_map_insert() allows for allocating a sock_map
entry, without assigning any sock_group.
This is useful for cases where placement_id determined
by the component using spdk_sock_map_*. See PLACEMENT_MARK mode.
Placement_id's are allocated first, then an empty one is found
using spdk_sock_map_find_free().
Since the above is a valid use case, then entry in sock_map
can exist without a group assigned. spdk_sock_map_lookup() has
to handle such cases, rather than trigger an assert.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ia717c38fef5e71fe44471ea12f61a5548463f0cf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10725
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: wanghailiang <hailiangx.e.wang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
As we now only support a single WQ, there's no need for a teble of
them and no need to assert that the stride from WQ to WQ is the
same as the WQ struct size.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I205f36aae22070f532653726dd75249bbafbe3ef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12081
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This patch fixes the issue with custom nvme transport. It is possible
to register custom nvme transport with arbitrary name but it is not
usable because 'spdk_nvme_trid_populate_transport' call in probe
function will always set trstring to 'CUSTOM' and transport lookup
will fail.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I83fd24dd8732ac0a21e22435e0acff20ab0e7521
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9557
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
First in a series of patches that will enable multiple engines
to exist at once and choose the best one based on their priorities
and capabilites, the public API will no longer be needed.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia87b83aa2263745a94a822a160b6e97bb2e0dc19
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11948
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In the compression operation we may have SGL input
if user's buffer is fragmented or less than chunk_size.
If the backing device doesn't support SGL input then
we should copy user's buffers into decomp_buffer
(including paddings if any).
In the decompression operation, if the backing device
doesn't support SGL output, we use a single output buffer
which is pointing to decomp_buffer. Once the operation
completes, we should copy the result into user's buffers.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ic7fddd38374bb6898256633eacd192dbaf36541a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11970
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
R2Ts can cause cmds to be submitted to the target
layer in a different order than they were received
from the host. Normally this is fine, but not for
fused commands.
So track fused commands as they reach
nvmf_tcp_req_process(). If we find a pair of sequential
commands that don't have valid FUSED settings (i.e.
NONE/SECOND, FIRST/NONE, FIRST/FIRST), we mark the
requests as "fused_failed" and will later fail them
just before they would be normally sent to the target
layer.
When we do find a pair of valid fused commands (FIRST
followed by SECOND), we will wait until both are
READY_TO_EXECUTE, and then submit them to the target
layer consecutively.
This fixes issue #2428 for TCP transport.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8a9e13690ecb16429df68ae41b16b439a0913e4e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/12017
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
This opption allows the bdev_get_bdevs RPC to block until a bdev with
specified name appears. It can be useful, when a bdev is created
asynchronously and the exact moment at which it appears is not known.
For instance, with a discovery service, a bdev is created when a
namespace on a remote NVMeoF target is added, but it's not possible to
specify when that happens exactly.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6c1f974fba445376ca9d45aac2639202547410cc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11960
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Some targets report they support log page offset,
but then fail GET_LOG_PAGE commands that specify
a non-zero offset, or report the wrong number
of discovery entries when reading more than the
discovery log page header but not the entire log
page.
So just revert to reading the entire discovery log
page, after we've read the header to know how big the
log page will be. This means that when we read the
log page initially (without the individual entries),
we need to save off the genctr, since it will get
overwritten when we read the log page again. We can
just store this in the discovery context, and compare
it to the genctr that we read with the whole log page.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I34929253312fed9924db58904a051f3979283730
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11478
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
The driver always creates a single group containing all of the engines
and a single work queue.
Change-Id: I83f170f966abbd141304c49bd75ffe4608f5ad03
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11533
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
It is not used for anything.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I1d967b2d0e404756f7ceda98ddc4ee9017ec83f7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11489
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
It turns out that this can stay on the stack.
Change-Id: I961366307dae5ec7413a86271cd1dfb370b8f9f3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11488
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
initialization
We can make the structs do all of the offset math for us.
Change-Id: Ibe6d86c2abc58655c1354f1eb31091c95cfb283c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11487
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
These aren't ever accessed in the main I/O path, so we can read them in
whenever we need them and make the code a lot simpler.
Change-Id: Icfdbfe9f2d9db13f4d0d28b2b4103cd0c443bcf4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11485
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Assume any given WQ we are allocated can submit the full device allowed
queue depth.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I044e1f70031ea83ae722ed285b84c06b3e5efb27
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11486
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This is no longer needed.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I08c788ca0451e739804b568d613c1e52e071c61f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
In vfio-user transport, whenever one IO is completed, it will trigger
an interrupt to guest machine. This cost quite some overhead. This patch
adds an adaptive irq feature to reduce interrupt overhead and boost
performance.
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: I585be072231a934fa2e4fdf2439405de95151381
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11840
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
responder_resources parameter of rdma cm tells remote
side how many outstaing RDMA_READ of atomic operations
local side can handle.
Previously it was adjusted on queue depth but that was
not correct since these parameters do not depend on
each other. Even with qdepth=1 remote side may send
several RDMA_READ operations per 1 IO request.
With this change we report responder_resources
equal to the maximum supported by RDMA device.
Linux kernel nvme rdma driver reports this value
in the same way.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I77e5c2ead6269da44c32a75a9188429f50d32ae4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11698
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Fill is sent in as a uint8, we need to populate the full uint64
input with the uint8 pattern or we'll get a miscompare. This is
how idxd was doing it, instead of adding the same code to ioat just
move it up a layer.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia4aab1c6230f35ab88bb8a0e3b8e16dbd93007c7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11947
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Functions _start_writev_request and _write_decompress_done
have very similar decomp_iov configuration code, the only
difference is whether to fill gaps with zeroes or with
decompressed data. Move this code to a common function,
that will reduce amount of changes in the next patch
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I14509a17a12156b25ceab85a98b4dbd6fb11c732
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11969
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add NVME_CTRLR_ERRLOGs to nvme_ctrlr_process_init().
The main goal is to help with debugging #2201 issue.
Change-Id: I1ae6a9b30d6124dfe25eb7912402c37d476b0d4c
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10627
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Rename ->doorbells to ->bar0_doorbells. This will help avoid confusion
later with shadow doorbells.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Id432938cfeb3033e79dc6e1b491dad964227687a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11788
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
NVMe over PCIe Transport Spec 3.1.2:
The host should not read the doorbell registers.
Explicitly refuse these reads.
Co-authored-by: John Levon <john.levon@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ie64fd5ce7988ee86c612b3ef6046a57af467e266
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11787
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Refactor controller reset a little bit for cleaner code.
Co-authored-by: John Levon <john.levon@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I2b3323005d4e788ffe980d41c349702828886981
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11786
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If we're in interrupt mode and live migrating a guest, there is a window
where the I/O queues haven't been set up but the device is in running
state, during which the guest might write to a doorbell. This doorbell
write will go unnoticed. This patch ensures that we re-check the
doorbells after an I/O queue has been set up.
Fixes#2410
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I161d2a0e7ab3065022b2bccbe17f019640cceeba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11809
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Make sure cq->group is set even in interrupt mode.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I9f722917a8e3aebbd5d66648a3909f795897ec1a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11997
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Parameters submitted by bdev trace record may not be uint64_t
type (ex: bdev_io->type), which may bring outliers in trace parsing
results. The reason is that, in _spdk_trace_record() function,
va_list hold information about variable arguments, when passing
SPDK_TRACE_ARG_TYPE_INT arguments, it will call va_arg(vl, uint64_t)
to get value. However, if the submitted type is not uint64_t,
va_arg may return outlier.
To solve the problem, the parameter should be converted into
uint64_t before be delivered into trace record if it is not
uint64_t.
Signed-off-by: zhaoshushu.zss <zhaoshushu.zss@alibaba-inc.com>
Change-Id: Id710e39581f5e7b7551f3ff3a308f122f1344f1f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11691
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
pending_async_op_num is specific to vhost_user.
To prepare for virtio_blk transport abstraction,
checking for the pending operations is moved to vhost_user
specific code from vhost_blk.
Ideally the vhost_user_dev_unregister() should be the only
function that checks this field, but due to
order of calls in vhost_scsi it needs to remain there separetly.
This should not pose any issue, as vhost_scsi is still dependent
directly on vhost_user.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I53dde73766dcf8596715560f6909a4679f8da930
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
That is done to correctly handle metadata pointer which is part
of ext_opts structure. It will also be used by the next patch to
remove memory_domain pointer if request which uses local buffers
is split
Force the user to set correct ext_opts size, update API functions
description.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I77517d70df34a998d718cc6474fb4c538a42918f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11349
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Bdev modules must not access internal bdev_io
structure, so add a new pointer in a public
section. Pointer in internal section will be
used in next patch
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ib631563015b3e5fa9300d22b7ae59d8db43c8275
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10421
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch is a preparation for enabling of memory domains
pull/psuh functionality. Since memory domains API is
asynchronous, this patch makes asynchronous operations
with bounce buffers.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ieb1f2a0c151149af58cfd7607dbde4c76c3c288d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10420
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Here were some cases for file->name using strdup
missing check for no memory, but some had.
So add them.
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Change-Id: I91feeea3711f127135aedf37e53624811a4ab5e8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11989
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
With these we can write a simple bpftrace script to identify work being done,
and in particular what woke us up from sleeping.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I8997d847625ee4558092dbd753e6fc1b17beca92
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9424
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
SPDK has settled on what the optimal DSA configuration is, so let's
always use it.
Change-Id: I24b9b717709d553789285198b1aa391f4d7f0445
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11532
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The names on these were changed.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib75a60342c08f72dad39635a9244421c1cca5485
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11793
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The feature will be redesigned and restored in the following patches.
For the NVMe bdev module, it can reconnect by itself without relying
on the feature.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I2d9c0437f7ad8412ad8cf40d11e574723b735bee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11440
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For nvme_rdma_qpair_process_completions(), consolidate the operations to
call nvme_rdma_fail_qpair() and return -ENXIO into a single place.
Besides, shorten pointer references for
nvme_rdma_qpair_process_completions() and
nvme_rdma_poll_group_process_completions().
These will make the following patches a little easier.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iaf72cfca0b5b3ba223d86e267da8069d43a15292
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11439
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add a new flag is_disconnecting to struct spdk_nvme_ctrlr.
Separate calling nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done()
by using the flag is_disconnecting.
Additionally, change nvme_ctrlr_fail() to skip setting ctrlr->is_failed
to true if ctrlr->is_disconnecting is true.
Change-Id: Ie2c74ba41f120662a30f6198751d07005d23abcf
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11000
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
nvme_transport_ctrlr_connect_qpair() calls
nvme_transport_ctrlr_disconnect_qpair() if failed.
If async qpair disconnect is supported, even when connect qpair failed,
nvme_transport_ctrlr_connect_qpair() may complete asynchronously later.
The cases that qpair->async is set to true are I/O qpair for the NVMe
bdev module and admin qpair.
example/nvme/perf and example/nvme/reconnect use I/O qpair but both
set qpair->async to false.
For the NVMe bdev module, I/O qpair is connected when creating I/O
channel or resetting ctrlr. If spdk_nvme_ctrlr_connect_io_qpair()
returns 0 for a I/O qpair, the qpair is in a poll group and is polled
by spdk_nvme_poll_group_process_completions() and a disconnected
callback is called to the qpair. Hence we do not need to add additional
polling for I/O qpair in the NVMe bdev module.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6e0aadcfd98e5cb77b362ef1a79e0eca2985f36e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11112
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If poll group is not used and if qpair is disconnecting,
spdk_nvme_qpair_process_completions() has to poll qpair until it is
actually disconnected even if ctrlr is failed.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6da84f1e35780d21480fbe6f07e76af3048a777b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11018
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Serparate spdk_nvme_ctrlr_disconnect() into nvme_ctrlr_disconnect()
and nvme_ctrlr_disconnect_done() to call nvme_ctrlr_disconnect_done()
after adminq is actually disconnected when disconnecting adminq
asynchronously.
The following patches will add a new flag is_disconnecting to struct
spdk_nvme_ctrlr and prevent us from setting ctrlr->is_failed to true
between nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done().
By this patch, nvme_ctrlr_disconnect() and nvme_ctrlr_disconnect_done()
are executed in the same context. So it is not possible to set
ctrlr->is_failed to true between nvme_ctrlr_disconnect() and
nvme_ctrlr_disconnect_done().
Hence nvme_ctrlr_disconnect_done() does not have to clear ctrlr->is_failed
again.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I18b5b68f37e27b54782691823edae9738c26faa1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10999
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change spdk_nvme_ctrlr_reset() to use spdk_nvme_ctrlr_disconnect(),
spdk_nvme_ctrlr_reconnect_async(), and
spdk_nvme_ctrlr_reconnect_poll_async().
Then remove the deprecated spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async().
These changes simplify the following patches to make
spdk_nvme_ctrlr_disconnect() asynchronous.
Change-Id: Ia71e8e0ad5b2dff42b7423634f66de47863926e2
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10913
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a preparation to make nvme_transport_ctrlr_disconnect_qpair()
asynchronous.
For nvme_transport_ctrlr_disconnect_qpair(), factor out operations after
returning from transport's specific ctrlr_disconnect_qpair() into a helper
function nvme_transport_ctrlr_disconnect_qpair_done().
Then move nvme_transport_ctrlr_disconnect_qpair_done() into the end of
the transport specific ctrlr_disconnect_qpair().
Additionally remove the operation to overwrite the qpair state to
DISCONNECTED from nvme_transport_connect_qpair_fail() because
this is duplicated and nvme_transport_ctrlr_disconnect_qpair() is responsible
to make the qpair disconnected even after it completes asynchronously.
Change-Id: I9c8faa7039d306d3e31a8f51826755ce8840a8aa
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10851
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We were accidentally reading the spdk_nvme_status from the CPL; on one
test, this was contributing to 2% of cache misses.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I7c967690458a183799f8d835360800d3094c3131
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11849
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Trivial chore, cleanup multiple branches for subsystem->flags.ana_reporting
that can be collapsed wtihin spdk_nvmf_ctrlr_identify_ctrlr, which makes the
code easier to read.
Signed-off-by: Jon Kohler <jon@nutanix.com>
Change-Id: Icb2064f078d730d1f28f299582dc46254c7f6df3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11951
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When rq is used, rnr_retry_count is not set. As a result, the initiator
cannot connect to the target after some error. Set the rnr_retry_count
to avoid infinite retries.
Fix issue #2429
Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Change-Id: Iab03ea5ff10f5a9759c41f26589731aa92ef2f56
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11946
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Make use of code implemented in previous patches in the series
to get and set dynamic scheduler values.
Modifiy app.py and rpc.py to accomodate new changes and allow
user to specify scheduler parameters in the RPC calls.
Change-Id: I6173aefbf1d774b91b80ee5bce67eea80a2ab23d
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11449
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Creates a JSON on scheduler side to return after .get_opts is called
and parses a JSON on .set_opts call.
The JSON passed to dynamic scheduler on .set_stats is a copy of a
pointer already available during RPC framework_set_scheduler call.
Getting and setting scheduler stats via RPC calls is going to be
implemented in the next patch in this series.
Change-Id: I62880a71066a140c74336a5725e7b10952008e5c
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11448
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_nvme_poll_group has followed spdk_nvme_qpair about how to
process I/O qpair deletion inside of a completion context.
spdk_nvme_qpair_process_completions() accesses qpair after
returning from nvme_transport_qpair_process_completions().
So this is reasonable.
On the other hand, if spdk_nvme_poll_group_process_completions()
can execute spdk_nvme_ctrlr_free_io_qpair() inside of a completion
context, the target qpair is ensured to be deleted after returning
from spdk_nvme_ctrlr_free_io_qpair(). Then the target qpair is
not accessed anymore in spdk_nvme_poll_group_process_completions().
Remove two variables, in_completion_context and num_qpairs_to_delete,
of spdk_nvme_transport_poll_group and the related code.
This change is really necessary to support the following case.
In the NVMe bdev module, a nvme_qpair has a qpair and a poll_group
channel. disconnected_qpair_cb calls spdk_nvme_ctrlr_free_io_qpair()
for the qpair and spdk_put_io_channel() to the poll_group_channel.
spdk_nvme_ctrlr_free_io_qpair() is executed after unwinding stack
but spdk_put_io_channel() is executed now. The callback to
spdk_put_io_channel() calls spdk_nvme_poll_group_destroy(). However,
spdk_nvme_ctrlr_free_io_qpair() is not executed. Hence
spdk_nvme_poll_group_destroy() fails.
Update the corresponding stub in unit test together.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Icd1f1daf049c6c7ffb28790fe87989a1060f8952
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11496
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We want to call disconnected_qpairs_cb only if qpair is actually
disconnected. When we disconnect qpair asynchronously, for qpairs in
the group->disconnected_qpairs list, we want to poll them until
actually disconnected and then call disconnected_qpairs_cb for them.
As a preparation, call disconnected_qpair_cb only for qpairs which is
in the group->disconnected_qpairs list.
For TCP and PCIe transports, disconnecting qpair will continue to be
synchronous for now.
So we change only RDMA transport.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ifaf6157e1e02fa13f52a66409c9e60fc814d71dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11495
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When generating a discovery log page, we will add entries
for the discovery subsystem for all listeners except the
one associated with the controller that generated the
log page command. We do this comparison using
spdk_nvme_transport_id_compare().
But this function compares the subnqn of the trid, and
the subnqn is not set in either of the trids that we
are comparing.
The listener's trid always has an empty subnqn, but
the source trid has an uninitialized subnqn when
we do the comparison. This means that sometimes the
subnqn may be empty (which always happens in debug
builds) but sometimes may contain garbage. This
means that sometimes an entry would be added to the
log, even for the trid of the discovery controller
that generated the command (meaning the discovery
controller would end up referring to itself which
is not allowed).
There is an even more subtle issue with this. If the
host reads just the log page header, the nvmf target
generates the entire log page, and just returns the
header contents. Let's say in this case, the source
trid has an empty subnqn, so we don't generate an entry
for it, and report numrec = X and genctr = Y. Then
the host reads the X log page entries. But now the
source trid is garbage, so a discovery log page entry
is returned, replacing one of the "real" log page
entries. And since genctr didn't change, the host
thinks the data is all valid, meaning there's a log
page entry for an NVM subsystem that ends up getting
dropped.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I96cfc566ddaf17153aec089bf3d9b3480bec3e4b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11933
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
spdk_sock_close() may not complete synchronously.
Then the following scenario is possible.
ctrlr_disconnected_qpair() calls spdk_sock_close(). Then any request may
complete and call _pdu_write_done(). qpair is still associated with a
poll_group and is inserted into the group->needs_poll list.
After ctrlr_disconnected_qpair() completes, disconnected_qpair_cb() is
called. disconnected_qpair_cb() calls spdk_nvme_ctrlr_free_io_qpair().
spdk_nvme_ctrlr_free_io_qpair() calls spdk_nvme_poll_group_remove().
spdk_nvme_poll_group() removes qpair only from the
group->disconnected_qpairs list.
Even after qpair->poll_group is nullified, qpair is still in the
group->needs_poll list.
Then spdk_nvme_poll_group_process_completions() polls all qpairs in
the group->needs_poll list and hits the assert.
Fixes the issue #2390
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I6ede8bd3b7b1a57a34ac61436159975ab6253fbe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11882
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
The root ports might have been configured by some other driver (e.g.
Linux kernel) prior to loading the SPDK one, so we need to clear it. We
need to before the scanning process, as it's depth-first, so when
scanning the initial root ports, the latter ones might still be using
stale configuration. This can lead to two bridges having the same
secondary/subordinate bus configuration, meaning that their config space
would map to the same memory area, which, of course, isn't correct.
This has manifested in issue #2413, where two root ports were configured
to use the same secondary bus. This caused an endpoint device to be
enumerated twice on two different root ports, with the first instance
being broken once the second port was configured by the SPDK driver.
Fixes#2413
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5ce0931a84c1d23ccadb93fe39e8155ff1281474
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11863
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will allow to use these functions without having to instantiate an
instance of vmd_pci_device. The following patch will use this to
perform some initial clean up before the scanning process.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Icff92a4a429b259bec13eb6b0c1581aadbaae24d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11862
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tom Nabarro <tom.nabarro@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently shadow doorbell updates are not counted; add statistics for
those, and rename the other statistic for clarity.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I211a77902e38265c99b15862034c6d022dc582a0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11844
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_pcie_qpair_update_mmio_required() is only called for QPs with
shadow doorbells enabled, so we don't need an additional branch here.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Idf65f92deb0c6f93019892ef5a02bc28ba4c0f8c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11843
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_pcie_qpair_update_mmio_required() qpair argument is unused.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I0bd6897eb8e6a06f211cc599ab6045409bb43641
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11842
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The current error handling of `spdk_mem_map_notify_walk` has off by 2
issue. This issue can split one memory region to multiple smaller
regions when calling the callback to unregister the memory region.
Also, in case of failure, the 1 GB maps of the map failed to be freed.
RDMA doesn't support this behavior and support calling the callback only
once for each previously registered memory region.
Signed-off-by: Aviv Ben-David <aviv.bendavid@vastdata.com>
Change-Id: I65b667f2e84533f234a2e330b20e9ad9eef32854
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11219
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
That is a preparation for adding memory domains
pull/push functionality which works with iovec
structure instead of pointers
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I3d0db3e7b302b45b4e3bae3452bc5ffa3e62db8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11687
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
At this time only rte_vhost_user makes use of the virtio related
functionality. Which is used by vhost_scsi and will be used by
vhost_user_blk virtio transport.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ia159e14ce5f9a74185da9898713deeff76d14f1a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11100
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If 0 - UINT32_MAX or UINT32_MAX - 0 is substituted into a int variable,
we cannot get any expected result.
Fix the bug and add unit test case to verify the fix.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iad2ea681ad8ad234e70c7310b58785a999612156
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11837
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
The spdk_msg_mempool structure has a fixed size, which is not flexible
enough. The size of the memory allocation can now be changed in the
options given to the spdk_app_start function.
Signed-off-by: Alexis Lescouet <alexis.lescouet@nutanix.com>
Change-Id: I1d6524ab8cf23f69f553aedb0f5b0cdc9dde374b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11635
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Enable runtime scheduler changing in SPDK.
Currently we do not allow to switch between different
schedulers after "framework_start_init" rpc is called.
This is cumbersome, because user has to reinitialize
SPDK each time they want to make changes.
Change-Id: I97d723c6ba966d25e8e57ed023fc1d0826e1b402
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11404
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
To speed up qpair lookup, an idea to use the DPDK hash library was
proposed. But we have some concerns for the idea, we may see performance
regression for small configuration, lib/nvmf is depending on the DPDK
feature, and so on.
On the other hand, RB tree is simple to use and is not likely to have
negative side effect.
To use RB tree easily, cache QP num into struct spdk_nvmf_rdma_qpair.
Change-Id: Id4ea4e75692f86335b3e984c3fc128e187f22c72
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11798
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
nvmf_rdma_poll_group_add() added a rqpair to the poll_group->qpairs list
before its initialization.
If we change the poll_group->qpairs list from linked list to RB tree,
the qpair can be added to the poll_group_qpairs list only after its
initialization completes because QP number is necessary.
As a preparation, change nvmf_rdma_poll_group_add() to add a rqpair to
the poll_group->qpairs list after its initialization succeeds.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I42525f68c5efbd445f96da8f4f63d31f06f739e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11800
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When vfu_run_ctx() returns EBUSY, due to an ongoing quiesce, we did no
work, so should return SPDK_POLLER_IDLE.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I5953ab652f0adf22df81c94c4bece507833cece4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11810
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We will use nvmf library CSTS.CFS instead so that the client can
get this error status.
Change-Id: I42c248a7333d1f9c940bb29135c887a61c906bd4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11676
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
No logic code changes, we will define a nvmf controller migration
data structure in public header file, this is a preparation patch.
Change-Id: I0a98581b320ac9b12b7e4bf838222c5075168e64
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11493
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Plumbing for flags was added in prior pathces. This patch
introduces and respects the relevant flags for use with PMEM
aka durable memory through the accel_fw, IDXD, IOAT and SW
modules.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I792f31459e061d220965feced60e0c236d819a68
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9455
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch is just plumbing the flags param. Use of it for PMEM
will come in upcoming patches.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I620df072aaad3f8062a0312bbea3da1bc3f911b9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9281
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A new public flag, SPDK_IDXD_FLAG_NONTEMPORAL, is introduced
that hints to DSA that it should bypass CPU cache. An example
use case of this would be where the target is PMEM. This flag
is not set by the low level library (so default is that CPU
cache is the target).
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I99fc62d6bfd6ec97985a86fc1ae5454a21679684
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11482
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously required flags were hardcoded in the low level library.
By having the user pass them in there is more flexbility and control.
This was driven by the need to add a new flag for pmem durability,
coming in a future patch in this series.
There is no change in functionality with this patch, just movement
of where flags are set and by whom and the plumbing of 'flags'..
Also note that some flags in scenarios that we know are required are
still set by the library.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I194278f9e3cec0886628585cf84bcc2eae635e0a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9449
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
- Added build system logic for checking mlx5 crypto PMD support. Added
required libs in make files.
- Changes in mlx5 reduce build system logic since both crypto and reduce
use common libs and libmlx5 related checks.
- Both mlx5 crypto and reduce require -libverbs.
Signed-off-by: Yuriy Umanets <yumanets@nvidia.com>
Change-Id: Ice1b88fe74bb04bf715ecaac70c4a8d4f3b5d782
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11620
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This reverts commit 3bacd6653d.
Change-Id: I8dbaffc9f50cf9627720667644496cdaf4e81c3f
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11723
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Prior to this patch bs_user_op_abort() always
returned EIO back to the bdev layer.
This is not sufficient for ENOMEM cases where
the I/O should be resubmitted by the bdev layer.
ENOMEM for bs_sequence_start() in bs_allocate_and_copy_cluster()
specifically addresses issue #2306.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Icfb0ce9ca20e1c4dd1668ba77d121f7091acb044
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11764
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When using --with-dpdk=dpdk/install option, we need to check if the
librte_compress_isal was built in DPDK before adding it to the list
of libs. rte_compress_isal is not built by DPDK if libisal is not
installed in the system.
Signed-off-by: Yuriy Umanets <yumanets@nvidia.com>
Change-Id: Iceb4ebb8cee81aa4254e0a878653c8a8dd50cac6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11618
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
- In case that --with-dpdk=dpdk/install option is used, we need to link
with the proper IPSec_mb libs the DPDK was built with rather than using
the default location of IPSec_mb submodule located inside the SPDK dir.
- Check with pkg-config if we need to link with IPSec_mb. Find the proper
IPSec_mb library path for DPDK specified with --with-dpdk=dpdk/install
option.
- Use the same behavior for plain --with-dpdk.
Signed-off-by: Yuriy Umanets <yumanets@nvidia.com>
Change-Id: Iea56d20d20556d63cb7ffb5ce2ca78f2244796e1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11617
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
- DPDK may or may not decide to use libbsd. SPDK needs to find this out
and add -lbsd to the list of libraries to prevent linking issues in
case that --with-dpdk=dpdk/install option is used.
- Use pkg-config for proper detection if the libbsd is in use.
Signed-off-by: Yuriy Umanets <yumanets@nvidia.com>
Change-Id: Ie3de0363fefb9b7337394b00adc862839834f164
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11616
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
- Reduce the size of initial memory needed by OCF.
Number of allocator buffers equal to 16383 is tested to work
on 24 caches running IO of io_size=512 and io_depth=512, which
should be more than enough for any real life scenario.
This reduces initial OCF memory usage from 726 MiB to 392 MiB.
- Fix string handling for the name of the mempool.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Change-Id: I40063ab1897c479c25904ae4096c5dae3351f73b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10843
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
To fully support interrupt mode for the vfio-user transport, we need to
arrange to wake up in one of two conditions:
- we receive a vfio-user message on the socket
- a client writes to one of our BARs
In response, we can process any pending vfio-user messages, as well as
poll the actual queue pairs.
As there is no way for a client-mapped BAR write to cause us to wake up,
interrupt mode can only work when mappable BAR0 is disabled. In that
case, each BAR write becomes a vfio-user message, and can thus be
handled by registering the libvfio-user socket fd with SPDK.
For the poll group poller, we enable interrupt mode for it during the
vfio-user ->poll_group_create() callback; this only works in the case
that no other transports without interrupt mode support are sharing that
poll group.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ic784d01078b2006edb27892fc0613f934eca930b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10722
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
Keep track of g_vhost_user_init_thread, local to the
rte_vhost_user.c.
There is no need to track this in generic vhost layer.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1fd22e196a3091284f5f9c3c0c7c70a0e18514cb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11075
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Remaining functions that manage or interact with vsessions
are now placed in rte_vhost_user.
Renamed the functions appropriately with vhost_user_* prefix.
While here g_dpdk_sem was made static, since rest of references
from vhost.c was removed.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie9fbf5f08910c136711fb1dfab1b35a5488f0c25
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11025
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This functionality is specific for rte_vhost,
so move it to appropriate file.
Renamed to include vhost_user_* prefix.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I9630902f52d4944d0d18a39dfbff05945ce2bdba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11024
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Functions defined by set of callbacks in vhost_device_ops,
are going to be only used by rte_vhost.
This patch moves those functions into the file, removing them
from vhost.c
g_vhost_user_dev_dirname and _stop_session() are no longer
referenced from vhost.c, so can be removed from vhost_internal.h.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I233206fae3ad5b4549172ac4bd2b036df9ac548b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11023
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Note that this also works around a false positive in
gcc-11 of type -Wstringop-overread.
Fixes issue #2391.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib5301b9ef9fa3ead2a1a2318655533a8cfba03fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11709
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
When generating the discovery log page, add entries for
the discovery subsystem, skipping the listener associated
with the command generating the log page.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id6a14a7d5cdce483f8f3c2eff1b4ededd40bc029
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11542
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently we accept connections to the discovery
subsystem on any listener that has been added to
any subsystem (not just the discovery subsystem).
This is not proper behavior, especially for TCP. TCP
defines port 8009 (not 4420) as the discovery port,
so the current behavior means that if NVM subsystems
are listening on port 4420, then the discovery
subsystem by default is listening there too.
For now, continue to allow connections, but print
a warning message when someone connects to the
discovery subsystem on a listener trid that wasn't
previously added.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I734bc49d1a21b2edfb675aef4b8551e2d0ccd4d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11539
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
According to NVMe over Fabrics spec number of SGLs supported by the
controller is reported in MSDBD. But it is also implicitly limited by
command capsule size (IOCCSZ) since SGL are passed in capsule.
This patch adjusts max_sges to capsule size if required. Adjustment to
MSDBD is also moved to transport layer because it is fabrics specific
parameter and is not valid for PCIe transport.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I44918eb949345c61242ca50a524d21d04b6ac058
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11669
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This confirms that the error path can return more efficient
without memcpy such as xattr->name.
Signed-off-by: Dong Yi <dongx.yi@intel.com>
Change-Id: Ic2ed28121ed76eda9d7b24ed6c4c95b0588817de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11654
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
One simple fix for nvmf_vfio_user_qpair_abort_request().
Current implementation mixed up request of abort cmd and the request
to abort, which cause problems.
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: Ia0db9aa738e372789fc502ef877fd1c841c0a2e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11711
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The patch
nvme: Set dnr to zero for nvme_qpair_abort_reqs()
1b3172f726
did the change stated in the title.
However,
Revert "nvme/rdma: Correct qpair disconnect process"
c8f986c7ee
destroyed it for RDMA transport.
Additionally, we had still set DNR to 1 in nvme_qpair_init().
This patch fixes both.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Iee60ac24aa7e04cce0f394014c9d9afc9d2b56ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11644
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When function returned from the error handling the mempool on
'sess' was not released which lead to a memory leak.
Fixes issue #2393.
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Change-Id: Ida3651e9369fb5c4948969480d398a723b2cb6a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11714
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
We do the null check for 'fc_req', but already dereferenced it
before the check. Swap their position to avoid null dereference.
Fixes issue #2395.
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Change-Id: I33b9e6b51b54f6ada9c072cf7ab0acda2622472f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11721
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
A common pattern is:
if (foo->thread == spdk_get_thread())
cb(arg);
else
spdk_thread_send_msg(foo->thread, cb, arg);
for cases where it's important the callback runs on a particular thread,
but it doesn't matter if it's synchronous or asynchronous.
Add a new API to support this pattern, and convert over the current
instances.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Idfbf77c02c9321c52e07181ffd8b0c437e1ab335
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11503
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
We can ask libvfio-user for the listening socket fd, and register that
for SPDK interrupt handling.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I8d0ba7a86403f2d0170b9359480f1fefc1036557
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10721
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
The accept poller only needs to run when vfu_attach_ctx() makes sense:
in other words, when we don't have a controller created.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Icef4e6184c9ae6d7951d015530a05132c4ba6994
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10720
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Now vfio-user owns its the accept poller itself, there's no reason to
loop across all endpoints: instead, the lifetime of the accept poller is
better matched by creating it in the ->listen() transport callback.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ia92e29b1cee263f1461f640cfdd27cdb674848fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10719
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
For the benefit of forthcoming vfio-user changes, register the poll
group poller prior to calling the transport create callback, and pass in
a pointer to the poll group itself.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Idbc24126c9d46f8162e4ded07c5a0ecf074fc7dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10718
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Json writes this scheduler name using pointer, but it sometimes be null.
This case it exists, so add check for it.
Fixes issue #2384
Signed-off-by: Dong Yi <dongx.yi@intel.com>
Change-Id: I7c989c72cfbd53ea6b02d86457b29440484e5a37
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11677
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The above values were added with shared CQ feature, and they
are left when restore the CQs in destination VM.
Change-Id: Ib1f28dad833da31e571eb2e2f0b5c81f0bf05a3b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11419
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The CSTS register in source VM was left to migrate.
Fix#2362.
Change-Id: Ieef028578d2897b27a6ff594b16801462eb1b75e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11418
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The only release to not unlink hugepages after mmaping
them is for multiprocess. But if shm_id is not
specified, then we aren't using multiprocess. This
ensures that all hugepages get released when the
process exits, even if there is memory in those
hugepages that was not freed during process shutdown.
Make sure we don't enable both huge-unlink and
single-file-segments at the same time though, DPDK doesn't
support that.
Note that even when using multi-process, if hugepages
aren't released, they aren't really leaked. DPDK will
clean them up next time the application runs.
Fixes issue #2267.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I017bd4f7ed9cf6aaa141879539b099fb48f357f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10991
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Move some of this code into separate functions. There
is no change in functionality here - this just helps
reduce the size and complexity of an upcoming patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I72243f17312e66bb6ef2168b9b78076c9eb2e6f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11531
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The dif_enable is not cleared in _nvmf_rdma_request_free. When the
rdma_req is used again, the dif_enable is true while
dif.dif_ctx->block_size is zero. As a result, an infinite loop occurs in
nvmf_rdma_fill_wr_sgl.
Fix issue: #2380
Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Change-Id: Ic179855f7b257e39ed4a5f6705fbc9dea64210ae
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11646
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Since there is a pthread_mutex_unlock() in normal condition, another
pthread_mutex_unlock() in the "tmp != NULL" branch should be removed,
otherwise will cause a double unlock.
Fixes issue #2378.
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Change-Id: I6c80a9527dd60e0b7c1d3c54b6da371b31118f02
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11642
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
bs_back_dev_lba_to_io_unit() and bs_num_pages_to_cluster_boundary() are
unused inline functions. The last consumer (by the earlier _spdk_* name)
was removed in commit 6609b776.
Change-Id: Ib1babfed8002fb44451b337aa0db66c15a6805d2
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11561
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
bs_sequence_to_batch_completion has been unused since the removal of
other unused code in commit ba870c2e99.
Change-Id: Ifb60c65c1c68d1855b49eda4c57d99e983bca5ec
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11560
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
rte_dmadev was introduced in DPDK 21.11, and rte_vhost
is now dependent on it. So link rte_dmadev if we find
it and if CONFIG_VHOST is enabled.
Fixes issue #2374.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iccbf7cb897f51cbc9d545274d4d00a442b2fd353
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11594
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
For reference, this was deprecated in Aug 2020, commit
ID 511fe1553.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic0a16af9113b8b136271a2ce6a071bbc379261c4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11545
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This was a parameter on the nvmf_create_transport
RPC, and was replaced with max_io_qpairs_per_ctrlr to
reduce confusion on whether this number included the
admin queue or not.
nvmf_vhost test was using this deprecated parameter.
Change it to use -m (max_io_qpairs_per_ctrlr)
instead. '-p 4' would have been evaluated as 1 admin
queue + 3 I/O queues, but it's likely the intent
was for 4 I/O queues. This is a perfect example of
why this parameter was deprecated.
For reference, this was deprecated in June 2020,
commit 1551197db.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4364fc0a76c9993b376932b6eea243d7cefca9cd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11543
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
To supplement useful information in bdev trace record,
offset and len of each I/O have been added.
Signed-off-by: zhaoshushu.zss <zhaoshushu.zss@alibaba-inc.com>
Change-Id: I3e776144d16cb9eda2a9fb72b83d423ac3050f0d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11504
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
With async connect, we need to avoid the case
where the initiator is sending the icreq, and
meanwhile the application submits enough I/O
such that the request objects are exhausted, leaving
none for the FABRICS/CONNECT command that we need
to send after the icreq is done.
So allocate an extra request, and then use it
when sending the FABRICS/CONNECT command, rather
than trying to pull one from the qpair's STAILQ.
Fixes issue #2371.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If42a3fbb3fd9d863ee48cf5cae75a9ba1754c349
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11515
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Group fields such that those not used in the I/O path
are at the end of the structure.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I43eca1faacd29a5bf34be6ee644191d865cd42a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11514
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This macro will be used in an upcoming patch
that needs to construct an nvme_request structure
outside of the standard nvme_allocate() routines.
Examined x86 optimized assembly with this patch,
and there is no change.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0f6b8500e06b56edc33f437f351536cf857d13d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11513
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Split part of IO completion into 2 functions.
This patch reduces changes in the next patch
which will reuse some part of spdk_bdev_io_complete
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Iaeac81aa5208b4ca303f60410b6a54f8df13b069
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11519
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Move part of function _bdev_io_set_buf which sets
metadata pointer to another function _bdev_io_set_md_buf
Next patches will make copying of bounce buffer async,
metadata will be copied when data copy completes.
This patch makes next change simpler
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Iced45393f43f9c5a4818e4e9eadb3351583e0c00
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11518
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Although there are no use cases right now where a batch can have
mixed op types, there may be in the future and rather than have
one blow up because ops have different reserved fields and its
not valid to submit an op with a non-zero reserved field, go
ahead and zero these out like we do with descriptors in the non
batch case.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I6d1bb416dc84aa1f76407c76aedf0768dd003218
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We always set num_batches to the same value we set num_descriptors
so just make it explicit. Makes it easier to experiment with
different values when performance testing.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I648262001772c791a032d6cab38dc3b03c1d55c1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11354
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This has changed to control the number of read buffers allocated to the
group, but it is only valid to set this register if the device has
indicated it supports it. Further, the default value is what we want
anyway, so we can skip setting it altogether.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ic54672ea6cb16acc7613860e36d9f7033048bd98
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11484
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
All of the other structs and unions spell out register, so match the
style.
Change-Id: Ie502e80206305037d1518a1db590d89b7479abb4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11433
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is called GENSTS in the spec, so match that name.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I0e8f917e13908f3920ab297e14cb3adee856eaa5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11432
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
SPDK can submit more commands to remote NVMf target than allowed by
negotiated queue size. SPDK submits up to SQSIZE commands, but only
SQSIZE-1 are allowed.
Here is a relevant quote from NVMe over Fabrics rev.1.1a ch.2.4.1
“Submission Queue Flow Control Negotiation”:
If SQ flow control is disabled, then the host should limit the number
of outstanding commands for a queue pair to be less than the size of
the Submission Queue. If the controller detects that the number of
outstanding commands for a queue pair is greater than or equal to the
size of the Submission Queue, then the controller shall:
a) stop processing commands and set the Controller Fatal
Status (CSTS.CFS) bit to ‘1’ (refer to section 10.5 in the NVMe Base
specification); and
b) terminate the NVMe Transport connection and end the association
between the host and the controller.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: Ifbcf5d51911fc4ddcea1f7cde3135571648606f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11413
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
According to NVMe over Fabrics specification (rev.1.1a) HSQSIZE sent
in RDMA_CM_REQUEST private data (ch.7.3.6.4) shall be the same as
SQSIZE later sent in Connect command (ch.3.3).
SPDK NVMe RDMA initiator adjusts SQSIZE to CRQSIZE received from
target in RDMA_CM_ACCEPT private data. Target is allowed to send
CRQSIZE < HSQSIZE if RNR retries are used. So, it is possible that
SQSIZE sent by SPDK will be lower than previously sent HSQSIZE. There
are targets validating this match and they reject connection from
SPDK.
Linux kernel NVMe initiator doesn't perform such adjustments and
connects well to such targets.
This patch aligns SPDK behavior with specification and Linux kernel
implementation.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I01968d1c07d284396fa5939932d85841351d7a45
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11350
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The controller data structure may be freed before subsystem resume done
callback, we can take endpoint as the input parameter to avoid this issue.
AddressSanitizer: heap-use-after-free on address 0x625000046100 at pc 0x00000082818f bp 0x7fff7b09bd10 sp 0x7fff7b09bd00
READ of size 8 at 0x625000046100 thread T0 (reactor_0)
#0 0x82818e in vfio_user_dev_quiesce_resume_done /spdk/lib/nvmf/vfio_user.c:2147
#1 0x782cc0 in subsystem_state_change_done /spdk/lib/nvmf/subsystem.c:634
#2 0xad047b in _call_completion /spdk/lib/thread/thread.c:2344
#3 0xabc48d in msg_queue_run_batch /spdk/lib/thread/thread.c:710
#4 0xac0670 in thread_poll /spdk/lib/thread/thread.c:926
#5 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
#6 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
#7 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
#8 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
#9 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
#10 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
#11 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
#12 0x407abd in _start (/spdk/build/bin/nvmf_tgt+0x407abd)
0x625000046100 is located 0 bytes inside of 8320-byte region [0x625000046100,0x625000048180)
freed by thread T0 (reactor_0) here:
#0 0x7f82219ff91f in __interceptor_free (/lib64/libasan.so.5+0x10d91f)
#1 0x837059 in _free_ctrlr /spdk/lib/nvmf/vfio_user.c:2976
#2 0x837327 in free_ctrlr /spdk/lib/nvmf/vfio_user.c:2996
#3 0x843541 in nvmf_vfio_user_close_qpair /spdk/lib/nvmf/vfio_user.c:3742
#4 0x7d1d91 in nvmf_transport_qpair_fini /spdk/lib/nvmf/transport.c:604
#5 0x7ad922 in _nvmf_qpair_destroy /spdk/lib/nvmf/nvmf.c:1055
#6 0x761362 in nvmf_qpair_request_cleanup /spdk/lib/nvmf/ctrlr.c:4026
#7 0x761906 in spdk_nvmf_request_free /spdk/lib/nvmf/ctrlr.c:4041
#8 0x75a931 in nvmf_qpair_free_aer /spdk/lib/nvmf/ctrlr.c:3576
#9 0x7ae626 in spdk_nvmf_qpair_disconnect /spdk/lib/nvmf/nvmf.c:1127
#10 0x83db36 in _vfio_user_qpair_disconnect /spdk/lib/nvmf/vfio_user.c:3433
#11 0xabc48d in msg_queue_run_batch /spdk/lib/thread/thread.c:710
#12 0xac0670 in thread_poll /spdk/lib/thread/thread.c:926
#13 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
#14 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
#15 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
#16 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
#17 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
#18 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
#19 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
previously allocated by thread T0 (reactor_0) here:
#0 0x7f82219fff16 in __interceptor_calloc (/lib64/libasan.so.5+0x10df16)
#1 0x837413 in nvmf_vfio_user_create_ctrlr /spdk/lib/nvmf/vfio_user.c:3010
#2 0x83bc68 in nvmf_vfio_user_accept /spdk/lib/nvmf/vfio_user.c:3313
#3 0xabfbd8 in thread_execute_timed_poller /spdk/lib/thread/thread.c:872
#4 0xac0c75 in thread_poll /spdk/lib/thread/thread.c:960
#5 0xac0ead in spdk_thread_poll /spdk/lib/thread/thread.c:986
#6 0x9a5b4f in _reactor_run /spdk/lib/event/reactor.c:920
#7 0x9a6442 in reactor_run /spdk/lib/event/reactor.c:958
#8 0x9a717c in spdk_reactors_start /spdk/lib/event/reactor.c:1060
#9 0x99884a in spdk_app_start /spdk/lib/event/app.c:643
#10 0x407e82 in main /spdk/app/nvmf_tgt/nvmf_main.c:75
#11 0x7f822095ff42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
SUMMARY: AddressSanitizer: heap-use-after-free /spdk/lib/nvmf/vfio_user.c:2147 in vfio_user_dev_quiesce_resume_done
Change-Id: Icf5e5b360b9107a3c5eb960ae59b7fe10ace1c66
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11420
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There are errors occur that uninitialised value created by a stack allocation when running unittest_accel and unittest_nvme_rdma with valgrind.
Change-Id: I4b48b472cc7c189cbcaf8ca772830a23118e7e17
Signed-off-by: Jaylyn Ren <jaylyn.ren@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10559
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Next patches enable memory domains async pull/push
functionality.
Previously buffer from internal bdev pool was passed
as func argument to bdev_io_get_buf_complete. Now
since bdev_io_get_buf_complete can be called
asynchronously, this buffer is stored in
bdev_io->internal.buf
Also move bdev_io_get_buf_complete up in file
to minimize changes in next patches
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I6e9d3b35dc85e0e88703dd24a4b4da837adc5b74
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11165
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
if ctrlr->listener was NULL, nvmf_ctrlr_get_ana_state() returned
inaccessible even if ana_reporting was disabled. Then the corresponding
initiator received unexpected ANA error and could not process it
appropriately.
Change nvmf_ctrlr_get_ana_state() to return optimized always if
ana_reporting is disabled.
Additionally, check if ctrlr->listener is not NULL before calling
SPDK_DTRACE_PROBE3().
Fixes#2335
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ib2376694cf89d85ec5687fba7e87439f494f30b0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11402
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
it shall be executed on ctrlr's thread not subsystem's
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I58c60525191085d3d6a583862ba5d71ea90940c7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11105
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When printing metadata pages, blobcli could print the start LBA to aid
someone that needs to debug with dd and od.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I380bd923dfcd1149e3f705dd0ec0ab46b1000019
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11260
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When blobcli is printing blob metadata, extent tables are now printed.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ie748a2f2b3fbc3e6e5ee06a0f2eb9bd491bfed46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11259
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When printing blob metadata via blobcli, descriptor types that do not
have full dump support should not be silently ignored. This prints a
message that indicates an unsupported descriptor type was encountered
so that the person debugging with blobcli knows that there is more
metadata present.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Id30b671fd9dee1ec12e10625eb2af4c1e43eda27
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11258
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When blobcli prints blob metadata, it will now Print invalid_flags,
data_ro_flags, and md_ro_flags when printing blob metadata. The
complete mask is printed as well as the meaning of each bit or set of
bits. If unknown bits are set, that will be indicated in the output
as well.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I743a843a5d23b0e81c04482304515ab3c3b4c7bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11257
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
nvmf_vfio_user_sq->size and ->qsize both hold the number of entries in
the queue; merge them.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I6c7c2984cbdf90079eec9222e1acbedb92207308
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11297
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Since we already allocate ->sg on a per-request basis now, drop the
->reqs_internal allocation in favour of allocating individual requests
and placing them on the ->free_reqs queue, which simplifies the need to
track the array. For request abort, we'll use the ->outstanding request
list, now we have it, to find the victim request.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I25227ccd2ab7e00c8a2e7b2e2af2dc3b073584cd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11427
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
On failure, we weren't cleaning up the poll group data properly, and in
one place, we were trying to remove ourselves from the tgt-> list prior
to being on it.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I9bbe5847b3703eba1ee1d762392ad3159a74ac8b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10717
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
There's no need to forward-declare this, when we can just place it
before its consumers, and this will also help follow-up fixes.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I201bd966371db76a3b789473041799bf55b13c95
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11437
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Use of %p in logging simplifies this code a little bit.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I8e5daa59a614b8bcde7d67d1e5cc6196923031a8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11244
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Avoid using the modulus operator in the hot-path cq_is_full(),
by aping how cq_tail_advance() is written.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Idbdf1715ab30d08233b38aa7691f0212ae93a542
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11445
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Avoid using the modulus operator in the hot-path sq_head_advance(),
by aping how cq_tail_advance() is written.
Suggested-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Id1e9d63a08e470344fdeb549d78ea505088b1a62
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11436
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
There is very little now shared between submission and completion
queues, so drop usage of this struct, folding its remaining members
into the relevant owning types.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I02195d1944ca9905ef03ddf2c099ddb806df70dc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11296
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Consistently wrap access to queue heads/tails, so it's easier to make
further changes. Adjust sqhd_advance() to match the head/tail naming of
the accessor functions, and move the definitions to be close to each
other.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: I197e230ecc4e67fe0207f29281d7e4ca946c22e1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11295
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add a struct defining the local mapping of a queue.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Id3bbdf72bfc082f4496748571bd2617bdafe4309
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11294
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In many cases, addressing bdevs by their UUIDs is often easier than
using their names, which can be somewhat arbitrary. For instance, the
NVMe bdev builds a name by addng the n{NSID} suffix to the controller's
name, while the UUID is filled with NGUID (if available).
The UUID alias is stored in the form defined by RFC 4122, meaning five
groups of lower-case hexadecimal characters. It's important to note
that bdev layer uses case-sensitive name comparison, so the user needs
to use the same textual UUID representation.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I8b112fb81f29e952459d5f81d97fdc7a591730f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11395
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The initialization has no side effects, so it can be done earlier, which
allows for using functions that operate on these queues.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I36830d815b7b43687f369dba2a0999a6dcca5a14
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11394
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function takes a function callback to be used to delete a name from
the global bdev name tree. It makes it easy to delete an alias with or
without locking:
```
bdev_alias_del(bdev, alias, bdev_name_del);
```
or
```
pthread_mutex_lock(&g_bdev_mgr.mutex);
bdev_alias_del(bdev, alias, bdev_name_del_unsafe);
pthread_mutex_unlock(&g_bdev_mgr.mutex);
```
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ida2209d6618a4ce31a6f73da285626c3ecb658fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11393
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Additionally, added an unsafe version, bedv_name_del_unsafe, which can
be used while already holding the mutex. It'll make it easier to remove
an alias without taking a lock.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If986909cc52c9b9bdb6f429654b01b83b08b1ea3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11392
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In some scenarios, a split IO can immediately complete. For
example, a very large unmap operation to a newly thin-provisioned
blob has no operations to perform, so the batch for its operation
immediately completes.
But if it immediately completes, we can't recursively submit
the next split IO. So use variables in the context structure
to detect when an operation immediately completes, to allow
it to unwind and submit the next operation without recursing.
Fixes issue #2347.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8e4c121190c7d08152aa8de20cf6abc55b5edc46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11388
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
No functional change here, this only prepares this function for
some functional changes in the next patch. By adding the
do/while loop here we reduce the amount of whitespace changes
in the next patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I09d64fd1fb69ee232af1d298619c762e562fdc79
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11387
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
This patch separates out rte_vhost code responsible for
vhost init/fini and vdev register/unregister from vhost.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie69ecd3b2659c805c9c0b0a0076996ef85c8fe71
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9535
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We did not have any practical limitation to support concurrent
execution of multiple abort commands.
NVMe specification recommends that implementations support a minimum
of four abort commands.
Let's follow the NVMe specification.
As stated in the head, we do not have any limitation, and we do not
have to check if abort commands exceeds ACL or not.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I31e066fadcb5d619d0c50c895c4cb64520b33513
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11232
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Refactor the code that dumps XATTR into a function. Call this function
for XATTR and XATTR_INTERNAL.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ic0cb32b14f7a34e030a48e1ea468ec63172e2bf1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11256
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Add the ability to open a blobstore in such a way that recovery happens
even if the superblock says it is clean.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I475e51beff24428d387446f7785e025294d2f014
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11253
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If we find a blob during the initial iteration during load
that doesn't have a name, we cannot just immediately
unload the blobstore, since the 'bad' blob is still
open. Instead finish the iteration, and unload the
blobstore (with failure status) after the iteration
is complete.
This is somewhat related to issue #1831. By ensuring
we can unload the blobstore, it closes the open
descriptor on the underlying bdev, which allows
the bdev subsystem to exit on application shutdown.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7ecd189842704bb809f25c60efa8f81dcf8ca79c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11352
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
When a blobstore is not clean, a message is logged at the notice
level. As other progress is made, messages are logged at the info
level.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Icfbe375faaa95d5be53864f7eb8a73e1ae7c5d01
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11251
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
While dumping the blobstore with blobcli, read the super block and bit
arrays. As each metadata page is dumped, indicate which bit arrays
reference the page.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ie023594343861d0fbf065c270424649ec715d8b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11247
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Blob IDs are sequentially assigned starting at 0x100000000.
When debugging with a small number of blob IDs, it is much
more intuitive to see blob ID 0x100000000 rather than blob
ID 4294967296. If blob IDs are displayed in hex, the things
that parse commands should also accept hex to facilitate
copy and paste.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: Ic71eaaf1987609b4f705d372ced4240650b12684
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11245
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
To allow SO_MINOR updates on LTS for the whole year it is supported,
the major version for all components needs to be increased.
This is to prevent scenario where two versions exists with matching
versions, but conflicting ABI.
Ex. Next SPDK release adds an API call increasing the minor version,
then LTS needs just a subset of those additions.
Increasing major so version after LTS, allows the future releases
to update versions as needed. Yet allowing LTS to increase minor
version separately.
Disabled test for increasing SO version without ABI change, as
that is goal of this patch. This check shall be removed with SPDK 22.05
release.
This patch:
- increases SO_VER by 1 for all components
- resets SO_MINOR to 0 for all components
- removes suppressions for ABI tests
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id1a5358882dc496faa5b0b5c9a63b326c378c551
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11361
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When an unexpected xattr name is passed to lvol_get_xattr_value(), no
error is returned to the caller. The one caller, blob_set_xattrs() via
the xattrs->get_value callback, makes the reasonable assumption that a
lookup that fails to find a value returns a NULL value. This updates
lvol_get_xattr_value() to match that expectation.
Signed-off-by: Mike Gerdts <mgerdts@nvidia.com>
Change-Id: I5c7a740f2757e6d8265ba2637afecb729acfcdd4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11326
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The new batching code needs to call the cb_fn for each of the
elements of the batch when a batch that hasn't been submitted
yet needs to be cancelled (due to an error in building it).
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I6f94b27dd7c64f756193ec3532de98b644b41d7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11212
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
In several functions. Busy handling also maans paying attention
to the rc when submitting a batch and not clearing chan->batch
unless the call was a success.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic45b10ade2ebdcd845dc33e54dd9c93068ceb98c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11221
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
With the new added property access API, we can send a internal
property access request to NVMf library, and we can use
it to reset controller.
Change-Id: Iee8b1146d9eb31bc98a9b297e5c635e43e6fdb12
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10952
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
VFIO in QEMU uses region 9 as the PCI passthrough devices' migration channel.
The format of the region 9 migration region is as follows:
------------------------------------------------------------------
|vfio_device_migration_info| data section |
------------------------------------------------------------------
QEMU will access vfio_device_migration_info to controll the migration
process.
For SPDK vfio-user target, we also implement the BAR9 via libvfio-user,
and we also define the NVMe device specific migration data stored in
data section of BAR9. QEMU doesn't care about the format in data section,
it will help us to gather the NVMe specific migration data in source VM and
then restore the migration date to data section of BAR9 in destination VM.
The core idea to implement live migration will following the device state
change which is controlled by QEMU. First QEMU will try to STOP the device
in the source VM, and set the destination VM to RESUME state, SPDK will save
NVMe devic state data structure to BAR9 in the source VM once the subsystem
is paused, then QEMU will read BAR9 in source VM and restore the content of
BAR9 in destination VM, finally in the destination VM, we will restore the
NVMe device state include BARs/PCI CFG/queue pairs in the destination VM.
Change-Id: I42e38f28c3ff59831be63290038b50d199d06658
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7617
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
To avoid re-use of descriptors that may have fields set that are
reserved by the one being used now. For example:
If a batch desc is being built and was previously used by a copy
we need to clear out the dst_addr field or things will explode
as this is a reserved field for a batch.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I6ba50b76589e38a276683291f5ec2970c80e8aa8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11308
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When subsystem is destroyed, it removes its listeners,
however transport level listeners remain active.
This patch removes all transport listerners when
the transport is being destroyed.
Fixes issue 2353/
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ica7bcb0052b626aa62d0da9049bb8f216027dc49
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11307
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
DPDK could be compiled as shared libraries by specifying
`--default-library=shared`. This is the default in packaged DPDK.
Building SPDK statically did not work with such DPDK builds,
since we always assumed the same type for both.
This patch makes detects the type of builds separately and
allows for any combination.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I40b81446cc1c02f18a0b986fb5f0a7a6e31de467
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6491
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
to a single command
Change-Id: Ic0ca65b7399f3cbc4153327d83de7db69de48709
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11209
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We saw this unexpected behavior by the current SPDK master.
Add the check to clarify this behavior occurs only when we use
Soft RoCE.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3a5eaa9064a0601c65139e7868898545926d0dbf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11225
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
This reverts commit eb09178a59.
Reason for revert:
This caused a degradation for adminq.
For adminq, ctrlr_delete_io_qpair() is not called until ctrlr is destructed.
So necessary delete operations are not done for adminq.
Reverting the patch is practical for now.
Change-Id: Ib55ff81dfe97ee1e2c83876912e851c61f20e354
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10878
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When doing live migration the migration BAR region is bytes stream
data, so here we use the helper function to save current controller
state into the stream in source VM and load it as internl data
structure from steam in destination VM.
We will remove the `unused` attrubute in next patch.
Change-Id: Ib44adb351c697b50b9220ce6943cc017137a6064
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10336
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When doing live migration, the destination VM will construct
ADMIN queue pair at the beginning, but the controller isn't
in READY state, we should not poll the ADMIN queue pair right
now. This is fine for normal controllers, normal controllers
will set ADMIN queue pair state in CC callback.
Change-Id: I0db36f75a463fb7476ee62323f9ed0c74c2451dc
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10621
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When doing live migration, there are some spdk_nvmf_ctrlr internal
data structures need to be saved/restored.
Change-Id: Ie39482e8c49765c36fc3700fbac4ce47ef306f29
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10058
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When doing live migration we need to restore the AER commands
in the destination VM, so here to provide an API to save
these CIDs and the transport layer can save the value.
After migration in destination VM, we should allocate
new AER requests based on CIDs in vfio-user.
Change-Id: I5881f833bbfacb0f030a2b135b4dd47726240378
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10040
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Fix issue: #2320
Only the primary process will do the unmap bar operation as for
the map bar operation.
The DevHandle is process specific and the issue here is the
secondary process's function pointer of DevHandle is not properly
set.
Change-Id: I95dddc76c6ce4be8775b6aaf54699002baffd3b9
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11216
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Using the latest DSA we aren't supposed to (a) touch WQ space that
we aren't configuring and (b) touch WQ config fields that we are
configuring even if we are configuring that WQ. So, this patch
will read in initial values of only the number of desired WQs
and update them accordingly before updating the HW.
Also updates a few vars to use shorter local variables consistently.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I7641cdfc5ccc839e37a1d46d760248799a8fce1f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10981
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Found via inspection during spec review of latest HW. We were using the
wrong stride for the WQCFG regsiter when configuring but it just so
happened to be the right value for the current DSA version. We were
mixing up the size of the WQCFG register with the stride value used to
configure the next WQCFG regsiter as they are not contiguous in HW, we
need to read another capabilities bit to determine the address of the
next wqcfg to configure..
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I14d1ff95e0131fd30121aa955bfbc8c8fb3fc512
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10968
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Compliant with both current and next gen DSA.
Note: some fields in gencap were mapped incorrectly
previously, but this did not impact the SPDK driver
because the only times those values (max_xfer_shift
and max_batch_shift) were used were in asserts.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I9648184670f661166136e7898d0d8c7e07d8c746
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10966
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is a race condition between controller destruction and
subsystem state change, e.g. admin qpair may already be freed
when a namespace is added or removed. As result in function
poll_group_update_subsystem we may get heap-use-after-free error
Another problem is that some qpair's live time may exceed controller's
life time. To avoid it, start controller destruction process when the last
qpair finished the disconnect process (previously controller started
the descruction process before the last qpair starts to disconnect
and it could lead to raise conditions)
Fixes#2055
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ibc99b1d840e4796e1588cc217d65834bb556b909
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9995
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Using contructor/destructor to handle g_dpdk_sem will
help later in the series when splitting vhost fini
between vhost.c and virtio abstraction.
Otherwise multiple callbacks would be needed during vhost fini.
Ex. spdk_vhost_fini -> vhost_user_fini to stop the sessions ->
-> back to spdk_vhost_fini to remove vhost devices ->
-> vhost_user_fini to destroy the g_dpdk_sem
g_dpdk_sem will only be used from rte_vhost_user.c.
Until all references are moved, it is placed in vhost_internal.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0505b906621f0eb0cb1226f96a3b6cf49f66778f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11055
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
There is no need to zero out the g_vhost_core_mask on vhost_fini.
Removing it will help later in the series when splitting vhost fini
between vhost.c and virtio abstraction.
g_vhost_core_mask will only be used in vhost.c and any cpu_mask
shall be passed to virtio abstraction after going through
vhost_parse_core_mask. There is no need to make the
g_vhost_core_mask accessible for virtio transports.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ic936c2a8dd1bb6f93b6f6209ea48e3278b19b54e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11054
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
In later patches rte_vhost functions will be moved
to rte_vhost_user.c. To prepare for this,
iterator is used in place of accessing g_vhost_devices.
While here, followed the same style of iterating in
spdk_vhost_config_json().
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1b73c00dfe1391f359421d044686e49a8c6c9176
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11022
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
g_vhost_mutex scope is only within vhost.c as
it should. Meanwhile there is an internal vhost API to
use this lock from any of the vhost files.
Later patches in the series move some functions from
vhost.c to rte_vhost_user.c, where using only the
internal vhost API locks will be better suited.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5916d4dc824ec980fa510fd3cbbd0c8e082d6611
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11021
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Creation of sockets is specific to rte_vhost, so it
functionality responsible for setting path for them.
dev_dirname is renamed to g_vhost_user_dev_dirname
and its definition is moved to rte_vhost_user.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I9bae67667b0f6624f2daf3244a048d10e94e553c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10631
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Changing the vsession coalescing setting is specific
to rte_vhost as such it should be moved the rte_vhost_user
that focues on rte_vhost specific functionality.
Renamed with vhost_user_* prefix to match the file.
Since the rte_vhost functions are still called directly from
vhost.c, temporarily they are added to vhost_internal.h.
Once implementing virtio transport abstraction is complete,
some will be removed and others will be replaced with
a generic callback structure.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I98b3746952cfe09fb724c49e4050efc0c42985a5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10630
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
For some time already the DPDK rte_vhost interface was
accomodating for other types of devices than virtio-net.
rte_vhost_compat.c file contained the use of DPDK rte_vhost,
rather than workarounds. To make that clear it is now renamed
to rte_vhost_user.c.
This patch is first in series that reworks vhost library
with two goals in mind:
1) Refactor vhost and vhost-blk to no longer depend on rte_vhost.
All references to that API will be moved to rte_vhost_user.c.
2) Add a transport abstraction for virtio-blk devices.
vhost-blk will now be able to expose virtio-blk using multiple
implementations of the interface.
First one will be vhost_user that depends on DPDK rte_vhost library.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib6d4e4a6352069fa76e6b017ec203dab75f887b8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11052
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
A somewhat hidden functionality was present in spdk_vhost_dev_find().
Caller could match a vhost controller by controller name (socket filename)
or by full path to the socket.
This function is used by vhost RPC too.
The functionality of matching by full path was not documented,
nor matches what is presented in spdk_vhost_dev_get_name()
or vhost_get_controllers RPC.
This patch removes this functionality as part of series
to enable non-vhost-user type controllers, which might
not use the path to sockets.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0e5ce75ac80ed8d1da962eabba86af69f59a43db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10436
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Previously we didn't post the response for CREATE IO SQ command
until the queue pair is connected finally, but for coming live
migration support, we will connect IO queue pairs in the destination
VM, and this function will also be called for this case, so here
we add a flag to indicate the CREATE IO SQ case.
Change-Id: Iab4c64a7ebb72bcffbfff712dc729c40eead7c7d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9464
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The miration region data structure is from `vfio_device_migration_info`
defined in `linux/vfio.h`, `vfio_device_migration_info` is in the 0th
offset of the VFIO_REGION_SUBTYPE_MIGRATION region, and in vfio-user,
we reserve first one page of BAR9 for this MMIO accesses.
libvfio-user already helps us to hide some implementation details
based on vfio migration specification, here we just use the two
fields to help the migration process.
Change-Id: I8917ba892bbfdfdf4f135f5d6b4923ab0e4a6250
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7628
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We will report the live migration region to VM via sparse
mmap, offset after 0x1000 is the NVMe device state data
structure, and offset start from 0 is the structure
vfio_device_migration_info defined by the VFIO driver.
All accesses between 0x0-0x1000 will use the MMIO callbacks,
and accesses to NVMe device state will use shared memory map
way.
Change-Id: Ib456fc61f587c1bffa8b38506b4480a6066abe87
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7627
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We will use the NVMe device state data structure to save/restore
a NVMe controller in source/destination VM.
NVMe device migration region is defined as below:
----------------------------------------------------------------------
| nvme_migr_device_state | private controller data | queue pairs | BARs |
----------------------------------------------------------------------
Change-Id: Idc73976e1de7f6da2da58e71db86df8cbb0d314d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7626
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For CREATE IO SQ command, we will defer to post completion
until the SQ was connected, we may call post_completion()
in different threads, so here we will send a message
to CQ thread when necessary.
Change-Id: I87a0f8982811c76ce8eb49db6a136f4cbe6e0a93
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11078
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We should disconnect ADMIN queue pair after shutdown
returned, or we may leak ADMIN socket resources after
free the controller data structure.
Fix issue #2289.
Change-Id: I956191fcd51cdcef5de2c3c7b15ffc70f22b040b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11133
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: <qun.wan@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
From SAM-4, section 5.13 (Sense Data);
“When a command terminates with a CHECK CONDITION status, sense data shall be returned
in the same I_T_L_Q nexus transaction (see 3.1.50) as the CHECK CONDITION status. After
the sense data is returned, it shall be cleared except when it is associated with a unit
attention condition and the UA_INTLCK_CTRL field in the Control mode page (see SPC-4)
contains 10b or 11b.”
SPDK does not set UA_INTLCK_CTRL to 10b or 11b, so we set the unit attention condition
immediately against a single IO or Admin IO after reporting it via a CHECK CONDITION.
Once the failed IO received at iSCSI initiator side, it will be retried. In the case of
resize operation, if there is no IO from iSCSI initiator side, the unit attention
condition will be delayed to report until the first IO is received at the iSCSI target
side.
Meanwhile, we clear the resizing (newly added) flag on our SCSI LUN structure after
first time we report the resize unit attention condition.
The kernel initiator won’t actually resize the corresponding block device automatically.
It will report a uevent, and then you can set up udev rules to trigger a rescan. SPDK
iSCSI initiator will automatically report the LUN size change.
Change-Id: Ifc85b8d4d3fbea13e76fb5d1faf1ac6c8f662e6c
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11086
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Convert the batch to the single command inside of it.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ia117175ef3f4a8290d313e0bdc794f6a3276e042
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11166
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Increase the batch size and with it the effective queue depth per
channel to 512.
Change-Id: Ide665e92d47ee753c141f34dd6a8bc4d040fe8db
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11031
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Transparently group independent operations into idxd batch operations
between polls. This increases the effective queue depth.
Change-Id: Ic09b21ed29aaefe2eccef9a6ae0e1b05990ef631
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10998
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This is no longer needed.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I11b7e9acbcf1239a0ad2f49169d7e3d5844a1b93
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11029
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
SPDK RDMA target reports msdbd=16, these addtitional
SGL descriptors are located in capsule. The user can
set ICD size lower than required for msdbd=16. This
patch verifies that ICD can hold all additional SGLs.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I875d40e14e6506c39169d084e56df7ca5d761209
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10686
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
We will assign each SQ with different poll group in round
robin way by default, this may cause race condition to
post completions to one CQ in different threads, so here
we will assign the SQs which share one CQ into same poll
group.
Also enable multiple cores NVMe compliance tests so that
to cover shared IO CQ case.
Change-Id: I9d7cc78aaedceed23986d9f89ed945e0eb337e09
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11115
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously we mixed SQ/CQ definition together, one queue pair
data structure may contain CQ,SQ or both CQ and SQ separately,
while here, we split the queue pair definition into SQ and CQ
respectively as code cleanup.
The NVMf library uses queue pair concept, but for vfio-user
case, each SQ created by VM is mapped to NVMf queue pair, so
we also change `connected_qps` to `connected_sqs` to reflect
the fact.
No actual code logic change in this commit.
Change-Id: I293ccbfbf054fe864d348fc56793dd1ccd366f6d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11036
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
libvfio-user will call quiesce callback when there are
memory region add/remove and device state change requests
from client, and in the quiesce callback, we will pause
the subsystem so that it's safe to do everything after
it, then after quiesce callback, we will resume the
subsystem. The quiesce callback is also used in
live migration, each device state change will quiesce
the device first.
Change-Id: I3a6a0320ad76c6b2d1d65c754b9f79cce5c9c683
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10620
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We will use the controller state to implement the coming
device quiesce feature, it's safe to do anyting when
a subsystem is in PAUSED state.
Change-Id: I3b466ed01848e668a1ffcea1d4f1466e971afa23
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10619
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Users may remove the listener while VM is connected, the endpoint is
associated with Unix Domain socket file, we should destroy the endpoint,
however, the controller maybe still active for now, because nvmf
library will help us to disconnect all queue pairs in asynchronous
way. Here we use the same way as the NVMf library to destroy the
controller when there is no connected queue pairs.
Fix#2246.
Change-Id: I0775d5294269d848d859968edafc8eaa1d89a32c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10379
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The controller may be freed eailer than endpoint, so we still
need to unregister the memory region from SPDK. The case
can happen when removing the listener while VM is connected.
Change-Id: I95d49cefdbff3e0bdea316fac824ef8b218fcd2c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10378
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
We should hold the transport lock to iterate endpoints.
Fix issue #2313.
Change-Id: I8e0539a51e843a3299908d9da7749fe9becb5e7e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11037
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
We met an issue that client got a NVMe completion with old SC
bit, so we add a memory barrier here to ensure the NVMe completion
is fully populated.
Fix issue #2323.
Change-Id: I7887d789a0acd3634a10aa7dc8de81a153137ae7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11076
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
These tracepoints don't include this parameter in their definitions.
This patch fixes the following assertion when the traces are enabled in
the thread library:
```
_spdk_trace_record: Assertion `0 && "Unexpected number of tracepoint arguments"' failed
```
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5159dbafd25c3150c90fa26c966dadb1fe239953
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11159
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Admin commands can be sent and polled from any thread, which also means
that the error injection queue on the admin qpair can be accessed from
multiple threads. Therefore, any modifications to that queue should be
done under the ctrlr lock.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ib1ed194405cb5b93f65a007b9749fd4433dc367d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11099
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Nightly build failing on Centos 7 machine
C compiler for the host machine: cc (gcc 4.8.5 "cc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)")
C linker for the host machine: cc ld.bfd 2.27-44
Host machine cpu family: x86_64
Host machine cpu: x86_64
Errors like:
idxd.c: In function ‘spdk_idxd_submit_crc32c’:
idxd.c:902:24: error: ‘prev_crc’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
desc->crc32c.addr = (uint64_t)prev_crc;
Signed-off-by: Pawel Piatek <pawelx.piatek@intel.com>
Change-Id: Ib40160b1974ecd3f1579566b6eb5d88e03b5bb2b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11082
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There is an error case that the block device didn't complete
outstanding IOs during the controller reset or shutdown, so
the NVMf library will wait until all the IOs returned from
the backend, however, so here we added a timeout timer, when
the time expired, we will try to reset the block device which
hold the outstanding IOs.
Fix#2194.
Change-Id: I8d0746335e1f20a09e6a9ea87730551808a898d1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9909
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
After a series of recent patches, introducing individual
tracepoint enabling, the "all" and "0xffff" parameters stopped
working (we call spdk_trace_set_tpoints which sets tracepoints only
once, but we need to iterate over all groups in a given mask).
Change-Id: Id31c15dd0f707777f839791566c10728723090ba
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11126
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Add/modify tpoints around io_device name in lib/bdev/bdev.c
and lib/thread/thread.c.
Deleted double spaces in commets of trace_defs.h.
Change-Id: I0e2f5118e68b1b329a422bde3400fd2273e7387e
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10687
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
We can do all of the configuration in spdk_idxd_get_channel, and the
configuration step was always done immediately after getting the channel
anyway.
Change-Id: I9fef342e393261f0db6308cd5be4f49720420aa0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10349
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
The driver still may use batches to implement some operations or for
efficiency reasons.
Batching my be resurrected in the future, but for now we need to do some
fairly extensive performance changes on the driver and eliminating all
of this unused/inactive code makes that much easier.
Change-Id: I92fcec9e4c7424771f053123d821cc57dba9793c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10348
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
These will be used internally by some of the other code paths.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Idc3c024cf1bf3d468f87176373ef97bf064ced8f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11033
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The batching API will be removed from idxd shortly.
Change-Id: I04cf61112f7831a9fb0fefc269706495761d0889
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11032
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
dynamically
This can be done once on allocation rather than every time the batch is
submitted.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I8092f65f1b864cc3cc78db9fdee085d8bb0471df
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10293
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This only needs to be updated on the last step of the CRC calculation.
Change-Id: I0b41f33bfbbc195a857d1c39d9f8f7164d2bba8d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10292
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
poll_group_disconnect_qpair() is used only in a single place now
and transport_poll_group_disconnect_qpair() always returns 0 for all
transport.
Let's remove unnecessary processing for return code.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I45d7f8cea2117b3ec00028df234d1eb9ecc65713
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10677
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
nvme_poll_group_disconnect_qpair() is called only by a single place now.
We do not need the flag poll_group_disconnect_in_progress any more.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I8f9c0f14baa8fcb9b0637635a5bb3d34a8b11af5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10673
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_nvme_poll_group_remove() is available only for disconnected
qpairs now. Hence spdk_nvme_poll_group_remove() does not have to
check if qpair is connected and call nvme_ctrlr_disconnect_qpair().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I3b05246c4be6adfa3392b8f0e5ecaf274a8a7795
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10846
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Previously, when connecting qpair, we allocated stats per qpair if poll
group is not used or we set stats per poll group otherwise.
Then when deleting qpair, we freed per qpair stats if allocated.
However, if qpair is still not completely disconnected after removing
qpair from poll group, pqpair->stat is use-after-free and it causes
a segmentation fault.
To fix this issue, we set pqpair->stat to &g_dummy_stats instead.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ibf303e6db5176e93ed75cbe3a414bb923d6e3ab6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10845
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Previously, when connecting qpair, we allocated stats per qpair
if poll group is not used or we set stats per poll group otherwise.
Then when removing qpair from poll group, we cleared qpair->stats pointer.
However, if qpair is still not completely disconnected after removing
qpair from poll group, tqpair->stats is NULL and it causes a segmentation
fault.
Hence we set tqpair->stats to &g_dummy_stats instead of NULL.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: Ice6469627ce8d4bf4567f57c304759206b6432f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10844
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_ctrlr_disconnect_qpair() calls nvme_poll_group_disconnect_qpair() if the qpair
uses a poll group, and nvme_poll_group_disconnect_qpair() calls
nvme_ctrlr_disconnect_qpair() if the state of the qpair is not DISCONNECTING.
This relationship made the code very complex.
A few patches starting from this patch simplifies disconnect and free qpair
operations.
This patch swaps the ordering of nvme_ctrlr_disconnect_qpair() and
spdk_nvme_poll_group_remove() in spdk_nvme_ctrlr_free_io_qpair().
This ensures the qpair is disconnected when spdk_nvme_ctrlr_free_io_qpair()
calls spdk_nvme_poll_group_remove().
This enables us to limit spdk_nvme_poll_group_remove() to be available
only for disconnected qpairs.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0601a74f953a2efc4f177a51a4450baea33533d4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10670
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For the purpose to support shared IO CQ feature, we will construct
the queue pair data structure at the beginning, and setup SQ/CQ
separately in CREATE IO SQ/CQ routine.
Previously we will disconnect queue pair when got a DELETE IO CQ
command, now we disconnect queue pair when got a DELETE IO SQ command,
and in the disconnect completion callback, we will release the IO SQ
resources, there is a case that the VM will just RESET/SHUTDOWN
controller when IO queue pairs are connected, for this case, we
will also try to release CQ resources in the disconnect completion
callback.
`free_qp` function now is only called when destroying a controller.
Change-Id: I45ec679ddb63bdf1feeba5dc2bd39cae3ba4aa89
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10532
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently we only use round robin way to assign queue
pair to each poll group.
Change-Id: I8efaf3ef25402102dd1eaa7f7aa8bd8bbe071c25
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11114
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Acceptor poller is registered using rate value
from transport opts structure, but this structure is
initialized on generic transport layer when create()
function completes, so at this time acceptor poll rate
is 0.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I2138825f3ff9dd3cc0ccaa65e8d5c23aab338ad4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11095
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is necessary to failover another path when multipath is configured.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I0b6bcf63501e38f75efb4b0d6bec58abb4b67aef
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10250
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
FUSE has a limitation of 128KiB. Adding a check that returns ENOMEM for
ioctl and logs the error. Applies to both in and out buffers
Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I9ce5fdc413b047a1ec074468be5abf433da26d7f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10855
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Adding metadata support for io commands. Currently metadata is ignored
even if present in the cmd struct. Making metadata adress
readable/writable depending on data transfer bits. Adding extra unit
test to make sure metadata fields are populated.
Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I1d01974a6b2831c82b43e94073065d235eea429a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10854
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Modify admin passthru so that result field of passthru struct is always
populated. This should be safe since dw0 is either reserved or contains
command specific info. This is specifically meant for the namespace
management command when attempting to create a namespace. As per spec:
"Dword 0 of the completion queue entry contains the Namespace Identifier
created.". So for nvme cli and perhaps other application to see what is
the id of the namespace created there needs to be a way to pass the
information back.
Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: Ide4effc126ad9eedac95b0700dd65041ed4b35b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10633
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
-Change cuse ioctl reply from status code to whole status field.
-Add negative test for nvme cli cuse: Power Managment on Namespace
Signed-off-by: Ahriben Gonzalez <ahribeng@gmail.com>
Change-Id: I55a88a4f5ace5040f79c05edfc0b8559905bdd2e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10602
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add information which tgroup_ids/_names are duplicated - currently
we only show the second argument of comparison.
Change-Id: Id3c61fc2d86b97e5513d7f5af9d0c5f66a358c5e
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10738
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Add support to enable individual traces through rpc commands
and modify jsonrpc.md to describe the changes.
Change-Id: I3664fc28f1c25a76eade4cff0a0ab1870172f8de
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10518
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The bdev fio plugin has a destructor function that
cleans up the initialization thread, and we can't
have it run after we've cleaned up DPDK or we get
seg faults.
The toolchains reserve priorities 1 to 100 for
internal usage, meaning 101 is the highest usable
priority level. We'll use this for the env_dpdk
destructor priority, meaning it would be the last
destructor to execute.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I36718f9413267192d1c1dcec983a0f51b5d5b798
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11085
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This change introduces initial experimental wrappers for enabling/
disabling rte_pci_device interrupts and for getting event file
descriptor assosiated with an interrupt.
Signed-off-by: Maciej Szulik <maciej.szulik@intel.com>
Change-Id: Iba1ba1e57a3555001502859d0bb2c655c07bf956
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10502
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This is the only time where we're allowed to invalidate namespace
handles, so use this opportunity to release inactive ones.
Change-Id: I53626ddf30e48e04207078fe406ec6e02138ac9f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10103
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is the count of items in the RB_TREE, so put the two next to each
other.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib30bee12e65065dc414b55e85cfffa2026057e9f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10035
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We only populate active namespaces into the main namespace tree, so we
don't need a separate list of active namespaces too.
Change-Id: Iaf194f806cc1d9672f5567cff3dffafff3165069
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10034
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
These are no longer complex enough to warrant being separate functions.
Change-Id: I5f3c9fc904b768b6509283c4b7def686bab9a1d2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10032
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Since this is now sparsely populated, a tree is a better choice.
Change-Id: Ie66d913fa1d298de56a7d22ef55f0adf7f8803b8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10031
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Some subsystems report a very large maximum value for the number of
namespaces, but in essentially every case the subsystem is sparsely
populated with active namespaces. To save memory, don't allocate
objects for the inactive ones.
Change-Id: I4cbeb5a7a898d3c685f4a3a9ec4c2ce45efffb92
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9898
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Set the SQ/CQ size to 0 so that we will not try to remmap
the ADMIN queue pair in the memory region callback before
the ADMIN queue pair was enabled.
Change-Id: I739a2ec3abcb54b17f31f2bc120312cd02ffeef1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10531
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When deleting a CQ, we will use its reference count to check
how many SQs associate with it.
Change-Id: Ic82e50de0fa92d2f03119ac2cc90ef86a0ea375e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10530
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is a preparation to support shared IO CQ case, and we will
create/delete SQ/CQ separately, so define the queue state as the
first step.
Change-Id: Ie7b5807dc4aa5a2c117e15f61f3a9baa60135653
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10529
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add dtrace probes aroung qpair/controller/subsystem management
to help with debugging issue #2055.
Change-Id: I0b981bffadee3fe4172ad6916c059bf357959dde
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10237
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Chaining may be faster, but this is really an implementation detail of
the idxd driver. Push the decision on how to implement a vectored crc
down into the individual drivers and eliminate it from the generic
framework.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Iedbdc5a6dbd3f7d1674d0a83f6827588f4b6b2fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10291
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This uses a batch with the fence flag for now. There are several other
implementation options that will be explored in the future.
Change-Id: I4f344d671400508de05f80b026d42f775c5b9588
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10289
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Compare two scattered memory regions
Change-Id: I6ce5c9e7bc1ee1ef0e9173c00e86628d43a1e41f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10287
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Each portion of the discovery log has a header which
includes a 'genctr'. This number indicates the
current generation of the discovery log. If this
number changes during the process of fetching the
discovery log in multiple chunks, wait for the
current fetch to complete, but then start over.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5f8623593b7f935eecc37a98daf92e7d8c0dd566
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10813
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Return if outstanding_commands > 0. This reduces
indentation for the rest of the code in the
function and simplifies the diff for an upcoming
patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I49f09eff7c0908829819e6b797c922211c56e7db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It makes it easier to read the logs, as the state values are printed as
integers.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I70a9e8860401c18e9305a5fc5771df0bc564d337
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10800
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds support for using zero-copy operations to execute IO
requests in the TCP transport. Of course, they're only used if the
underlying bdev supports them. Additionally, only requests with no
in-capsule-data can be executed using this mechanism.
Added several new states to accommodate for the difference in a way
zero-copy is handled. Also, these flows very depending on the type of a
request (read or write). It stems from zero-copy semantics: to perform
a write we need to wait for zcopy_end completion, while for reads
zcopy_end can only be submitted once we send all of the requested data
to the host.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie02b494c24bc1acc98557cb4b02e867abf9064e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10796
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Zero-copy requests are kept on the outstanding queue for the whole
duration of the request - from the initial zcopy_start submission to the
completion of zcopy_end. This means, that there's a period in which a
request doesn't wait for a completion from the bdev layer, but is still
on the oustanding queue (after zcopy_start callback, before zcopy_end
submit). If a qpair gets disconnected while a request is in this state,
we need to manually force its completion, as otherwise it might hang
indefinitely (e.g. waiting for host data).
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I53731b8e363b725efa564ca3c7d89b46f5fb2a24
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10793
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
The zero-copy requests can also be queued when a subsystem is paused, so
we need to properly resume and submit them by using zcopy_start.
Since only requests that haven't received the zero-copy buffer (i.e.
before zcopy_start was called) can be queued, we don't need to bother
with checking zcopy_phase.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ie629688f6961eb2ae05741df496720b91be4d80d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10792
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If there is still some inflight IO which prevents
vhost_session_stop_done(), stop_poller can try
within 4 seconds, and then call vhost_session_stop_done
with -ETIMEDOUT.
This can avoid endless blocking in ctrl pthread if there
is no response from vhost session or its backend bdev.
Then spdk vhost target can still serve all other vhost
devices and operations besides the error one.
Change-Id: I2fc78b4da926c936a2e42dc0e66ce1c60001330d
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10393
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It ensures that we decrement io_outstanding counter for requests for
which zcopy_start failed. Also, removed a note stating that such
requests are reverted to regular IO path, as this is not the case.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I8eaee88abfd94b73614b367fef9bb938c9962617
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10791
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since spdk_bdev_zcopy_end() cannot really fail (it only fails if we pass
a bad bdev_io), we can simplify the nvmf zcopy_end functions by making
them void and always expect asynchronous completion.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I6e88ac28aba13acadea88489ac0dd20d1f52f999
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10790
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since this path now supports sending zero-copy, use it for zcopy_start.
Additionally, it makes it possible make zcopy_start void, as it reports all errors
asynchronously via request_complete(), and remove some of the duplicated
error checks.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I41f43ce1651432d9a7d74e3680d4a3f780128a1d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10789
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If a request gets resubmitted and is completed immediately (i.e. the
processing function returns SPDK_NVMF_REQUEST_EXEC_STATUS_COMPLETE), the
upper layer needs to be notified via spdk_nvmf_request_complete().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I236d6e68d1d7d83afdaa30d8bb07e2b133f43155
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10788
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Additionally, the NVMe completion status is now updated and the IOs are
queued if the bdev layer doesn't have enough IO descriptors. It makes
the zcopy operations behave similarly to the other IO operations.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I455ae781e32aa6e60d144d2c91f109bd8be46664
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10787
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It makes their names consistent with the bdev API.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I314051f0980b46959d6560aa25885f13b4c28f2a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10786
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The zcopy_start requests are now executed through
nvmf_ctrlr_process_io_cmd. It makes the zero-copy share checks with the
regular IO path.
Note, that zcopy_end doesn't utilize this path and is directly submitted
to the bdev layer, as it doesn't need to perform these checks (they were
already verified in the accompanying zcopy_start).
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic9ffecb15e7c351a9d60e731cc711d6500b845db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10785
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will allow the zero-copy requests to share more code with the
regular IO path.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Idb57447694903a56d984f95aa2f1a3bddc3e4e82
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10784
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It will make it possible to submit zero-copy requests through
spdk_nvmf_request_exec().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ibc14fe77cd477b11ed55d1350a7486caaad81add
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10783
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The code should never reach these functions for requests using
zero-copy.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If9f30e05a43b340a982604d5b985242d63ce252b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10782
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It's more descriptive that way, as it's clear the function works on a
single request. Also, passing a request instead of zcopy_phase makes it
more convient to use.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If4d7b087511e128f3590ac7b3b5adcb8ace12003
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10781
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
It makes it possible for the user to specify whether a transport should
try to use zero-copy to execute requests when possible.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I40a92b0d7a6707f4c9292795f380846acb227200
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10780
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These set_level/get_level functions and REGISTER(log),
have nothing to do with log_flag.c, why we put them there.
I think we might as well put them to log.c.
And "define MAX_TMPBUF 1024", repeated.
Change-Id: I5ade71b923d61446a5f81f0d2f26fdc4a3057f02
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10923
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We already use destructor functions in env_dpdk to do
some cleanup at process exit, so let's also add one
to call rte_eal_cleanup. This ensures all hugepage
files are freed before the process exits.
Fixes issue #2267.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8c93f503d77f35717b3d18a63ea49b31789dbc00
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10983
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Let user pass a name of tracepoint group. Currently the only way to
enable traces with '-e' option is to pass the tpoint mask, which is
cumbersome. This patch modifies our API to accept strings as parameters.
Example:
-e nvmf_tcp:2,thread
enables nvmf_tcp's second tracepoint and the whole thread tpoint group.
Modified spdk_trace_enable_tpoint_group() - it will be also used in
the changed form later in the series to accept tpoint mask when using
RPCs to activate/deactivate traces.
Change-Id: I6b02363cce3b44b0b578877bc2505f5a4e2fffdd
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10818
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Make trace_create_tpoint_group_mask() an external function.
This is going to be used in following patch.
Change-Id: I06cd1652bb30abddd49536bc76ec134a01121537
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10830
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
we don't remove the socket fd from socket group when
nvmf_tcp_poll_group_add() return error, and when
closing the socket there is an assertion.
This was found via llvm_nvme_fuzz via TCP transport.
Change-Id: Ib4ab6fc3fc5e2bc6a9545f6ce854bae8f1157fd5
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10849
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If blobs held in a blobstore are opened a lot, lookup
by RB_TREE will be much more efficient.
Change-Id: I7075b95c597a958e7bb10890f803191309532021
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10917
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
We recently improved qpair disconnect process and added assert
if we get a completion without any error when a qpair is disconnected.
However unexpectedly we saw this case very often when we ran the test
test/nvmf/host/multipath.sh for the real hardware in the test pool.
So we remove the assert and change the ERRLOG to INFOLOG.
Fixes one of the issues in #2300
Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Change-Id: Iedbf7e0afa5025da6a810043ba95348ba5b856b3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10901
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We have very frequent failures when we run test/nvmf/host/multipath.sh
in the test pool.
Call stack showed nvmf_stop_listen_disconnect_qpairs() accessed
qpair->ctrlr even if qpair->ctrlr was NULL.
nvmf_stop_listen_disconnect_qpairs() did not check if qpair->ctrlr is
not NULL before accessing qpair->ctrlr->subsys.
When a qpair is added to a poll group, qpair->ctrlr is cleared to NULL.
The test code test/nvmf/host/multipath.sh executes multiple
reconnects for path error.
So a conflict might occur between adding a qpair to a poll group and
disconnecting a qpair in a poll group.
In this case, it may be acceptable even if we disconnect a qpair whose
qpair->ctrlr is NULL. It will be better than SIGSEGV.
Fixes one of the issues in #2300
Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Change-Id: I308fcb886dd410d01e3361c1850dec9a8eacbccf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10860
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If nvme ctrlr is resetting or initializing, free_io_qids
bitmap is already freed or not created yet. In that case
an attempt to create IO qpair leads to segmentation fault.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I6a97bf81d5a568db20d23b3f88cf01e994ba42e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10827
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
In current implementation RDMA qpair is destroyed right after
disconnect. That is not graceful qpair shutdown process since
there can be requests submitted to HW and we may receive
completions for already destroyed/freed qpair.
To avoid this, only disconnect qpair in ctrlr_disconnect_qpair
transport callback, all other resources will be released in
ctrlr_delete_io_qpair cb.
This patch is useful when nvme poll groups are used since in
that case we use shared CQ, if the disconnected qpair has WRs
submitted to HW then qpair's destruction will be deferred to
poll group.
When nvme poll groups are not used, this patch doesn't change
anything, in that case destruction flow is still ungraceful.
However since CQ is destroyed immediately after qpair,
we shouldn't receive any requests which point to released
resources. A correct solution for non-poll group case
requires async diconnect API which may lead to significant
rework.
There is a bug when Soft Roce is used - we may receive
a completion with "normal" status when qpair is already
disconnected and all nvme requests are aborted. Added
a workaround for it.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0680d9ef9aaa8737d7a6d1454cd70a384bb8efac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These definitions will be used in the next patch to check if
device is rxe
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Icc073344103991ff24fc3bb88a1ceb9867de6f6a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10727
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This patch aims to introduce a change in enabling
tracepoints inside SPDK. Currently every hit tracepoint will
be stored inside an internal buffer, what is inconvenient when
looking for certain information (eg. starting IO to record
some tracepoints, stopping the IO and having the tracepoint
buffer flooded with irrelevant information before copying the
contents connected with IO operations).
The tpoint mask option (-e) has been extended with ':' character.
User may now enter tpoint mask for individual trace points
inside chosen tpoint group.
Example: "-e 0x20:3f", where "0x20" stands for tpoint group,
':' is a separator and "3f" is the tpoint mask.
Change-Id: I2a700aa5a75a6abb409376e8f5c44d5501629877
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10431
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Some devices may support SRQ depth lower than defaulut
value 4096
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I58da0ac268a6d4c4a7e3b500ae37b8fad4810e17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10654
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Otherwise, this field is left unassigned and the host receives some
garbage cid.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If1e1fe8c7543bcedfbb897200696e05b71c57e0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10770
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When aborting a request in a NEED_BUFFER state, we set it's completion
status and remove it from the pending_buf_queue. Since it's no longer
on that queue and there's no completion it's waiting for, we need to
manually kick.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I1272d441aec3b3090cd8c143a2112a8a6866fcf0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10769
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The tracepoint passes qpair pointer as an argument, while not specifying
it in its definitions, which makes the following assertion to fail:
trace.c:83: _spdk_trace_record: Assertion `0 && "Unexpected number of tracepoint arguments"' failed.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I315e9bf0465db7033ac0f1169536c459ac4e9250
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10761
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Currently if we remove a listener from a subsystem, we
disconnect *all* qpairs that have the same transport ID
as the listener being removed.
Fix that, since we should only disconnect qpairs from
controllers associated with the subsystem that had the
listener removed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6cf7422d14f23bf02ba6c4b034b172870694b3e6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10690
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There is situation that num_extent_pages is zero and original pointer is
also NULL, the realloc() could return a Not NULL pointer.
Related UT has been added and updated.
1) In the default allocation (num_clusters == 0), the extent_pages is not allocated as expected.
2) In the thin provisioning allocation (num_clusters != 0), the extent_pages will be allocated if extent_table is used.
More related information as below:
The crux of the problem is that according to POSIX:
realloc: "If ptr is NULL, then the call is equivalent to malloc(size)"
malloc: "If size is 0, then malloc returns either NULL or a unique pointer value that can later be successfully passed to free"
blobstore was relying on realloc(NULL, 0) always return a unique pointer value, and not NULL. This is not portable behavior.
Change-Id: Ibc28d9696f15a3c0e2aa6bb2371dc23576c28954
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10470
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This id can be used as the 'portid' for discovery
log entries. Previously we were putting the entry
index in the portid field which was incorrect.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f373585fe671ba7e69eb8e07f603f8e8ac1e270
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10589
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This API is a helper for getting the full discovery
log page from a discovery controller. It will read the
log page header to get the total number of entries,
allocate a buffer for all of the entries, and then
issue a series of get_log_page commands to read each
4KiB worth of entries.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I02666ef5adcb9fc8825a221655811ace708f97b8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10564
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nvme_ctrlr_construct_namespaces
We can just reallocate here to be more efficient.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I8cfc87da23aee6c05ff83aea2165683dddba1dbd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10688
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In the one place this was called, we can call nvme_ns_construct
instead. There's no harm in re-fetching the identify pages.
Change-Id: I91292ff9650bdc7edd5588a05837b671dcac1922
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10102
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
No code changes. Move these up so they can be used by some of the
regular command submit paths in future patches.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib8e54d47f7df35771b6c89d7c49d5182cae79e47
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10285
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_ioviter_next will walk through two iovecs and yield pointers
to common length segments. For example, given a source iovec (siov) with
4 1KiB elements and a destination iovec (diov) with 1 4KiB element, the
following will happen:
first spdk_ioviter_next:
src = siov[0].iov_base
dst = diov[0].iov_base
len = 1KiB
second spdk_ioviter_next:
src = siov[1].iov_base
dst = diov[0].iov_base + 1KiB
len = 1KiB
third spdk_ioviter_next:
src = siov[2].iov_base
dst = diov[0].iov_base + 2KiB
len = 1KiB
fourth spdk_ioviter_next:
src = siov[3].iov_base
dst = diov[0].iov_base + 3KiB
len = 1KiB
fifth spdk_ioviter_next:
len = 0
This is a useful utility for performing operations where both the source
and destination are scattered memory. As an example and a test vehicle,
spdk_iovcpy has been updated to use this internally.
Change-Id: I7e35e76d38e78d07ea1caf6282d0dfc02182aa83
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10284
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The host may have specified a hostnqn to use to connect to
a discovery ctrlr, so we can't just use the default ctrlr
opts to connect - we need to call the probe_cb (if there is
one) to get any options that the host may have specified.
Tested by using discovery_aer tool, creating a subsystem and
listener, and then adding a host on the target side that matches
the hostnqn specified to the discovery_aer tool.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07266e984e0094d3a768e6a0d5ea3a3bd71e32ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10547
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In NVMF Revision spec 1.1a, discovery log should be updated
when removing hostnqn of subsystem.
Update unit test to check the discovery log when removing
hostnqn and destroying subsystem.
Signed-off-by: Peng Lian <peng.lian@smartx.com>
Change-Id: I51c597a2493295a677a7aa68e4f13a887f7e1140
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10668
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Macro PCI_CFG_SPACE_EXP_SIZE isn't defined in kernel 4.9.x, so
we use a fixed value here instead.
Fix issue #2282.
Change-Id: Ife1444e919e4c1dc9328437002c15befe2f393e7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10691
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This change implements mechanism to allow user to
define multiple tpoint masks separeted with a comma
(e.g. 0x400, 0x8).
This is going to be used in the next patch to implement
enabling of individual tracepoints inside a tracepoint group.
Change-Id: I963f89684aa62b6e1dde57e22ddf835aa2c89f05
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10536
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
it is possible that some specific transport doesn't support
NVMF_MAX_ASYNC_EVENTS (although it is a recommended value by spec)
with that change it is possible to reduce aerl on transport specific
layer so it can be advertised correctly during identify controller
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ife6465b5324fb39f9b343c6f42b860e9dd1164b2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10422
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Not every transport requires accept poller - transport specific
layer can have its own policy and way of handling new connection.
APIs to notify generic layer are already in place
- spdk_nvmf_poll_group_add
- spdk_nvmf_tgt_new_qpair
Having accept poller removed should simplify interrupt mode impl
in transport specific layer.
Fixes issue #1876
Change-Id: Ia6cac0c2da67a298e88956734c50fb6e6b7521f1
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7268
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
On aarch64 platforms, doorbells update from guest VM may not be seen
on SPDK target side. This is because there is memory type mismatch
situation here. That is on guest VM side, the doorbells are treated as
device memory while on SPDK target side, it is treated as normal
memory. And this situation cause problem on ARM platform.
Refer to "https://developer.arm.com/documentation/102376/0100/
Memory-aliasing-and-mismatched-memory-types". Only using spdk_mb()
cannot fix this. Use "dc civac" to invalidate cache may solve this.
Profiling data did not show big performance degradataion.
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: I9a18718f8c4307b3007b18c32ab02e6796548958
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10222
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This RPC lists all PCI devices attached to an SPDK application. Each
device is identified by a BDF and contains a buffer with a copy of its
config space.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I852f421fde105d975458f8e63b8da4f92ed2c69b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10652
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
This function serializes a buffer as a hex string.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I09ab93bc626f6f6543b7c1ef033bcf807050862a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10651
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
These APIs are not safe, since they do not hold the
pci device lock across calls, which can cause problems
if a device is inserted or removed while handles
returned by these APIs are being used.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I01a80f26d0a0ca4cdfc7181359932b38da8dd43a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10659
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This is a safer alternative to spdk_pci_get_first/next_device,
since those APIs do not hold the lock between calls.
Future patches will remove those APIs, and change callers to
use this new API instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I71c7e8c1feb9112da8be32a8056b30e105e30463
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10655
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
If a subsystem has no listeners, then there is no need
to update the discovery log when adding a host, or setting
a subsystem to allow all hosts.
This eliminates some unnecessary discovery log update
notifications, especially when setting 'allow any hosts'
on a subsystem immediately after it is created (and before
it has any listeners).
Update unit test to check the adding a host to a
subsystem without listeners does not rev the genctr.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I63dab5df564269e574bb925890088f52063aa378
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10546
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The discovery log isn't updated when a subsystem is created
or deleted, it's only updated when a listener for a
subsystem is added or removed.
So remove the nvmf_update_discovery_log() in the subsystem
create and delete paths. They just generate extra AER
completions that potentially cause the host to do unneeded
work.
Note that if a subsystem is deleted with active listeners,
the subsystem delete path will remove each of the listeners
before deleting the subsystem itself. So the discovery log
will still get updated when those listeners are removed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id01bbfa3b24d3e1279a614a2fd60be41387a03b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10545
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The NVMe bdev module enables asynchronous IO QP creation by default, after
calling `spdk_nvme_ctrlr_alloc_io_qpair` and `spdk_nvme_ctrlr_connect_io_qpair`,
the queue pair is in connecting state at the beginning, then users may call
`spdk_nvme_ctrlr_free_io_qpair` immediately, and the common layer will
change queue state to NVME_QPAIR_DISCONNECTING and NVME_QPAIR_DESTROYING,
so in function `nvme_pcie_ctrlr_delete_io_qpair` the workaround to wait
for create cq/sq callbacks will not be called, instead of using the common
layer queue state here, we should use the internal `pcie_state`.
Fix#2245.
Change-Id: I801caf26563464b135035bf7fa2f63def13de9f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10445
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Batching will be made available for DSA specifically through the new
idxd_perf tool.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ic51d9ad3692074805b1ffa705cea8be35737c778
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9846
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The NVMe bdev module will support two features, delayed reconnect and
delete after multiple failures of reconnect to improve error recovery.
The recently added two APIs, spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async(), were not good enough.
spdk_nvme_ctrlr_reset_ctx was not necessary. It had only a pointer to ctrlr.
Using a pointer to ctrlr directly saves us from undesirable malloc error
processing.
Separate spdk_nvme_ctrlr_reset_async() into spdk_nvme_ctrlr_disconnect()
and spdk_nvme_ctrlr_reconnect_async(). spdk_nvme_ctrlr_disconnect()
disconnects ctrlr including disconnecting adminq.
spdk_nvme_ctrlr_reconnect_async() moves the ctrlr state to INIT.
Then rename spdk_nvme_ctrlr_reset_poll_async() by
spdk_nvme_ctrlr_reconnect_poll_async().
Finally deprecate spdk_nvme_ctrlr_reset_async() and
spdk_nvme_ctrlr_reset_poll_async().
The following patches will change the NVMe bdev module to use these new APIs.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id1d6858dcdc5fc2e9db0a6ebf3f79cab4f9bbcb7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10091
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Future patches will remove some of the more complex conditions
between different configure flags. As a result duplicate entries
might be present in DPDK_LIB_LIST.
Just for tidiness of the DPDK linker args, the DPDK_LIB_LIST_SORTED
is added. Using sort function removes duplicate entries in the list.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I318fd0cebbd30a80d281175b7d48bb3249abb841
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
This library was never used, remove it.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5d0255f4b9ddbe98b349b4253f87e5332fe7057f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10526
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
SPDK supports only maintained LTS versions of DPDK.
For SPDK 22.01 this means DPDK 20.11 and 21.11.
This patch removes paths for earlier versions of DPDK.
There is no need to check if library is present for the following:
- rte_telemetry was added as rte_eal dependency in DPDK 20.05
- rte_kvargs was added as rte_eal dependency in DPDK 18.08
- rte_pmd_aesni_mb, rte_pmd_isal, rte_pmd_qat were removed in DPDK 20.11
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I30c4cdb0fe0634db50bc34d7d6c232806ff49960
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10525
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
At this time raid5 bdev does not depend on rte_hash in any way.
Meanwhile NVMe-oF Fibre Channel transport does.
This patch reflects that in the mk file for env_dpdk.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I2ba3e016337866f80fc7a6043cef87bf33cf2373
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10523
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The nvmf library will use INTEL VID/SSVID/IEEE values by default,
each transport can overwrite them if needed.
Change-Id: I9dad521c4d080b6f0cc1aaeb4b5d5f6863c6846d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10095
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Also we don't treat exceptions when getting INTEL log pages
as a fatal error, the initialization will still contine.
Change-Id: Ic2fd2be510fde2679c1546482934d0a180266936
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10341
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When NVMe passthru command (IO or admin) fails on submission (e.g. it
is not supported), set DNR bit in completion status field. There is no
sense in retrying the command in this case.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I55960c128bd9fc31f6defef0b9832259a71684b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8578
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If NVMe admin passthru command is not supported by underlying bdev,
set status code in NVMe completion to INVALID_OPCODE.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I29c4e1f8263b76b27c199cfd2d9b2474432ec70b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10517
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The originally detected problem is that SPDK NVMf target fails command
with invalid opcode with status code INTERNAL_DEVICE_ERROR instead of
INVALID_OPCODE. All unknown commands on IO queue are passed to
underlying block device layer as NVME_IO type. It is not checked if
this type of commands is supported and, when command fails,
INTERNAL_DEVICE_ERROR is set as status code. If command fails on
submission, status code is set to INVALID_OPCODE which is more
relevant.
This patch adds check if command type is supported to
bdev_nvme_*_passthru functions. If not supported, it is failed with
ENOTSUP.
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com>
Change-Id: I4d7f7639da17dd3b1dc3eee7eb1b4a4f876117a2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8567
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
SPDK shouldn't use `free` to free the memory allocated by rte_zmalloc_socket.
Otherwises, the vhost-blk/scsi will continuously crash.
In this patch, SPDK don't free the dpdk allocated memory,
DPDK will free it finally. Add a flag to indice the resubmit handle.
Change-Id: I85fd84b7d27a091830006a0f84d541c48290cbb3
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10383
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Move the CONFIGURE_AER state before SET_KEEP_ALIVE to
make sure that we run the CONFIGURE_AER state for
discovery controllers.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia4e24f6507c43e3fece06b9161ff8e0b8fa0e97d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We will actually run the CONFIGURE_AER state for
discovery controllers in a future patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib114beb886ab4b9214e4525479eb5ec7e038e5d9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10331
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
in case of failure groups shall be destroyed
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I8933f4128a7a3361bbb55d6a9c08a540521e5bda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10435
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Doorbell offset starts from 0x1000 is defined by the NVMe
specification, so rename it to remove `VFIO_USER` prefix.
Change-Id: Ie34b12b3d2618f9b0ad0cf7ccbb103ad2c900f47
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10364
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Calculate supported maximum number of queue pairs based on
BAR0 size, this value isn't allowed to change at runtime, also
define BAR4/5 based on number of MSIX vectors.
Since the maximum number of queues is a large value(512), so we
still define a default value when starting, users still can
overwrite this value with a number no greater than 512.
Change-Id: I1b4b6bdf2ff9d129c8bdd493ffdf0a51f8772d51
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10334
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
libvfio-user will save a copy inside the library.
Change-Id: If7bb052b03fb92e46abe50fa945b812d149ef01d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10363
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This reverts commit d9561c444f.
This patch is incorrectly iterating the CPU mask assuming it is
contiguous. However, rather than fix it, let's just let the kernel
scheduler place the thread where it thinks is best. It's going to prefer
idle cores anyway. So reverting is the simplest way forward.
Change-Id: I7b66cce7bfb6ddb108aa7576f508aa3b02b79138
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10475
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These can produce a lot of output, which doesn't really give any
additional information.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I572cd203d61c717ce6400f67ef27ec1d7bb54c0c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10414
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
adding transport to tgt should be the last step
also there is an issue before change i.e. if calloc failed then
transport remains on the list
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Iaf29cfc7b0f535d40160c6fdf9ef6a7e6bfb127c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10429
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
For error case, just set ctrlr->num_aers to 0, and
then the loop won't execute at all. This avoids an
extra call to nvme_ctrlr_set_state() and simplifies
the code a bit.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iff7bbf6e03d18b5f553b9e8527b4c803db583917
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10330
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Discovery services using the SPDK nvme driver may
use long-lasting connections that detect AER completions
to determine when there are changes in the discovery
log. This means that we still need to send keep alives
on discovery controller admin queues. So move the
SET_KEEP_ALIVE_TIMEOUT state immediately after
IDENTIFY, and run the SET_KEEP_ALIVE_TIMEOUT state
even for discovery controllers.
Note, we need the IDENTIFY's KAS value to properly
set the keep alive timeout, so we have to keep the
IDENTIFY state before SET_KEEP_ALIVE_TIMEOUT.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c6403c28fb72d42629c5f9009a89c4bfd44d162
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10329
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Keep alive is valid for discovery controllers, so don't overwrite
the value requested with zero in nvme_fabric_ctrlr_scan().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7dcda6ebf4ab1c8a9085e4e3a02b814d8e586a97
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10328
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Next patch in series will look up the struct spdk_sock_group_impl
from spdk_sock_group in spdk_sock_get_optimal_sock_group().
Since this is third place it will be used, make this function common.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I43d6472016782e78709c1d52aa74abf594e5bfe6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10347
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When no optimal poll group exists for a qpair,
assignment for round robin happens in spdk_nvmf_tgt_new_qpair().
RDMA transport implments the logic for this assignment in
nvmf_rdma_get_optimal_poll_group().
TCP relied on the spdk_nvmf_tgt_new_qpair() instead.
This resulted in race condition when looking up and assigning
optimal poll groups - see #2113.
To remedy that, TCP now follows the same pattern as RDMA.
Next patch will improve the sock map lookup to fix the #2113.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I672d22ac15d06309edf87ece5d30f8e8d1095fbb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10270
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There is a bad memory corruption where the code for CUSE attempts to write
one byte (with value 0x1) after the memory is freed.
Context:
When the CUSE device is unregistered, the poller thread is signaled with
fuse_session_exit(), which writes the value 1 to fuse_session::exited.
The poller thread then detects with fuse_session_exited() that it must exit
the routing and finally destroys its own fuse session with
cuse_lowlevel_teardown() before it exits.
However, FUSE may also call fuse_session_exit() for its internal purposes.
I'm not sure exactly under what conditions that happens, but I added
some trace messages and I could clearly see that the CUSE thread exits
before it was requested to exit in cuse_nvme_ns_stop().
If the poller thread early-exits, it would destroy its own FUSE session
(and free the memory) before fuse_session_exit() gets executed, causing
the memory to be corrupted with a single byte of value 0x1.
Reproducer:
The bug can be reproduced by resetting the FUSE session to NULL after it
is destroyed. This will cuse_nvme_ns_stop() to crash with a segmentation
fault in fuse_session_exit() because it tries to access a NULL pointer.
static void *
cuse_thread(void *arg)
{
[...]
free(buf.mem);
fuse_session_reset(cuse_device->session);
+ cuse_device->session = NULL;
pthread_exit(NULL);
}
This fix:
The fix I suggest is to destroy the FUSE session with
cuse_lowlevel_teardown() after the thread is joined.
Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
Change-Id: I47202891a358f139506845110b012f840974b6fc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9931
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_nvme_ctrlr_free_io_qpair can be called when
qpair is already disconnected. In that case qpair's
state is changed to NVME_QPAIR_DESTROYING and
transport's ctrlr_delete_io_qpair callback is
called. RDMA and TCP transports call
nvme_transport_ctrlr_disconnect_qpair in
the callback and since qpair's state is
not DISCONNECTED or DISCONNECTING, qpair
is disconnected for the second time.
If spdk_nvme_ctrlr_free_io_qpair is called
when qpair is in ENABLED state than nothing
changes, qpair will be disconnected before destroy.
PCIE/vfio_user don't implement transport disconnect
callback, so they are not affected.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I23e11856ecafb51669acf4a3118be049c11eecda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10326
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I58794b7b946eeb8ff82512905af0a296e3b534aa
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9817
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For DSM command, the NVMe drive may take a long time to finish it,
if we set a small timeout value for DSM command, the bdev/nvme module
will try to reset the IO queue pair when timeout happens,
in `spdk_nvme_ctrlr_free_io_qpair`, we will abort the outstanding
IO requests first, then in the `nvme_pcie_ctrlr_delete_io_qpair`,
we will poll the CQ for any requests that have been completed by
the NVMe controller, if there are NVMe completions in the CQ,
we will finish them again, thus double completions happened.
Here we rename `nvme_qpair_abort_reqs` to `nvme_qpair_abort_all_queued_reqs`,
so the common layer will just abort queued request, and let each
transport to abort outstanding requests case by case.
Fix#2233.
Change-Id: Icae6214239160c615418cb514fc51cfe77b59211
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10233
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add a parameter which determines the owner of the
map - target or initiator. It allows to set different
access flags when creating Memory Regions
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I0016847fe116e193d0954db1c8e65066b4ff82bf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10283
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The unit test test_nvme_cuse_stop() manually creates 2 cuse devices
and executes nvme_cuse_stop(). Problem is that the Fuse session is
never initialized for those 2 cuse devices, causing cuse_nvme_ns_stop()
to access 'ns_device->session', which is a NULL pointer.
This bug is detected by ASAN as follows:
==77298==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000180 (pc 0x7fdac6d7d40e bp 0x000000000000 sp 0x7fff74768320 T0)
==77298==The signal is caused by a READ memory access.
==77298==Hint: address points to the zero page.
0 0x7fdac6d7d40e in fuse_session_destroy (/usr/lib64/libfuse3.so.3+0x1640e)
1 0x40dc7a in cuse_nvme_ns_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:851
2 0x40df59 in cuse_nvme_ctrlr_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:923
3 0x40f103 in nvme_cuse_stop /home/vagrant/spdk_repo/spdk/lib/nvme/nvme_cuse.c:1094
4 0x415803 in test_nvme_cuse_stop /home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut.c:393
5 0x7fdac724c1a6 (/usr/lib64/libcunit.so.1+0x41a6)
6 0x7fdac724c528 (/usr/lib64/libcunit.so.1+0x4528)
7 0x7fdac724d456 in CU_run_all_tests (/usr/lib64/libcunit.so.1+0x5456)
8 0x415a4e in main /home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut.c:415
9 0x7fdac62351e1 in __libc_start_main (/usr/lib64/libc.so.6+0x281e1)
10 0x403ddd in _start (/home/vagrant/spdk_repo/spdk/test/unit/lib/nvme/nvme_cuse.c/nvme_cuse_ut+0x403ddd)
The fix is to call fuse_session_destroy() only if the fuse session is != NULL.
Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
Change-Id: I41881243227d83e8d1e6b90e72c1b6d62ccd98d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10225
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We allocate from the head, so it's better to free to
the head too for better cache utilization.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c32244f446bd7a1df12eefc81245b3ef7e24070
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10193
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We allocate tasks from the head, so it's better to
free them to the head too for better cache utilization.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I67c23e3d89cda16f94b1770eada5465015ddb6ff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10192
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The NOTICELOGs really clutter the output during
application start - it's better to make these DEBUGLOGs
instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ae37d5d057d7b972017befbc0834de414b9710b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10191
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This limitation doesn't really take effect currently,
since the typical number of slots per channel isn't
bigger than MAX_COMPLETIONS_PER_POLL. But there's
no reason for this limit anymore - we should always
poll as many completions as we find.
It's better to remove this now, in case we have
configs in the future with higher number of slots
per channel.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5d90d41e5142622b79d9765fbc62da1516e2b8be
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10189
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
spdk_bdev_io_get_nvme_status() does not follow NVMe abort command
about return values.
NVMe abort command sets completion status to SUCCESS both for success and
failure cases and differentiates only the bit 0 of cdw0.
lib/nvmf do not use spdk_bdev_io_get_nvme_status() but checks only
success or failure at completion.
So there is no issue now but let spdk_bdev_io_get_nvme_status()
follow NVMe abort command. In future, the user of spdk_bdev_abort()
may use spdk_bdev_io_get_nvme_status().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic4a08056bd8a1aee4c400f72ef5de7c68e23990b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9977
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
The pointer to struct spdk_nvmf_ctrlr is used to save mandatory
controller registers to the migration region.
Also rename some ctrlr/qpiar to vu_ctrlr/vu_qpiar.
Change-Id: Ifcb862bf4543a9df3c62d3b0a57b7f93228ccaba
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7629
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
1. use the transport lock to protect transport endpoints list.
2. don't use mixed errno and -1 as the return value, use -1 for all error cases.
Change-Id: I657fa06a6d82ee8dbeefaa3397df2285ba574b80
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9579
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When destroying controller, we will disconnect each connected qpair,
and in the spdk_nvmf_qpair_disconnect() call, qpair_fini() will also
try to hold the same lock, so existing vfio-user implementation assume
that qpair_fini() will not be called in the same context. Patch
https://review.spdk.io/gerrit/c/spdk/spdk/+/8963 remind me that
vfio-user has this issue. While here, we add one more thread poll
to avoid such issue.
Change-Id: I83b82ddcce3eb54c724291223e794dcb53a08059
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9998
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For live migration support in vfio-user transport, we need to pause
the subsystem when starting migration in source VM, then after
migration, the subsystem is in paused state, when exiting the
application, we will call spdk_nvmf_subsystem_stop() at last,
and existing code will assert this case.
Change-Id: If5214c45973b27f6092c4a6d71ede336e54d89e8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9407
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
The recent changes merged multiple Data-OUT PDUs within the same
sequence into a single subtask up to 64KB.
However, they were not enough.
For a large write operation, the hardware iSCSI HBA host sent an immediate
data whose size was not block size multiples and then more solicit
data through R2T exchanges.
One example for a 64KB write operation was as follows:
host sent SCSI Write with 5792 bytes and F = 1
target replied a R2T
host sent Data-OUT with 15880 bytes
host sent Data-OUT with 11536 bytes
host sent Data-OUT with 2848 bytes
host sent Data-OUT with 11536 bytes
host sent Data-OUT with 5744 bytes
host sent Data-OUT with 12200 bytes and F = 1
The hardware iSCSI HBA host can decide the size of the unsolicited data
but the SPDK iSCSI target can require the host to send the solicited data
whose size is block size multiples.
Hence we merge immediate data to the following R2T data if the immediate
data is not more than 64KB and more R2T data come.
Add another test case to check if the fix works for the above example.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4906b4e1a8b61e08862f4ccc27a6caf165126530
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9708
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This clean up will make the following patches easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1ad288ec16aec69a168e0f3019b68e11132b65c9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9707
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Issue: spdk_top tracked pollers by the poller name string and the
thread_id they are running on. This shows incorrect stats when
multiple pollers exist on the same thread with the same name.
Solution: Added a unique poller id for each poller on a thread and
to allow spdk_top to track pollers by thread_id and poller_id.
Signed-off-by: Michael Piszczek <mpiszczek@ddn.com>
Change-Id: I1879e2afc9a929d1df9e8e35510f0092c5443bdc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5868
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Wait until the namespace is attached, where it does this operation
again. As of this commit it doesn't really matter because it is just
filling in some values in a structure and if it does it twice it's not a
problem. But later when we only allocate active namespaces, we do not
want to allocate the namespace twice.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ie28653b178975d1ca80bf71ca6b5095224f1c5d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This function destructs a single namespace and removes it from the
controller.
Change-Id: I4b7b3576beda85c9ddad4e0f2db6d1964fa72b82
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10024
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This was sometimes used as the maximum array index and sometimes as the
maximum count. Make it consistent everywhere and give it a better name.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I518efd99a7d36584624490b0b3497bb6e81ce9ac
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10101
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
list
The list should always be null terminated, but add an additional layer
of buffer overrun protection.
Change-Id: Iee31057fdca5ec4a6177615dff5171e5cb07984e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10027
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Eugene Kochetov <ekochetov@yandex.ru>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This quirk was already applied to the 0x0A53 SSD, but is
likely needed on 0x0A54 as well.
Possible fix for issue #2231.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I36a48ab92d1698a411472f714b5108413bbc3c56
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10162
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In the bdev-zone API, there are a few functions that takes a zone_id:
spdk_bdev_get_zone_info(), spdk_bdev_zone_management(), and the
spdk_bdev_zone_append() functions.
The way a zoned application is usually written is that it starts off
by getting the zone report for all zones (zone_id will be sent in as 0),
and then the application will keep the whole zone report in memory.
Therefore, an application usually have access to the zone_id/zslba for
all zones. However, there are cases, e.g. when getting an error on write,
where the completion callback will only have the lba of the write that
failed.
Add a helper function that can be used to get the zone_id/slba for a
given lba. Having this helper in bdev-zone will avoid SPDK applications
needing to provide their own implementation for this.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I978335f87f7d49bc33aed81afcaa6d9f0af8a1e4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10180
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
DPDK vhost will call `new_device` when the VRINGs are
queue paired(virtio-net) or all the VRINGs are started.
However, for virtio-blk/scsi, SeaBIOS will only use one
VRING queue, DPDK added a workaround patch to add
`pre_msg_handle` and `post_msg_handle` callbacks to let
devices other than virtio-net to process such scenarios.
In SPDK, we will start the device when there is one valid
VRING, so there is a case that SPDK and DPDK have different
state for one device. For a virtio-scsi device, SeaBIOS will
only start the request queue, and in the BIOS stage, SPDK will
start the device but DPDK doesn't think so. If users killed
SPDK vhost target at the moment, in `session_shutdown`, SPDK
will expect DPDK to call `destroy_device` to do the cleanup,
but DPDK won't do that as it thinks the device isn't started.
Here in `session_shutdown`, SPDK will do this first, it's OK
that DPDK will call another `destroy_device` for devices that
have the same state both in SPDK and DPDK.
Fix issue #2228.
Change-Id: Ib76dd54c8fa302ffe6da9b13498312b7d344bbfe
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10143
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
_stop_session() is called while holding the global vhost lock,
and in the caller we do release the vhost lock, so even for the
error return from device backend, we don't need to release it
in _stop_session().
Change-Id: I08fef64f900bb42ee68bf02b4c4f1406e903a8a6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10142
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
`struct spdk_vhost_dev vdev` in `struct spdk_vhost_scsi_dev` can be
unregistered in `vhost_scsi_dev_remove`, so we can't use it
anymore in other places after `vhost_dev_unregister`.
Ideally `state->remove_cb` should not take the `vdev` as
the input parameter either, but I don't find it's used
anywhere, so leave it unchanged.
==29555==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000006df0
READ of size 2 at 0x602000006df0 thread T0 (reactor_0)
#0 0x7f3c246c0f0a (/lib64/libasan.so.5+0x9cf0a)
#1 0x7f3c246c3c15 in vsnprintf (/lib64/libasan.so.5+0x9fc15)
#2 0xa55cfa in spdk_vlog /spdk/lib/log/log.c:158
#3 0xa5596f in spdk_log /spdk/lib/log/log.c:110
#4 0x842e43 in remove_scsi_tgt /spdk/lib/vhost/vhost_scsi.c:208
#5 0x851508 in vhost_scsi_dev_remove_tgt_cpl_cb /spdk/lib/vhost/vhost_scsi.c:1149
#6 0x8383f1 in foreach_session_finish_cb /spdk/lib/vhost/vhost.c:1144
#7 0x9d3223 in msg_queue_run_batch /spdk/lib/thread/thread.c:703
#8 0x9d73fe in thread_poll /spdk/lib/thread/thread.c:919
#9 0x9d7c3b in spdk_thread_poll /spdk/lib/thread/thread.c:979
#10 0x8812fe in _reactor_run /spdk/lib/event/reactor.c:920
#11 0x881bf1 in reactor_run /spdk/lib/event/reactor.c:958
#12 0x88292b in spdk_reactors_start /spdk/lib/event/reactor.c:1060
#13 0x873ff9 in spdk_app_start /spdk/lib/event/app.c:585
#14 0x408044 in main /spdk/app/vhost/vhost.c:105
#15 0x7f3c23691f42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
#16 0x407add in _start (/spdk/build/bin/vhost+0x407add)
0x602000006df0 is located 0 bytes inside of 8-byte region [0x602000006df0,0x602000006df8)
freed by thread T0 (reactor_0) here:
#0 0x7f3c2473191f in __interceptor_free (/lib64/libasan.so.5+0x10d91f)
#1 0x8369f2 in vhost_dev_unregister /spdk/lib/vhost/vhost.c:1024
#2 0x84f32d in vhost_scsi_dev_remove /spdk/lib/vhost/vhost_scsi.c:913
#3 0x83cdb7 in spdk_vhost_dev_remove /spdk/lib/vhost/vhost.c:1494
#4 0x83ed66 in vhost_fini /spdk/lib/vhost/vhost.c:1644
#5 0x9d3223 in msg_queue_run_batch /spdk/lib/thread/thread.c:703
#6 0x9d73fe in thread_poll /spdk/lib/thread/thread.c:919
#7 0x9d7c3b in spdk_thread_poll /spdk/lib/thread/thread.c:979
#8 0x8812fe in _reactor_run /spdk/lib/event/reactor.c:920
#9 0x881bf1 in reactor_run /spdk/lib/event/reactor.c:958
#10 0x88292b in spdk_reactors_start /spdk/lib/event/reactor.c:1060
#11 0x873ff9 in spdk_app_start /spdk/lib/event/app.c:585
#12 0x408044 in main /spdk/app/vhost/vhost.c:105
#13 0x7f3c23691f42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
Change-Id: I511c4316a838cd92961d57c9193d384acd49d760
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10141
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously task->current_data_offset was updated by add_transfer_task().
However, the following patches will merge unsolicited data and solicited
data into a single subtask. It will be possible that add_transfer_task()
is called but subtask is not submitted. As a preparation, extract
updating task->current_data_offset into iscsi_pdu_payload_op_scsi_write().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5262bb883fa2a081be1f087181de98d4c3c24d69
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9706
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When data segment size is 64KB and data digest is enabled, if
data segment and data digest are split into different two packets,
- pdu->mobj[0] became full first when reading data semgment,
- pdu->mobj[1] was allocated but unused and data digest was read.
In this case, two SCSI write tasks were submitted by mistake and
the second SCSI write task had no data.
Fix the bug in this patch.
When iscsi_pdu_payload_read() is called and pdu->mobj[0] is full,
allocate pdu->mobj[1] only if any of data segment remains to read.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9a0c36c05f90092c3c2122a7eb91e10976830b40
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9965
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We had not considered a case that incoming data to the second data
buffer was split into multiple TCP packets when merging incoming data
up to 64KB.
We do not change the unit test because we already have data check
and it is very hard to include partial read into the data check.
However, it is very usual that incoming data is split into multple
TCP packets. The feature to merge incoming data up to 64KB will be
actually enabled in the following patches. So we rely on the I/O test
to verify this fix.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I50d702d6c118bc16f0767845136e14414ccdf813
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9736
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The internal device list isn't used anywhere, and will cause ASAN
error because we didn't remove the entry from the device list when
destructing controller.
Change-Id: Ie97bf10ca44ff773a8bc5f0476611b3844ef901a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10109
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There is no need to sum SPDK_CEIL_DIV(length, io_unit_size) and
req->iovcnt as the later is always zero (assignment in spdk_nvmf_request_get_buffers).
Checking SPDK_CEIL_DIV(length, io_unit_size) is enough.
Signed-off-by: Maciej Szulik <maciej.szulik@intel.com>
Change-Id: I0fea1d86706d83e4dd9083e6f0ce09e4e4b033a8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10089
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The specification says "host specifies an offset (i.e., LPOL and LPOU)
that is greater than the size of the log page requested, then the
controller shall abort the command with a status of Invalid Field
in Command."
Offset is used (if needed) to retrieve specific records of
Discovery Log Page, so we don't check it for Discovery Log Page.
Change-Id: I76ce929600b9f2ca9b69397d25f339d55729e6d3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10093
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This can be used for multipath validation.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iba62c5e90b22a9a85078fbc783d3a7273c029cde
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10137
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
CAP.CQR (Contiguous Queues Required) is always 1, so we should
return invalid field when PC bit is 1 and return Invalid Interrupt
Vector if Interrupt Vector is too big.
Also fix the issue that just creat/delete a CQ.
Change-Id: I7cc64a5946a1ed3161448fca8b433d08e5fee715
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10080
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
PCI event module currently requires use of SO_RCVBUFFORCE socket option
which is restricted to CAP_NET_ADMIN. Retry with SO_RCVBUF for non-root
(unprivileged) processes where this capability is not available.
Return -ENOSPC if receive buffer is not of sufficient size.
Fixes issue #2224
Signed-off-by: Tom Nabarro <tom.nabarro@intel.com>
Change-Id: I0bed1b1eac0c7e8601d3d172d8027380ec8be391
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10126
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We already provides the API `spdk_nvme_ctrlr_is_fabrics` to return
input controller is fabrics controller or not, but it needs a controller
data structure as the input, so here we add another API to do the same
thing and it takes the transport type as the input, with this change,
both nvme and nvmf library can use the API.
Also we should treat UINT8_MAX(255) as valid fabrics transport type.
Change-Id: Ib62e7d3eca3da1ddb1a4cc55b0b62e274522f1ce
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10059
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This makes it more clear why reading a JSON
configuration file failed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9dae907c839d48068044b09f34b89a2de51cd811
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10057
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
ftl_dev_dump_bands accumulates a total in a local
variable, but the final value never gets used.
So just remove the variable completely.
Found with clang-13.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7a92f6bfa4ae56fc4d8189c887bf0f6d4a05d759
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10055
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
As per the nvme specs,
If OPTPERF is set to ‘1’ indicates that the fields
NPWG, NPWA, NPDG, NPDA, and NOWS are defined for this namespace and
should be used by the host for I/O optimization
Setting NPWA, NPDG, NPDA same as NPWG and NOWS same as MDTS
Fixes#2197
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: Ic769a21b6821fa731eeae83e7d30c380e8092e37
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10007
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
User may configure opts.max_qpairs_per_ctrlr, so use
opts.max_qpairs_per_ctrlr instead of using fixed default
NVMF_VFIO_USER_DEFAULT_MAX_QPAIRS_PER_CTRLR.
Also do not allow users to configure max_qpairs_per_ctrl >
NVMF_VFIO_USER_DEFAULT_MAX_QPAIRS_PER_CTRLR.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: Id9f1f796559f3bb8b2d3f5031606050470681b99
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9994
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We generally shouldn't do ERRLOGs based on bad
inputs from the host, so change some of these to
DEBUGLOGs instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id42a8e51964e0238cd2841be8ac1d4ccaa1d150d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10037
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Details of the changes here:
918fd2f146.
They mainly target the aesni_mb driver which was moved to ipsec_mb and
bump the minimal supported version of the ipsec to v1.0.
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: Ica3b4fd66684751939159511845eb6ac6f7d5205
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10003
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
When doing controller reset and shutdown, we may change the
CSTS.RDY and CSTS.SHN even there are pending IOs in the IO
queues, so here we add a timer in the reset and shutdown
callback, it will change the status when there are no
connected IO queues.
Fix#2199.
Change-Id: I3a54d30b257973661b269ad5e37637490f9390f4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9908
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
SPDK nvmf target reports all listeners on all subsystems
in discovery pages, kernel target reports only subsystems
listening on a port where discovery command is received.
NVMEoF specification allows to specify any addresses/
transport types. Ch 5: The set of Discovery Log entries should
include all applicable addresses on the same fabric as the
Discovery Service and may include addresses on other fabrics.
To align SPDK and kernel targets behaviour, add filtering
rules to allow flexible configuration of what should be
listed in discovery log page entries.
Fixes#2082
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ie981edebb29206793d3310940034dcbb22c52441
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9185
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
The qpair's state member is only 3 bits of a uint8_t,
and the in_completion_context bit is another bit in that
same uint8_t.
We know that the qpair's state is only ever updated by
one thread, but it is possible that the state could
be modified by one thread, while another thread
is modifying in_completion_context.
in_completion_context is only modified by the thread
that is polling the qpair (or the qpair's poll group).
But with async mode, another thread that has a qpair
on the same PCIe controller could poll its adminq and
reap the SQ completion for the qpair that's owned by
the other thread.
So do *not* set the generic qpair state to CONNECTED
from the SQ completion callback. Instead just set
the pcie_state to READY, and let the thread that owns
the qpair detect the qpair is READY and set the state
to CONNECTED itself.
Fixes issue #2157.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9efc0c954504f1841e1c3890ae78211ad0d1990e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9975
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
This reverts commit 3b1f13ef29.
It seems like this particular commit is causing failures on the
CI side related to the following issue:
https://github.com/spdk/spdk/issues/2214
Reverting for now to make the CI stable.
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: I7f32c612b8549477de0b9079d114cec4ec9bda59
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9986
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is a race condition between controller destruction and
subsystem state change, e.g. admin qpair may already be freed
when a namespace is added or removed. As result in function
poll_group_update_subsystem we may get heap-use-after-free error
Another problem is that some qpair's live time may exceed controller's
life time. To avoid it, start controller destruction process when the last
qpair finished the disconnect process (previously controller started
the descruction process before the last qpair starts to disconnect
and it could lead to raise conditions)
Fixes#2055
Change-Id: I11612979eb914a5fdcc7b9e3c812bf1e450b6120
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8963
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Matt Dumm <matt.dumm@hpe.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Some of our tpoints have a name exceeding 24 characters.
Althought this is not problem for SPDK, it might cause
confusion, because error messages are printed.
Tpoints regstered inside fc.c had their _REQ_ part removed,
since it was used in all of them.
Fixes#2208
Change-Id: I598eb9c1d252d8ca6c83f82e564a6b53037936f4
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9963
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
DPDK will report error when detaching the PCI device in secondary
process, because SPDK will return -1 in `pci_device_fini`, so
here we will reset the `attached` flag before that.
Also return the errno instead of -1.
Change-Id: I3efa4d97ceab504215faeb9d3d80a694bdd6014c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7944
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
When intr mode is enabled, it will be common that
poller is unregistered during interrupt processing.
Since poller unregister is a delayed operation,
mark it in spdk_thread object, and reap unregistered
pollers out of poller execution.
Fixes#2143
Change-Id: Ieb61fc7685f85af5c15e833dd1dd56f8c97a3b12
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5770
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch makes use of the changes from previous patch and to
show connections between bdev events and tcp events.
Change-Id: If28c256d74a9a5d581ee4d8292a08fc061fee968
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9622
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Currently we do not have any way to connect traces from different
modules in SPDK. This change modifies our trace library
and app/trace to handle adding relations between trace points
and a trace object.
Additionally this patch adds classes and fields to structs
inside trace.py to prepare it for future patches implementing
printing relation information.
Change-Id: Ia09d01244d923957d589fd37e6d4c98f9f7bbd07
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9620
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
There could be an outstanding for_each_reactor operation
in progress when the application shuts down. If we don't
wait for that operation to complete, we'll get a memory
leak for its context.
So when stopping the reactors, before setting the
state to EXITING, do one final for_each_reactor to
ensure any outstanding for_each_reactor operations are
flushed. Once this final for_each_reactor operation
is complete, we can set the state to EXITING safely.
Also use a global mutex and flag to ensure no more
for_each_reactor operations can start while the final
for_each_reactor is in progress. If a new operation
is requested, we'll just ignore since the reactors are
shutting down anyways.
Fixes issue #2206.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6d86eb3b0c7855e98481ab4164af82e76c009618
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9928
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
nvmf_ctrlr_cc_shn_done() and nvmf_ctrlr_cc_reset_done() are
almost same, so consolidate them together by adding a flag.
Change-Id: Ib7714d31a40f9d8d344ec2630c083c5d76dac8a1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9907
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Currently either HW Engine Channel or SW Engine Channel will be used.
In the case that HW Engine Channel is used while does not support related
operations like IOAT for CRC, it will shift back to the SW Engine's handle.
So that this is an issue that it still refers to the HW Engine Channel
while needs SW Eninge Channel to handle.
This patch introduces the SW Eninge Channel and always initializes there
in case that HW Engine does not support some operations.
Related UT also added to simulate the case the IOAT does not support CRC
and then SW Eninge needs to properly handle it.
Change-Id: I4ecdcd09ab669a616b37c567b45b1e6499800ec9
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
In some cases a single virtually contriguos memory
buffer can be translated to several chunks of memory.
To make such translation possible, update structure
spdk_memory_domain_translation_result to use a pointer
to iovec.
Add a single iov structure or cases where translation
is always 1:1, it will make easier translation callback
implementation. For RDMA transport translation of address
is always 1:1, so treat iovcnt other than 1 as an
error.
Change-Id: I65605575d43a490490eba72c1eb19f3a09d55ec6
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9779
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Instead of a union with domain type specific
parameters, store an opaque pointer to user
context. Depending on the memory domain type,
this context can be cast to a specific struct,
e.g. to spdk_memory_domain_rdma_ctx for RDMA
memory domains.
This change provides more flexibility to
applications to create and manage custom
memory domains
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Change-Id: Ib0a8297de80773d86edc9849beb4cbc693ef5414
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9778
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Push operation complements existing pull
operation and allows to implement read data
flow using memory domains.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Change-Id: I0a3ddcb88c433dff7a9c761a99838658c72c43fd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9701
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It's too strict to fail the controller when there are no free requests.
Change-Id: I0a66ff2d294a2fd9326506ea50af4213aaaf8e92
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9848
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
We already support Set Features with Host Behavior, so here also
add the support in Get Features.
Change-Id: I27d973e81fd6be89cc67ad559439334fc1087c9e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9818
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: John Levon <levon@movementarian.org>
nvmf_subsystem_get_qpairs RPC handler may cause the program
could not exit normally (e.g. ctrl c). Reason is that,
spdk_get_io_channel() will be called during the getting
qpairs stream, which will add refcount value for each
existing channel. When end target, channel cannot be
destroyed since refcount be added additionally and
its value could not be subtracted to 0. As a result,
the program will hang in the process of exiting.
So here we don't need to allocate a new channel, just use
the exist one.
Signed-off-by: zhaoshushu.zss <zhaoshushu.zss@alibaba-inc.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Change-Id: I08b4678edaa9404b8e8af125ebae572b31edf77e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9881
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
- remove metadata updater
- handle 'zero' flag in mempool allocator
- adapt ocf_mngt_cache_start() to new OCF API
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Change-Id: I34afd856cc1306ffe305f71a445e7474c9b0a2d9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9129
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
OCF requires separate mempool for each request size from rage 4k-128k. Mempools
are meant for both IO and OCF's internal requests.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Change-Id: I3b37d287bbd6f963a24007c8485002f414fb8a7e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7774
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Adds context to currently existing traces inside bdev.c file.
These are going to be used later to match with traces from
nvmf layer to enable io tracing.
Change-Id: I599a60412f39144cdd306315a184b2000a61286e
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9497
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
This is to help with binding trace objects together and
for the convenience (all trace definitions are in one place
instad of being scattered accross multiple files).
Change-Id: Ib15bc9c2eeee9c4d0816bcee509ab69f3f558e19
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9574
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Enable tracing of tcp qpairs in lib/nvmf/tcp.c.
Change-Id: I692e74a972dcddd0bff193f1703470926e28b4db
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9288
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
This change aims to help with tracepoint handling in the future.
Instead of printing each trace directly with given values, we
will use this function to pass status change value.
Change-Id: Icc7f2863703899f818f0a2d5f49b69aa4e26a62c
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9690
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
The new name suits better to the following "data push"
operation
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ic3249f65de203f375477f8e87b0749b9502d165c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9878
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For GET/SET FEATURES command, some feature IDs will have data buffer
while some don't have, so here we will return the buffer length base
on the feature ID. The length is defined by the specification.
Change-Id: Ie9798585fc74544b77998aeebc2a20614c1c25f0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9689
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It is a bit confusing for the variable names to not match
the parameter names for spdk_bdev_io_get_nvme_fused_status().
These changes should make it a bit simpler to understand.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I833bf2a75cfec7b86da2d8a234f800de3510c2b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9867
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Add the paired spdk_json_write_named_uint8|16 function
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ib7ee9ae4dbe9a4e9cfa28750f0b9a0af597d260c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9788
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previous patches (5363eb3c) tried to work around the
32-bit unmap and write_zeroes LBA counts by breaking
up larger operations into smaller chunks of max size
UINT32_MAX lba chunks.
But some SSDs may just ignore unmap operations that
are not aligned to full physical block boundaries -
and a UINT32_MAX lba unmap on a 512B logical /
4KiB physical SSD would not be aligned. If the SSD
decided to ignore the unmap/deallocate (which it is
allowed to do according to NVMe spec), we could end
up with not unmapping *any* blocks. Probably SSDs
should always be trying hard to unmap as many
blocks as possible, but let's not try to depend on
that in blobstore.
So one option would be to break them into chunks
close to UINT32_MAX which are still aligned to
4KiB boundaries. But the better fix is to just
change the unmap and write_zeroes APIs to take
64-bit arguments, and then we can avoid the
chunking altogether.
Fixes issue #2190.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I23998e493a764d466927c3520c7a8c7f943000a6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9737
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>