iscsi_sgl_append_with_md had not increment _iscsi_sgl::iov
by appended count of iovecs. Hence if data digest is enabled,
data segment will be overwritten by data digest.
This patch fixes the bug.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibcdacb883b2b97ad86cfc39a035c76264090401d
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456451
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Data digest computation should take extended LBA payload but
could not do yet.
This patch adds an new API spdk_dif_update_crc32c() to compute
CRC-32C value for extended LBA payload.
In the next patch, spdk_dif_update_crc32c will be used in iSCSI
target first.
Change-Id: I327f384bb7dfd8b68279b0acec0ee78a40264a26
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456171
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
DIF context had been got for every read of data segment but
DIF context is not changed and so getting DIF context once
when allocating data buffer is enough and is done in this
patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0f386ab594a0d9076fa0206d5fc240f5c790181d
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456772
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The only caller iscsi_conn_handle_incoming_pdus accesses the returned
pdu only when the return code is 1. So we can remove update of
_pdu for other cases in spdk_iscsi_read_pdu.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I54b9f050c3b45c87f4797a90d7606638d6c821ad
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456771
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Instead of just return "-1", the proper error detail will be
returned and reported out.
Change-Id: Ief900494081ddc9ae6329ee7a56723d9cb5efe13
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456937
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
According to the TP 8000 spec in Page 26:
Maximum Number of Outstanding R2T (MAXR2T): Specifies the maximum
number of outstanding R2T PDUs for a command at any point in time
on the connection.
This patch makes the current host driver implementation support one r2t.
We cleanup the code to do the right advertising to the target in the
icreq and avoid attempts to deal with multiple rt2s.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: If06ad2e8bde31c2fd7e1c3739f651fb64040e3a9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455750
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In prep for full QAT compression support later in this patch
series. dpdkbuild/Makefile slightly refactored for readability,
x86 crypto check removed as it pre-dated checks we now have in
configure.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Iaaaf51b9eb5e18840f47d2d4f431c5a6e8c420ee
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456408
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If the qualified core wasn't found, then we should compare
spdk_env_get_core_count() with i, instead of lcore.
Signed-off-by: Li Feng <fengli@smartx.com>
Change-Id: Ie92f56712b7f0e51636008fe12fff5584b6be8ab
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456415
Reviewed-by: wuzhouhui <wuzhouhui14@mails.ucas.ac.cn>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
According to https://tools.ietf.org/html/rfc3720#page-196,
Error detail of "Authorization failure" is for the case that
the initiator is not allowed to the given target.
Change-Id: Ie628c4c857e965aa6694399a9832ce0501e50745
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456826
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
According to the TP 8000 spec in Page 26:
Maximum Number of Outstanding R2T (MAXR2T): Specifies the maximum
number of outstanding R2T PDUs for a command at any point in time
on the connection.
Note that by the spec, the target may only support single r2t
(which is the minimum possible), it doesn't have to use multiple r2ts
even if the initiator supports that. So remove the maxr2t and
pending_r2t variable in the tcp qpair structure.
In the original design, we think that maxr2t is the maximal active
r2t numbers for each connection. So if the initiator sends out maxr2t=16,
it means that all the commands of a qpair can use such number of R2T pdus.
So we need to wait for the available R2Ts for the request when the maxr2t
reaches the maximal value. But it is the wrong understanding of the spec.
In fact, each command has its own number of maximal r2t numbers, then we
do not need to use the wait method for R2T method anymore. So we remove
the state TCP_REQUEST_STATE_DATA_PENDING_FOR_R2T. Futhermore, we adjust
the related SPDK_TPOINT_ID definition.
In current patch, the target will support one active R2T for each
write NVMe command. Thus, we remove the function spdk_nvmf_tcp_handle_queued_r2t_req.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I7547b8facbc39139b4584637ccc51ba8b33ca285
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455763
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The current nvme_tcp_qpair_disconnect behaviour
is not exactly correct, we do not re-initialize
the state of some data structures of the tqpair.
And this caused the coredump.
Purpose: Fixes#808.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I4d2cad8fc0712dbebfc2f3e52373cbe3b9908bf7
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456755
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
MAX_VMD_TARGET defines size of array that need to be passed
to spdk_vmd_pci_device_list()
Change-Id: Ib2a33fe50072036e6a8f1709ac4e3ee82c1bb3f1
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456480
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The memory API has been refactored. It is not possible anymore to
register a memory region more than once. This has been introduced in
this patch: https://review.gerrithub.io/426085
In case of vhost with vvu transport, it often happens that two
consequtive vhost memory regions are mapped to virtual addresses that
lie within the same 2MB address range. This means that the vhost memory
regions may not be 2MB-aligned in the process virtual address space. As
a result, the `FLOOR_2MB()` of those addresses gives the same address.
Thus, we end up trying to register the same 2MB memory range twice.
This issue does not appear in case of AF_UNIX transport. Vhost memory
regions in case of AF_UNIX transport are hugepage backed. Therefore, the
mmapped virtual addresses of those memory regions are always
2MB-aligned. On the contrary, in case of vvu transport, the vhost memory
regions are segments of the PCI memory address space of the
virtio-vhost-user PCI device. This MMIO space is mapped in its entirety
by the DPDK vfio interface along with the other PCI BARs. Ultimately,
the vhost memory regions correspond to offsets in this mmapped PCI
memory region and thus there is no warranty that the mmapped virtual
addresses are 2MB-aligned.
This issue is fixed by skipping the already-registered 2MB memory
regions.
Change-Id: I62c9c257e6f172c894cd3454d2cbeee1986e6189
Signed-off-by: Nikos Dragazis <ndragazis@arrikto.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/441057
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Some existing NULL ptr checks was never hit so
there should be removed.
Change-Id: I80f4e1f063564f2a6915d0a1c548bcf22c4b3075
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456477
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Removed all printf() usages and replaced it with
SPDK log macros.
Change-Id: Ibe536a9bb696ebf3ad7dd1f402b050eca0dfd375
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456476
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Added assertion checking if next element on list
is valid.
Change-Id: I9f4d969dc84e5fbee9d72d764f57fbd9480ab197
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456774
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
spdk_iscsi_read_pdu has been used only in a place,
iscsi_conn_handle_incoming_pdus. iscsi_conn_handle_incoming_pdus
classifies return code of spdk_iscsi_read_pdu to
SPDK_ISCSI_CONNECTION_FATAL, 0, or 1.
Using 0 instead of SPDK_SUCCESS in spdk_iscsi_read_pdu matches
iscsi_conn_handle_incoming_pdus and is done in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I231c2e70a094c26af2c8eb60d77b92d880674901
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456770
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
DPDK rte_ring_enqueue_bulk() has free_space parameter to return
the amount of space in the ring after enqueue operation has finished.
This parameter can be used to wait when the ring is almost full and
wake up when there is enough space available in the ring.
Hence we add free_space to spdk_ring_enqueue() and spdk_ring_enqueue()
passes it to rte_ring_enqueue_bulk() simply.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9b9d6a5a097cf6dc4b97dfda7442f2c4b0aed4d3
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456734
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The length should be no larger than the remaining_size.
For example, The remaining_size(firstly, assigned by payload_size) is 128KB,
and user's sgl length is 1MB. Since we already split the I/O, so we should
not use the original length(1MB), but use the remaining_size.
Fix issue reported by: https://github.com/spdk/spdk/issues/808
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I0a7d0f2282c8ad0e253d8de7091b6c5b87018e9a
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456760
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously, if the return value of nvme_tcp_qpair_process_send_queue
is not zero, we directly return but not continue receiving the pdu.
But this is wrong, we should only handle the case when the
return value is negative.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I83453733f5a3e3350a0461b4cb0bc409fde32fea
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455899
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Stumbled on this one trying to fix new scan-build errors for clang 8 on
fedora 30. scan-build fails because there are paths where this can
result in a null pointer access.
Change-Id: I408376e7eab79ab6d99c35cc662f5737e3518e71
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456671
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Added a utility function for the cleanup related operations.
Change-Id: I4e49dd9c2da899a5bda289bc36778b636af9c5df
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456599
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The function spdk_bdev_io_get_io_channel() was added,
but remained unused. Couple places done the exact same thing,
so now they just use it directly.
Change-Id: Ifa332ce3a489256ccf2f5d0dd9c74a3215c544ce
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456435
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The function spdk_bdev_io_get_thread() was added,
but remained unused. Couple places done the exact same thing,
so now they just use it directly.
Change-Id: Ia25abf57eb88c8df47c83e4edd761a9e02cc1618
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456436
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The metadata buffer was allocated in multiples of blocks (as well as
aligned to a block). It's not necessary and is a waste of space.
Change-Id: Icb13c664e82b23ea09bec16fa3c18fa98b722d57
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455922
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The current location of the metadata pointer should be calculated based
on the metadata size, not block size.
Change-Id: I90177557cf4e194cd77adfbfe1c5a6e0e6df7967
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455921
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We will need to put the recently completed nvme_request
object on the qpair's STAILQ. We don't reference any
real data from the nvme_request in the completion path
since we've already stashed the cb_fn and cb_arg in
the nvme_tracker. But we will need to reference the
STAILQ_ENTRY to put it back in the qpair's STAILQ, so
prefetch that cacheline.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id76122afe4150c84a61fbe38bc874f10d606b3b3
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456673
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There's no need to set this every time we allocate
a request.
While here, fix a typo near where we needed to modify
the unit test to remove the qpair assertion.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8af41a6c483415950f625d1ed2ef46088b75a622
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456270
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Handle IOs with metadata being transferred in a separate buffer if the
device underneath supports it.
Change-Id: I38887a24d2aad51f674a840367b5dcdeda2d5a8b
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451467
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch adds *_with_md family of functions allowing for IO with
metadata being transferred in a separate buffer.
Change-Id: I842d5a00a532cf5d0b0f0738535ea46903674140
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451465
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is no need to keep this functionality in separate
files.
Change-Id: Ie998324cbcfcdc384f912fe36124b082774c0dc8
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456475
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
vring notification mechanism is transport-specific. At present, vhost
dataplane code in `lib/vhost/vhost.c` triggers guest notifications with
`eventfd_write()` system call. But this is an AF_UNIX specific
notification mechanism. This patch replaces `eventfd_write()` with the
existing generic `rte_vhost_vring_call()` function that is part of
DPDK's librte_vhost public API.
`rte_vhost_vring_call()` takes a vring_idx as an argument to associate
the `struct spdk_vhost_virtqueue` instance with the relevant `struct
vhost_virtqueue` instance. We introduce a new `vring_idx` field in
`struct spdk_vhost_virtqueue` to enable this association. This field is
initialized in `start_device()`. In addition, a stub for
`rte_vhost_vring_call()` is added in the vhost unit test file.
SPDK's internal `rte_vhost` copy will not be updated in order to support
the virtio-vhost-user transport. However, an `rte_vhost_vring_call()`
function is introduced in SPDK's `rte_vhost` in order to have a solid
API. This function is just a wrapper of `eventfd_write()`.
Change-Id: Ic93e25cd3f06e92f04766521bc850f1ee80b8ec8
Signed-off-by: Nikos Dragazis <ndragazis@arrikto.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454373
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Request may be submitted several times via nvme_qpair_submit_request
function, such as request in queued_req queue being re-submitted.
With enabling timeout feature, nvme_qpair_submit_request compares
request->submit_tick to zero to check if this is the first submission
for this request. If true, record submit_tick for this reuqest.
So request->submit_tick needs to be set zero in allocation.
Change-Id: Ie3f420aa337802c5ad3962c3fdcd680dec1ccdcb
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456328
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This assert is not needed anymore.
Change-Id: I7bc85c980c2e120b105b35763b9bea5960564acc
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456590
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Keep an array of iovecs tied to each IO. Internal IOs can have
iov_cnt > 1 without having to allocate additional memory. User's iovec
can now be modified too (e.g. when splitting the request).
Change-Id: Iec73c6dad59acdc331a1461fd7feac085138a8de
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455525
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Replaced single IO pointer with a queue. This allows multiple requests
to be queued when the writes cannot be scheduled.
Change-Id: I0cde50e7378108a6aab4ef6fe601cb854244f5d5
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455524
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
On the user IO path, the flow always follows the same procedure:
allocate ftl_io and initialize it using ftl_io_user_init. There's no
point in separating these two actions.
Change-Id: I578347ccc7c85e5945368dd6bf76c58985af3939
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455523
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There's no point in synchronously returning an error from ftl_io_write
/ ftl_io_read as they're also reported in the IO's completion callback.
All errors are now reported through io->status.
Change-Id: I4adf4e13221b63715625276042e1172cd63b8f9b
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455521
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Select the method of choosing next PPA to read inside ftl_submit_read,
as it can be deduced from other arguments.
Change-Id: I6cbd37c2c6d7f073bf735491b4960d72e327c4e6
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455520
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Some of the internal IOs split the contiguous data buffer into multiple
iovecs to separate the requests. It's unnecessary and can be replaced
by checking the position within the data buffer.
Change-Id: I9255ea0072fee6c4e6a7dca21e4008780bc45610
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455519
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Advance parent's IO position when children positions are advanced.
There's no need to call ftl_io_advance twice for the same request
anymore.
Change-Id: I1f7f04a3a83575b11e7bf0b13c0c8f3bf98b5522
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455518
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This is a small optimization.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib593908d3aeb17aac55be06b8e3be42e28a23061
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456268
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Existing code in spdk_bdev_write_zeroes_blocks() will call spdk_bdev_free_io()
for the error case, which will cause assertion because the bdev_io isn't
submitted to the backend yet, so we will check the condtion first to
avoid the error case.
Change-Id: If27d78217f709a3315e74c00869d345abd6b9a69
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453491
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We were printing out the wrong value here.
Change-Id: I7b5f4eaca41317a69167ad5413a1b1913e9c0842
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456278
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This module simply sits on another virtual bdev
and adds a simulated average and p99 latency to that drive.
Change-Id: Ie9fc91e27585fd0636cb7dc845cb41744bf24625
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453594
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Using AVX512 or AVX2 ends up being a small pessimization.
I think AVX works better for copies when there are
multiple cachelines to copy. I see a 2-3% improvement
in high IOPs benchmarks when reverting to SSE.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3d70a1e359e98cec2a9da41ccf9af2de9baa5868
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456247
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Profiling showed these weren't getting inlined - so add
the inline keyword to make sure it happens. This helps
improve performance a bit.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia86edccc9163258efdcddcce6989a71fb180caf6
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456099
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
At 10M IO/s, we see a lot of CPU cycles wasted getting
the next tracker into cache. If we only get one
completion at a time, this is unavoidable, but when
there are multiple completions pending, we can prefetch
the second tracker while processing the completion for
the first.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9de702bee3719e4494eec6f05b09be3672f1e0ac
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456097
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Prior merge contained all of the code EXCEPT for the user-callable function.
Signed-off-by: James Bergsten <jamesx.bergsten@intel.com>
Change-Id: I1cb7105ab85ffae8ed4f600261fed86c9c778893
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456282
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
All the SCSI commands have two states: allowed and conflict,
based on different reservation types and I_T nexus, the SCSI
module will check each commands before sending to the backend
device.
Change-Id: Ib05cece1c237873360f85c68d139362200d2d47e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436097
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Purpose: Make the variable definition consistent
with the same variable in the target side.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ibc4ff92b6346f0a1ad803dcb79d041289f5648b2
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455807
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
QAT driver will not load with anything less than 128, may need to
adjust after performance testing.
Change-Id: I4949e6d97e7ed997d14b907dac00cc288dcc331b
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456069
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
mbufs are freed with a single call to the first mbuf in the chain
unless we hit an error before we chain them. So free them correctly
when chained and loop through the array one at a time and free them
that way if not.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I80ebb86bc2147b42db89d24f8e985cacf7d4bceb
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455019
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We need to move sub_reads++ statement earlier. Otherwise if
the read fails, the sem_wait(&channel->sem) call number
is not correct.
Change-Id: I05281a74bef78e4f80ce8d202b47849d1e6c2009
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453788
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Signed-off-by: Orden Smith <orden.e.smith@intel.com>
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Change-Id: I4a3c4d1ca8d0b18201edf99a67a3d75ca7ab8153
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449499
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The purpose of is patch is to track whether there is
req allocation failure for those file metadata related
operation in order to find the bug.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I171a5b7fca214675fb1356ca650a01eeb20b4c8c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454829
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
_dif_sgl_fast_forward covers _dif_sgl_advance. Hence this patch
merges two into _dif_sgl_fast_forward and renames it to
_dif_sgl_advance.
Change-Id: I4949f3028b08cec8df8c9037c983d840d0a1eaaa
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456184
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch add support for VMD driver object.
New PCI device ID for VMD device was added.
Change-Id: I47bd8772a15ad370a14b7cc9460a177c91e6dd6a
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Signed-off-by: Orden Smith <orden.e.smith@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455545
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie32a1bb144c239b923b5cbb9e608a7dfc9c05208
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456076
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In Fedora release 28, plug in nvme device and run setup.sh,
the uevent is like this:
UDEV [1060.112118] add /devices/virtual/vfio/81 (vfio)
ACTION=add
DEVNAME=/dev/vfio/81
DEVPATH=/devices/virtual/vfio/81
MAJOR=509
MINOR=1
SEQNUM=8544
SUBSYSTEM=vfio
USEC_INITIALIZED=1060111894
UDEV [1060.122089] bind /devices/pci0000:d7/0000:d7:00.0/0000:d8:00.0 (pci)
ACTION=bind
DEVPATH=/devices/pci0000:d7/0000:d7:00.0/0000:d8:00.0
DRIVER=vfio-pci
ID_MODEL_FROM_DATABASE=PCIe Data Center SSD (DC P3700 SSD [2.5" SFF])
ID_PCI_CLASS_FROM_DATABASE=Mass storage controller
ID_PCI_INTERFACE_FROM_DATABASE=NVM Express
ID_PCI_SUBCLASS_FROM_DATABASE=Non-Volatile memory controller
ID_VENDOR_FROM_DATABASE=Intel Corporation
MODALIAS=pci:v00008086d00000953sv00008086sd00003703bc01sc08i02
PCI_CLASS=10802
PCI_ID=8086:0953
PCI_SLOT_NAME=0000:d8:00.0
PCI_SUBSYS_ID=8086:3703
SEQNUM=8545
SUBSYSTEM=pci
USEC_INITIALIZED=1060121805
Have tested several kernel versions such as v3.10, v4.10, v4.15, v4.19.
We didn't see an event which is like this:
ACTION=add
DRIVER=vfio-pci
Change-Id: I7299a2fb4d634edaa6bab3412ee8f363f66aae6f
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452053
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Move actual submission code to _spdk_bdev_io_do_submit, used by
both normal submission path and QoS path.
Previous patch(review.gerrithub.io/c/442127) adds the missing
bdev_io->internal.in_submit_request flag to QoS submission path.
But QoS submission path doesn't handle nomem_io yet.
This patch makes QoS submission path handle nomem_io in the same way
as the normal path and extracts actual submission code into do_submit
function, so that further modification of the submission logic will
apply to both paths automatically.
Change-Id: I41fa88d239c3a2bd9783d812826e32e7c887818d
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455252
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Make the old name a deprecated alias.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibbf50676e0d989b67121e465fc140f94faec46ed
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453033
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We flush a cache buffer once it's filled. When the write
for that cache buffer has completed, we look to see if
there's more data to flush. Currently if there's *any*
more data to flush, we will flush it, even if it's not
a full buffer.
That can hurt performance though. Ideally we only want
to flush a partial buffer if there's been an explicit
sync operation that requires that partial buffer to be
flushed. Otherwise we will end up writing the partial
buffer to disk, and then come back and write that data
again later when the buffer is full.
Add a new unit test to test for this condition. This
patch breaks one of the existing unit tests which was
designed specifically around a RocksDB failure condition.
Change that file_length unit test to now write exactly
one CACHE_BUFFER, which still tests the general logic
making sure that we don't confuse the amount of data
flushed with the value written to the file's length
xattr.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I83795fb45afe854b38648d0e0c1a7928219307a2
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455698
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Data can get implicitly flushed as cache buffers are filled. But
the length xattr is only written in response to a sync or close
operation. So we cannot just look at the amount of data flushed,
and ignore the sync operation if all of the data written has been
flushed - we still need to write the length xattr.
This also adds a unit test which reproduces the original problem.
Fixes issue #297.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icca6ef4d1544f72e9bc31c4ee77d26b4b7f0cce4
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455692
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Interpret volume->open() options as our vbdev_ocf_base struct.
This is used during metadata probe procedure, when there is no
vbdev configurations created, so we need to pass base structure
as option.
This patch is a prepartion for persistant metadata support,
and it does not change current behavior.
Change-Id: I0b7435df1692d8b3028931c6c9fc50d2d84b2557
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455415
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Hotremoving a bdev exposed as a vhost-scsi LUN which
has an allocated io_channel is broken now. By detaching
LUNs from their SCSI devices at the start of LUN
hotremoval, we can end up with the LUN's descriptor
completely leaked.
Upon hotremove, each LUN fires a hotremove_cb to the
upper layer (iscsi/vhost) and starts a poller to
periodically check if lun->io_channel has been closed
already. While iSCSI allocates and frees io_channels
for particular LUNs separately using
spdk_scsi_lun_allocate_io_channel(), vhost allocates
and frees io_channels for the entire SCSI device using
spdk_scsi_dev_free_io_channels(), as it knows there's
only one LUN per SCSI device.
Since LUNs are removed from SCSI devices at the start
of LUN hotremoval, the spdk_scsi_dev_free_io_channels()
is not able to close the io_channel of the child LUN.
That LUN is no longer referenced there, and the function
returns immediately without any error.
By reverting this patch, we practically defer removing
LUN from SCSI device until the bdev descriptor is closed,
that is, until that spdk_scsi_dev_free_io_channels()
and the subsequent spdk_scsi_dev_destruct() in vhost
is called.
Historically, spdk_scsi_lun_free_io_channel() was
introduced after vhost-scsi already supported hotremove,
so before we can introduce a change like this one we
have to refactor vhost to use the new APIs first.
This reverts commit 497997bc60.
Change-Id: I58a9988a22694f5b5b927173098ac8218270462e
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455371
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Kaminski <pawelx.kaminski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Padding requests are issued in multiple batches at the same time
when shutdown occurs. This allows for faster removal.
Change-Id: Iea40d2418bedbd7cf3c6865e5eb8f85871db13cd
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454578
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For the latest TLC NAND, one write buffer unit (rwb batch)
needs to be spread over three PUs instead of being allocated
to a single PU for better sequential read performance
since the optimal write size(ws_opt) of 3D TLC NAND is
3 times bigger than the optimal read size(rs_opt).
I added num_interleave_units in 'struct spdk_ftl_conf'
to configure the number of interleaving units per ws_opt.
If num_interleave_units is set as 1, the whole of the ws_opt
blocks are placed sequentially around each PU.
If num_interleave_units is set as N, the 1/N of the ws_opt
blocks are staggered. So consecutively numbered blocks
are separated by ws_opt / num_interleave_units.
The sequential read performance is improved from 1.9GiB/s
up to 2.97GiB/S with this patch on our system. No performance
degradation is observed on sequential writes or
4KB random reads/writes.
Please refer to the Trello card for more details.
https://trello.com/c/Osol93ZU
Change-Id: I371e72067b278ef43c3ac87a3d9ce9010d3fcb15
Signed-off-by: Claire J. In <claire.in@circuitblvd.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450976
Reviewed-by: Young Tack Jin <youngtack.jin@circuitblvd.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is the same intention as the patch for iSCSI in this series.
This change will be helpful to extract common part into a specific
helper library if necessary in future.
Change-Id: I1ce36b424ff2afb85f998149f4ef0d7153290802
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455621
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Recently DIF library refined SGL create operation by unifying
size and used count into unused count. This patch applies the
good practice in DIF library to create SGL in NVMe/TCP.
The next patch refines names of related function and variables
to be consistent in NVMe/TCP.
Change-Id: I1e73310c0e3650ede53672d76071a6c37dba82c1
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455473
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Recently DIF library refined _iov_ctx to support multiple iovecs
for DIF insert and strip operations. iSCSI library is the first
user of DIF insert and strip, and so rename _iov_ctx to _iscsi_sgl
to match DIF library. Names of related helper functions and
a member variable are changed together.
This change will be helpful to extract common part into a specific
helper library if necessary in future.
Change-Id: I3be3b6c3c476b0309ffafdecf3181fb2bb19abc6
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455620
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch adopts the good practice in DIF library, and unifies
array size and used count into unused count.
Change-Id: Ifde15d1fa66dbd2578da771b5500e45a84664f63
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455475
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In some cases user may want to flag blob for removal
then do some operations (before removing it) and while
it happens there might be power failure. In such cases
we should remove this blob on next blobstore load.
Example of such usage is delete snapshot functionality
that will be introduced in upcoming patch.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I85f396b73762d2665ba8aec62528bb224acace74
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453835
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch moves _spdk_blob_set_thin_provision function
higher in the file as it will be later used during
blobstore load.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ife37ef8c69b88903646b2002b3561101c1eb5135
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455488
Reviewed-by: Piotr Pelpliński <piotr.pelplinski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
A PERSISTENT RESERVE OUT command with PREEMPT service action action is used to:
a) preempt (i.e., replace) the persistent reservation and remove registrations; or
b) remove registrations.
Change-Id: I906b09057e5ddef0ea2fe180bb5fff6e9e0a04f6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436092
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Any application client can release the persistent reservation and remove all
registrations from a device server by issuing a PERSISTENT RESERVE OUT command
with CLEAR service action through a registered I_T nexus.
Change-Id: Ie21885f7f9632b9da77e877fbce7c51b2e0b47a7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436091
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
An application client releases the persistent reservation by issuing a
PERSISTENT RESERVE OUT command with RELEASE service action through an
I_T nexus that is a persistent reservation holder. The RELEASE service
action will release the persistent reservation and do noting to
existing registrants. Unit Attention isn't supported by SPDK so here
we will not establish any unit attention condition.
Change-Id: If0df3b3fc4e89ffa78d827e2b31cf5c61aa149db
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436090
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
An application client creates a persistent reservation by issuing a
PERSISTENT RESERVE OUT command with RESERVE service action through
a registered I_T nexus with RESERVATION KEY and TYPE, and persistent
reservation has a scope of LUN, only one persistent reservation is
allowed at a time per LUN.
Change-Id: I052d0ddd439dd9994c0793218dd5c5d6ed176d2c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436089
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
To establish a persistent reservation the application client shall first
register an I_T nexus with the device server. An application client registers
with a logical unit by issuing a PERSISTENT RESERVE OUT command with REGISTER
service action or REGISTER AND IGNORE EXISTING KEY service action.
Specify Initiator Ports (SPEC_I_PT) bit and All Target Ports (ALL_TG_PT) bit
are not supported for now, the registrants belong to the I_T nexus who sends
the command.
Also Activate Persist Through Power Loss (APTPL) bit will be supported in
following patches.
Change-Id: If057a764a4bfa73017f98048c94b697dbfa4b4a1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436088
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We throttle the number of data_in operations per
connection. Currently after a read is completed,
we try to send more data_in operations since one
has just been completed.
But we are trying to send more too early. The data_in_cnt
doesn't actually get decremented until after the PDU is
written on the socket. So this results in a case
where data_in_cnt == 64, and all 64 read operations
complete before any of those 64 are actually transmitted
onto the TCP socket. There are no more read operations
waiting, so we won't try to handle the data_in list
again, and if none of these 64 resulted in a SCSI
command completing, then the initiator may not send us
any more read I/O which would have also kicked the data_in
list.
So the solution is to kick the data_in list after the
PDU has been written - not after a read I/O is completed
back from the SCSI layer.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia01cf96e8eb6e08ddcaaeff449386e78de7c5bc5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455454
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The datain handling code path will set the LUN to
NULL if it finds a task's LUN has been hotremoved.
This could happen before the iscsi hotplug routine
actually gets a chance to run. If this happens,
one of these tasks doesn't actually get freed, and
then will be freed after the lun is closed -
causing a segfault in the bdev layer since it may
have a bdev_io associated with it.
Found by running the iscsi_tgt/fio test after
applying the next patch in this series.
There's more work needed in this hot remove clean up
path - currently we are just freeing a lot of PDUs
rather than completing them with error status when
a LUN is hot removed. But let's tackle that
separately.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8d27f0c7a79ae91cb6504e5ff6ffc8e346c9e54c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455460
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>