Purpose: to make the timeout work for NVMe TCP transport,
we miss this for TCP transport.
Change-Id: Iab4af988cc4796b4d6d98430453f3dbce1fcf313
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/445117
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch refactors driver init and in doing so eliminates the mem
leak described in the GitHub issue. Also it is now consistent with
how the pending compression driver does init.
Fixes#633
Change-Id: Ia2d55d9e98fb9470ff8f9b34aeb4ee9f3d0478f5
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442896
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We should not add addtional check since we already have this
option in timeout_cb function, the addtional check is unnecessary.
Change-Id: I77c89303155e0c14072a1838994f9e76a0ffc0f4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/445319
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is used to implement this function.
Since we need to call nvme_tcp_req_complete in this
function, so we need to adjust the location of the
nvme_tcp_rep_complete funtion.
Change-Id: I5fc3693aec8dc166ac1eb03babcd2d73d7b00e63
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444489
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
In this patch series, spdk_bdev_scsi_read and spdk_bdev_scsi_write
became almost identical. Hence squash them into spdk_bdev_scsi_read_write.
Change-Id: Ibbaddf74c1bf2dac37a0133eac27086af650a061
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444780
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This is in a effort to consolidate SCSI read and write I/O
for the upcoming transparent DIF support.
Previously conversion of bytes and blocks are done both in
SCSI layer and BDEV layer. After the patch series, conversion is
consolidated into SCSI layer.
Change-Id: Ib964a41ec22757f2a09cea22f398903f78d0781f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444779
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This is in a effort to consolidate SCSI read and write I/O
for the upcoming transparent DIF support.
Previously conversion of bytes and blocks are done both in
SCSI layer and BDEV layer. After the patch series, conversion is
consolidated into SCSI layer.
About conversion from bytes to blocks, we don't expose bdev API
spdk_bdev_bytes_to_blocks and but create private helper function
_bytes_to_blocks because we will use not block size but data
block size when we support transparent DIF feature.
Change-Id: I37169c673479c92e027e2507a0e54a1e414b43e1
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444778
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The last parameter xfer_len of spdk_bdev_scsi_read is not used,
and of spdk_bdev_scsi_write is used only to check task->transfer_len.
Hence remove the last parameter xfer_len from spdk_bdev_scsi_read/write
and extract the check operation from spdk_bdev_scsi_write and insert
it into spdk_bdev_scsi_read_write.
Additionally, remove a debug log because xfer_len is not passed to
spdk_bdev_scsi_write anymore. Hopufully, this will not degrade any
maintainability.
On top of this, factoring out the operation to convert byte to
block in spdk_bdev_scsi_read/write be done.
Change-Id: I35faca269a9c4a7f15d27e8e61b6a1b809a36b3f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This helps ensure it gets inlined in the spdk_vtophys
code path, now that spdk_vtophys is defined in the same
compilation module.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0d0d9bba4295f0d9a7c0657834aa5d39f3b682d8
Reviewed-on: https://review.gerrithub.io/c/445354
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
CPU profiling on workloads with intensive vtophys
operations (i.e. very small CB-DMA transfers) exposed
overhead introduce by spdk_vtophys having to call
spdk_mem_map_translate in a different compilation
unit. Let's just move the vtophys.c contents into
memory.c so that spdk_vtophys can inline
spdk_mem_map_translate and avoid this extra overhead.
This of course breaks the memory and vtophys unit
tests, so some additional changes are needed there
to keep everything linking correctly.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I295ed5f441d3eec7abdbc9d881c49d2174ec9f48
Reviewed-on: https://review.gerrithub.io/c/444975
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Previously, we can -p + hex value(e.g., 0x1) to assign the master core
and start the NVMe-oF or iSCSI target app.
However now it is not supported and prints error. I checked
the code, it only supports transformation with Decimal format,
so chaning the base to 0 to make it supporting other formats.
Change-Id: I82510ba0cef47b5593484b4fd3490f85c93cf6a5
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444830
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Do not set the completion_update bit except on
the last descriptor built before the dmacount doorbell
is written. This allows much better batching of
completions (to match batching of the submissions).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idd0281fb2e9e1ad2eb0f65f097c54fc051dfd935
Reviewed-on: https://review.gerrithub.io/c/444974
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add spdk_ioat_build_copy and spdk_ioat_build_fill
which mirror the existing spdk_ioat_submit_copy
and spdk_ioat_submit_fill. These new functions
*only* build the descriptors in the ring - they do
not write the doorbell. This enables batching
which can significantly improve performance by
reducing the number of MMIO writes.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia3539f936924b7f833f4a7b963d06ffefa68379f
Reviewed-on: https://review.gerrithub.io/c/444973
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This will enable batching of doorbell writes in
future commits. For now, just make the API public.
This is the first in a series of patches that
drastically improves performance for high queue
depth CB-DMA workloads. Some basic tests on
my Xeon E5-v3 platform shows about 4x improvement
for 512B transfers.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia8d28a63f5020ae8644c1efdec7f68740bb6920c
Reviewed-on: https://review.gerrithub.io/c/444972
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
To enable the timeout function.
Change-Id: Id5c40848957743683b6a5c2d085e7f777f14497d
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444803
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Use NBD_SET_SOCK to check whether the nbd device is setup
by other process or whether nbd kernel module is ready
before other nbd ioctl operations. This can avoid bad
influence to the nbd device setup by other process.
Change-Id: Ic12acbfddb8c4388e25731c39159b1ce559b8f23
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444805
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The ioctl NBD_SET_SOCK can return EBUSY on conditions not
only the kernel module hasn't loaded entirely yet, but
also the nbd device is setup by another process, which will
lead the poller's infinite polling.
This patch will wait only 1 second if device is busy.
Change-Id: I8b1cfab725cba180f774a57ced3fa4ba81da2037
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444804
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There is no need to lock g_ftl_bdev_lock when unregister a ftl_bdev.
Besides, the destructor of ftl_bdev will lock it again.
Change-Id: I99870483183879d9422584dbac6e154f605daea8
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/444794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Added check before write submission to indicate if
LBA was update in meantime. In such case don't set band's
metadata and rwb entry cache bit. Previous implementation
invalidates such address during write completion and could
cause that inconsistent lba map was stored into disk.
Change-Id: I4353d9f96c53132ca384aeca43caef8d11f07fa4
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444403
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We assumed io_channel allocation always succeeds, but
that's not true. Doing I/O to any vhost session that
failed to allocate an io_channel would most likely
cause a crash.
We'll now detect io_channel allocation failure and
print a proper error message. The SCSI target for
which the channel allocation failed simply won't be
visible to the vhost master. All I/O to that target
will be rejected.
We should probably report the error to the upper
layer and either prevent the device from starting
or fail the SCSI target hotplug request. But for now
let's just prevent the crash.
Change-Id: I735dfb930d8905f70636a236b4fa94288d0aaf3a
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_ctrlr_submit_admin_request() will access admin queue, and we
should hold ctrl->ctrlr_lock when access it.
Change-Id: Iff576fe5e14e854eb38dbc64d6c6d9ec1ba17056
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/444793
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Also use the same style condition check for secondary process
with PCIE type.
Change-Id: I93c83126145255887914ef5efea1a493c8f7f767
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444492
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The helper function spdk_get_data_out_buffer_size() is a little
confusing because it does only returning macro constant
SPDK_ISCSI_MAX_RECV_DATA_SEGMENT_LENGTH.
The macro constant will be configurable and so the helper function
is not sustainable.
Replace the helper function simply by the macro constant.
Change-Id: I4ec300f61783da7bb712512603c2dd80987ec702
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When hotplug feature is enabled by NVMe driver, users may
call delete_nvme_controller() RPC to delete one controller,
however, the hotplug monitor will probe this controller
automaticlly and attach it back to NVMe driver. We added
a skip list, for those user deleted controllers so that
NVMe driver will not attach it again.
Fix issue #602.
Change-Id: Ibbe21ff8a021f968305271acdae86207e6228e20
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444323
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Error logs in nvmf_rdma_dump_request lead to report error about
address points to the zero page, add judgement to return.
this issue occurs in heavy load fio testing.
Change-Id: I50302be88b3af53f718e3800aa16df7c506ca4e8
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441110
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
User can create a probe context to probe and attach controllers
asynchronously, the controllers will be added to the context list
for the first step, then users can poll the context until the list
becomes empty.
Change-Id: I3a96e2d8a9724332ff15542f78f9553fdab505e2
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442664
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Existing NVMe driver uses a global list g_nvme_init_ctrlrs
to track the controllers during initialization, and internal
function will start each controller in the list one by one
until the list is empty. We introduce a probe context
and move the global list into the context, with the context
we can enable asynchronous probe API in the next patch, also
this can enable parallel probe feature.
Change-Id: I538537abe8c1a4a82fb168ca8055de42caa6e4f9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/426304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously, function spdk_nvme_probe_internal() will probe
NVMe controllers and then bring up probed controllers
into the ready state after that. Broke up original two parts
with probe and start stage, this will help us to introduce
a probe context in the next patch.
Change-Id: Ie0c55a6a5463fb437f84349b0b2b33a217ba63e0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/426303
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It iterates over the list and polls each one. However,
in practice the list still contains just one thread for
now.
Change-Id: I9bac7eb5ebf9b4edc6409caaf26747470b65e336
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440763
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This is the inverse of spdk_thread_get_ctx.
Change-Id: I81541ff1687cfea358cb7046caf69982c38f6a38
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444455
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Schedulers can use this region to store required information.
Change-Id: I93efb44f1a534596f6285bbe014579311fe011e7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444454
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is much simpler and avoids the problems with requiring
it to run on a thread.
Change-Id: I811444c5a15d292356703beccc17e505d55d7678
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443645
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The thread scheduling mechanism is being rewritten and this
won't be used in the new system.
Change-Id: I829e8118ed0a10480bd86934b45e68fcb810931a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444453
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Althrough SPDK already provides a API to users which
can process runtime timeout NVMe commands, but it's
nice to have another API here, SPDK NVMe driver can
use it to break the endless wait. Also use the API
first in the initialization process, because we don't
want to add another initialization state with Intel
only supported log pages.
Change-Id: Ibe7cadbc59033a299a1fcf02a66e98fc4eca8100
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444353
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
From TP8000 spec 7.4.7,
"In response to a C2HTermReq PDU, the host shall terminate the connection.
If the host does not terminate the connection in an implementation specific
period that does not exceed 30 seconds, the controller may terminate the
connection on its own".
It means that the timeout is designed for: when the target is
sending out C2hTermReq, if the host does not terminate the connection,
the target should terminate the connection.
PS: For detecting the malicous connection without sending response
(such as no response of R2T PDU) which should be another patch.
Change-Id: I586dbb235d99aeab5d748a19b9128cd8b0cef183
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/440831
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently, SPDK does not support vfio in no-IOMMU mode. However, it
seems quite easy to extend the vtophys code to add support for this.
vfio in no-IOMMU mode does not support DMA remapping. This implies that
physical DMA addresses are used instead of IOVAs.
This patch checks whether the vfio no-IOMMU mode is enabled using
function rte_vfio_noiommu_is_enabled() from the DPDK RTE vfio interface.
In this case, physical addresses are used for the DMA mappings. This is
the same code path for the DMA translations as when the uio is used as a
kernel driver.
Change-Id: I6fb3c849a345c6f2f2b4141dddb8c17be2581495
Signed-off-by: Nikos Dragazis <ndragazis@arrikto.com>
Reviewed-on: https://review.gerrithub.io/c/441061
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Keep all of the thread library interactions in one file.
Change-Id: Iecb20d3767190b5da105a29670ead9e192d03257
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440761
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This prevents issues where spdk_thread_poll may report
that it did not useful work (for the one poller it ran),
causing the system thread to go to sleep.
Change-Id: I7a4842d5e399758c19268aee343a001ccfc88a3a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440598
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The persistence feature can't support for now, but as the features
are mandatory for reservation, so add the two function here, and
we can enable it with future patches for power loss persist feature.
Change-Id: Ic358eda00058809bbfd6984b0861f8b6b5aabecd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438213
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When this structure was brought up to the generic layer, the tcp
transport was using max_io_size and the rdma transport was using
io_unit_size. In the interest of conserving memory, we should use
io_unit_size instead of max_io_size.
Change-Id: I2633306fcbfd8c3d557445959c745cb2d9a0999e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442778
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We should never be going over these limits in the respective transports,
but add asserts to check this during testing.
Change-Id: Ifcaa82ccf58546a38020b31df54ee5d1d9822b8b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442777
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It is possible for spdk_nvmf_poll_group_add to fail. In this case we
need to tear down the qpair in the same way that we do in the new_qpair
function.
Change-Id: I17abdec2646d2b7f9ed07c9b9b3e74d3d0991903
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443472
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This intermediate state is unused and meaningless. the qpair transitions
into this state right before calling a synchronous operation and then
transitions to active as soon as that operation completes successfully.
If the operation did not complete successfully, we were leaving qpairs
in this weird intermediate state when for all intents and purposes they
had reverted to an uninitialized state. Keeping qpairs in the
uninitialized state until they have been added to a poll group creates a
meaningful distinction between states that can be actionable from the
transport level.
Change-Id: I6de9bc424b393b6fff221aa2f4212aaa91488629
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443471
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Connections in the uninitialized state haven't been added to a poll
group yet, so submitting dummy requests to them will be pointless since
they will never be polled. We need to reject the connection and destroy
the qpair immediately.
Change-Id: Id5dd711882e1ae7c13ae32c06da2285186b00a1b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443470
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Since there are multiple events/conditions that can trigger a qpair
disconnection, we need to funnel them to a single point of entry. If
more than one of these events occurs, we can ignore all but the first
since once a disconnect starts, it can't be stopped.
Change-Id: I749c9087a25779fcd5e3fe6685583a610ad983d3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For devices that support fewer SGE elements than our default values, we
need to adjust the I/O unit size so that we don't ever try to submit
more SGLs than we are allowed to.
Change-Id: I316d88459380f28009cc8a3d9357e9c67b08e871
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This prevents us from overrunning the send queue.
Change-Id: I6afbd9e2ba0ff266eb8fee2ae0361ac89fad7f81
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443476
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This value was not being decremented when we got SEND completions for
write operations because we were using the recv send to indicate when we
had completed all writes associated with the request. I also erroneously
made the assumption that spdk_nvmf_rdma_request_parse_sgl would properly
reset this value to zero for all requests. However, for requests that
return SPDK_NVME_DATA_NONE rom spdk_nvmf_rdma_request_get_xfer, this
funxtion is skipped and the value is never reset. This can cause a
coherency issue on admin queues when we request multiple log files. When
the keep_alive request is resent, it can pick up an old rdma_req which
reports the wrong number of outstanding_wrs and it will permanently
increment the qpairs curr_send_depth.
This change decrements num_outstanding_data_wrs on writes, and also
resets that value when the request is freed to ensure that this problem
doesn't occur again.
Change-Id: I5866af97c946a0a58c30507499b43359fb6d0f64
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Pass IO flags to NVMe write IO and verify PI error when PI error
occurs.
To know the location that caused PI error, checked read with disabling
PRCHK is necessary and is used in this patch.
Change-Id: Id90fb90c4b3ca95840785a4443ff98d637ceb247
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443189
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently struct bdev_io holds io_channel that the I/O was submitted
on through bdev_io::bdev_channel, but bdev_io::bdev_channel is private
in bdev.c and cannot be referenced in other files.
Hence add an new API spdk_bdev_io_get_io_channel API to get io_channel
coniveniently.
Change-Id: Ic2e2fde845d324f7a1637e3c75080727a62de5ec
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443843
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Pass IO flags to NVMe write IO and verify PI error when PI error
is detectec.
For write I/O, PI error will be already contained in write data
buffer, and no extra I/O is necessary.
Change-Id: I2f2359c4201aded7abccb182c39c00b25ff0bd5f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443188
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Including bdev_module.h and using spdk_bdev_unregister_cb instead of
spdk_delete_passthru_complete will follow other bdev modules.
This patch doesn't change any behavior.
Change-Id: Ia236ea37ae22ed5c7740b02d1c5bd37491b9cf9a
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444166
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Using passthru virtual bdev's name instead of base bdev's name as
io_device's name will be meaningful. This patch doesn't change any
behavior.
Change-Id: I33f7aa78c60cd1d9f6a7b36280441bc559f44857
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444165
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Subsequent patches will implement PI verification when PI error occurs,
but PI verification will be different between read and write.
Subsequent patches will set IO flags for normal read and write but
will not set IO flags for checked read.
Current nesting stack,
bdev_nvme_readv/writev
-> bdev_nvme_queue_cmd
-> spdk_nvme_ns_cmd_readv/writev
-> bdev_nvme_queued_done
makes these changes difficult.
Hence this patch inlines bdev_nvme_queue_cmd into bdev_nvme_readv/writev,
adds separate completion function bdev_nvme_readv/writev_done, and
removes enum direction.
This patch doesn't cause any functional change.
Change-Id: I2f97ff21245539c690490d0fc4134d2e0049eddd
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443187
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
PI check flags is not set to NVMe controllers created by hot plug
handler automatically. Document this behavior for clarification.
Change-Id: I9590d0cb7f53a24c33afd706e222065893d23cb4
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444012
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Add "prchk:reftag|guard" to the 3rd item of the TransportID row
in [Nvme] section.
apptag is not supported yet as same as JSON RPC.
These two patches cannot control hot added NVMe controllers, but
we should not set prchk options to hot added NVMe controllers
automatically. Hence the next patch will document this behavior
explicitly.
Change-Id: I74a73ac52779aa50c5b45e20ffb61002e95f33ef
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443835
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The next patch will use the string "prchk:reftag|apptag" as
per-controller prchk options for .INI config file.
Hence add helper functions for them beforehand.
Change-Id: I58c225cc36cc84bf594f108e611028996b5eedb9
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443834
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Add prchk_reftag and prchk_guard to construct_nvme_bdev RPC.
In spdk_rpc_construct_nvme_bdev, create prchk_flags based on them
and pass it to spdk_bdev_nvme_create, and in spdk_bdev_nvme_create,
pass it to create_ctrlr.
A single option enable_prchk may be enough but add separate options
for reftag and guard to clarify that apptag is not supported yet.
The next patch will make per-controller PRCHK options configurable
by .INI config file.
Change-Id: I370ebbe984ee83d133b7f50bdc648ea746c8d42d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443833
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add prchk_flags in struct nvme_ctrlr and set it at creating of
the corresponding controller, and copy it to each bdev of the
controller.
Change-Id: Ie971a0c1539b5419de9e5168ed47ac0e579be2c5
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443186
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Bdev don't support APIs that passes metadata not interleaved with
logical block data. So, return error explicitly when creating NVMe
bdev with separate metadata for now.
Change-Id: I0776e72232c8e7758ad11b405e7e4914e779d131
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444011
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Metadata location and DIF type are set only if there is metadata, and
DIF location is set only if DIF is enabled.
Change-Id: Ib684b54332820446ff1a0b609f5b4e0b3d42f2f9
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443344
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The patch is used to fix issue:
https://github.com/spdk/spdk/issues/638
Reason: For supporting sgl, the implementation of
function nvme_tcp_pdu_set_data_buf is not correct.
The translation is not correct for incapsule data
when using SGL. In order not to do the translation
via calling sgl function again, we use a variable
to store the buf.
Change-Id: I580d266d85a1a805b5f168271acac25e5fd60190
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444066
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Currently, the SPDK_BDEV_REGISTER_MODULE() macro uses __LINE__
to generate functions like spdk_bdev_module_register_187().
Typically, this is not a problem as these functions are not called directly
rather, they are only used as constructor functions to load the bdevs during
system startup.
There are languages however, (e.g rust) that require these functions to be
referenced explicitly to prevent them from being removed during the linking phase.
In order to reference them, having the names predictable (and potentially
changed per commit) makes things easier.
Change-Id: I15947ed9136912cfe2368db7e5bba833f1d94b15
Signed-off-by: gila <jeffry.molanus@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/443536
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_thread_poll()
This is an optimization if the calling function already knows the
current time.
Change-Id: I1645e08e7475ba6345a44e0f9d4b297a79f6c3c2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443634
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
strip_size as an rpc param is now deprecated and can be removed
in a future release. Either strip_size or strip_size_kb can be
used but only one of them or the rpc will fail.
Internally we maintain both fields because strip size always
comes in as KB but we convert it to blocks so having both elements
makes it clear for developers what they're looking at.
JSON output includes both strip_size and strip_size_kb.
Fixes#550
Change-Id: I5dc51e8af22eae3d56af8f8d37a564dbaae228fa
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437873
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
DPDK 19.02 requires this mempool to be allocated via
crypto-specific function which returns rte_mempool.
To keep the amount of #ifs minimal, we'll use rte_mempool
unconditionally.
Change-Id: I3a09de41e237e168580bb92b574854e291e68a74
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443785
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We setup the qpairs on module init but never
released them. Some memory was leaked, although since
it was allocated with rte_malloc() it couldn't be
picked up by ASAN.
rte_cryptodev API offers rte_cryptodev_queue_pair_setup()
to setup a qpair, but there's no equivalent function to
release it. We have to access the rte_cryptodev structure
directly and call a qpair release function ptr that's
stored inside. It seems very very hacky, but the entire
rte_cryptodev structure is a part of the public API and
the global array of all such devices is an exported
symbol.
Change-Id: I17ac73d1098ca9a92d2dfd52e0f905e2c2b5488f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443561
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The typical rdma qpair disconnect function goes through the function
_nvmf_rdma_disconnect_retry. When this function was introduced, it was
discovered that we could receive a qpair disconnect event for a given
qpair before that qpair had been assigned to a poll group. In order to
ensure that the disconnect procedure completed properly, we waited on
the current thread in _nvmf_rdma_disconnect_retry for the qpair to be
assigned a poll group before we finally disconnected. see rdma.c:2250.
Since _nvmf_rdma_disconnect_retry was not necessarily called from the
poll group's thread, we relied upon the assumption that the group
variable would never be set back to NULL. See the comment on rdma.c:
2243.
However, in _spdk_nvmf_qpair_destroy we were setting the group back to
NULL. This operation can result in the following set of operations
across multiple threads that prevent a qpair from ever being fully
destroyed.
1. thread 1: receive a disconnect event - call nvmf_rdma_disconnect
2. thread 1: from nvmf_rdma_disconnect call
spdk_nvmf_rdma_qpair_inc_refcnt - setting rqpair->refcnt to 1.
3. thread 2: call spdk_nvmf_rdma_poller_poll.
4. thread 2: in spdk_nvmf_rdma_poller_poll reap a completion with an
error status which causes us to call spdk_nvmf_qpair_disconnect -
rdma:2846
5. thread 2: spdk_nvmf_qpair_disconnect calls _spdk_nvmf_qpair_destroy which sets
qpair->group = NULL
6. thread 1: from nvmf_rdma_disconnect we call
_nvmf_rdma_disconnect_retry which checks if qpair->group == NULL. If
that is the case, we assume that the qpair has not been assigned a group
yet and send ourself a message to call _nvmf_rdma_disconnect_retry again. see rdma.c:2253
7. thread 2: from _spdk_nvmf_qpair_destroy we call
spdk_nvmf_transport_qpair_fini which results in a call to
spdk_nvmf_rdma_close_qpair. which sends dummy send and recvs to the
qpair.
8. thread 2: we call poller_poll and get completions for both the send
and recv dummy requests. This results in a call to
spdk_nvmf_rdma_qpair_destroy.
9. thread 2: spdk_nvmf_rdma_qpair_destroy checks rqpair->refcnt and when
it sees that it does not = 0 (see step 2 above) it returns without
freeing the resources. see rdma.c:629
10. thread 1: we keep churning in _nvmf_rdma_disconnect_retry sending
ourselves messages because rqpair->group is going to be null. Thread 1
never reaches line 2257 where it sends a message to call
_nvmf_rdma_qpair_disconnect. _nvmf_rdma_qpair_disconnect is the function
that decreases the rqpair->refcnt and allows us to make forward progress
on destroying the qpair.
I encountered this issue while trying to disconnect from our target
using the kernel initiator with an x722 NIC. I think the timing on this
bug comes out with that specific configuration because come of the calls
in the disconnect path on thread 1 fail causing it to take longer giving
a chance to the second thread to delete the qpair.
There are really two issues at play here. We don't have a single point
of entry for disconnecting RDMA qpairs, and we rely on the qpair->group
variable never being set back to NULL. This patch addresses the second
issue, and the next patch in the series addresses the first.
Change-Id: I65395d0bbb67edfa7bad2ddc70906606c3d83781
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Includes the required DPDK dependencies for SPDK block Reduce aka
Compression.
Change-Id: Ic1ea3cbeb9373a7700f6f0c2a3194d65d6a34a41
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/429523
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is for DIF check types.
Add enum spdk_dif_check_type to DIF library.
Add a field dif_check_flags to struct spdk_bdev and add
spdk_bdev_is_dif_check_enabled to bdev APIs.
Added enum is intended to improve usability. If no enum, the
caller will have to get raw data of flags and mask each bit.
Change-Id: Ia46a37a9684dc968dcc51963674f0a9963e0cd4d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443339
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch is for DIF settings.
Add fields dif_type and dif_is_head_of_md to struct spdk_bdev and
add APIs spdk_bdev_get_dif_type and spdk_bdev_is_dif_head_of_md to
bdev APIs.
The fields dif_type and dif_is_head_of_md are added to the JSON
information dump.
Change-Id: I15db10cb170a76e77fc44a36a68224917d633160
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443184
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Next patch will introduce enum spdk_dif_check_type for user to
know easily if checking DIF field is enabled or not.
This patch renames bitmask macros from SPDK_DIF_*_CHECK to
SPDK_DIF_FLAGS_*_CHECK to avoid mis-interpretation .
Using FLAGS was derived from SPDK_NVME_IO_FLAGS_PRCHK_* in
include/spdk/nvme_spec.h.
Change-Id: I89e155d047352f54091c14b9251464cd3a72a162
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443338
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
To support DIF, bdev will need to expose the following information:
- Metadata format
- Block size
- Metadata size
- Metadata setting (interleave or separate)
- DIF settings
- DIF type 1, 2, or 3
- DIF location
- DIF check types
- Guard check
- Reference tag check
- Application tag check
This patch is for the metadata format. Subsequent patches will do for the DIF
setting and DIF check types.
Add fields, md_len and md_interleave, to struct spdk_bdev and add APIs,
spdk_bdev_get_md_size and spdk_bdev_is_md_interleaved, to bdev APIs.
The fields, md_len and md_interleave, are added to the bdev JSON infomation dump.
DIF will be used only in the NVMe bdev module and the upcoming virtual
DIF bdev module first. But additional required storage by md_len and md_interleave
will be very small and they are simple. Hence add them to struct spdk_bdev simply.
Change-Id: I4109f6a63e6f0576efe424feb0305a9a17b9b2e8
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443183
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The timeout is set to 0, so it never waits anyway. But
this should be 0.
Change-Id: I8b4058017a91b647ea9324f1474a732921c389f0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443647
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This doesn't fix any bug, but it makes more sense to leave the qpair
in the NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY state until it
receives at least one byte.
Change-Id: Ic5f34a733a80b58f65a1334fae7e07dbded2b3d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The `len` field wasn't used at all and `reserved` is
no longer needed after we removed the paddr in the
previous patch.
This effectively cuts down spdk_mobj struct size by half.
Change-Id: Ica39f3a30e14ec1275a87d827dc41df5df9cf623
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443483
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The physical addresses in iSCSI are completely unused
as iSCSI does not perform any DMA on its own.
Change-Id: I350037b708a9f36f423e6ca6f7c822d8b6b95116
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443482
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We explicitly checked for one of the strings in the
parsed RPC request even though it's required for the
entire request to parse successfully. The extra check
is now removed.
Change-Id: I19c446786e4ac88b88f14e18dc5258f31b1a87f1
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443317
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Since we no longer use external events and we access
all vhost devices synchronously, we no longer need
to dynamically allocate our RPC request contexts. They
can be put just on the stack.
Change-Id: Ie887607b67451aba4f3404c4b9551e6424335beb
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440380
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Removed their various usages inside the core vhost code
together with the external events themselves. External
events were completely replaced by spdk_vhost_lock()
and spdk_vhost_dev_find().
Change-Id: I1f9d0268c27a06e2eecab9e7d179b1fd54d4223d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440379
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Replaced them with inline code that performs exactly
the same but is shorter and easier to follow. External
events were replaced by spdk_vhost_lock() and
spdk_vhost_dev_find().
Change-Id: Id46a619c592c20a573664b54efc097489e9bb893
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440378
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Currently infrequent cases in request completion path are marked as
unlikely. This patch applies that to submission path.
These cases are infrequent and marked using unlikely marco:
a. The sq tail reaches the end of queue.
b. The sq tail equals to sq head. (never happen if FW runs correctly)
c. The qpair is admin queue.
Change-Id: I8b873a18615788f2efbf7c683aad710c7007a082
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/443451
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The management channel was used in the RDMA transport prior
to the introduction of poll groups and made its way over to
the TCP transport when it was written. Eliminate it in favor
of just using the poll group.
Change-Id: Icde631dd97a6a29190c4a4a6a10a0cb7c4f07a0e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442432
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
This was only temporarily required for polling. With
a per-group aio ctx, it isn't needed anymore.
Change-Id: Ie59b50a4700f0f99dea470f857d187ac656dd229
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443467
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We only need one aio context for the entire set of channels
sharing a thread.
Change-Id: I1143247901586efe50530b28323ddb923bc6b242
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443314
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is marginally more convenient.
Change-Id: I9989d687b80051ccb2e07edc5e1efdbca75e8716
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This will be used later.
Change-Id: I12b07756a13d03a34c9705306d720c1db7ecb15c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443312
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This wasn't actually necessary. The next patch in this series will
change the way aio is used such that only one aio context is
polled for the entire group of channels on a single thread.
Change-Id: I05c4d824d9c63a51c8a2d608d84c184f249f66d7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443311
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This isn't used just yet, but will be necessary temporarily
during this patch series.
Change-Id: I7f04426c27e3fe0417e2f60bac28217fa44c0cb2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443310
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Move it next to the other channel definition.
Change-Id: I9ec33c135836d3dc326abe4ce7588e7a2eff77d4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443309
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These didn't need to be visible.
Change-Id: I337a02802cac4431b4abd9a922408d4147801565
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443308
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Small static function only called from one place, so
just inline it.
Change-Id: Ibc54f790da55dd1635d81181208b1d506550ca9c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443307
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It does not need to be in the header file.
Change-Id: I5c489de81e48b11d02b66cbdd6d9ac05eae16429
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443306
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
max_read_depth should be based on max_qp_init_read_atomic, or the
maximum number of read values that the initiator will accept as
outstanding.
The device attributes object contains values for both the initiator
(remote side) and the target (local side). All attributes with the name
init in them are meant to correspond to the initiator. The
qp_read_atomic value represents the number of reads and atomic
operations that can have this device as the target. qp_init_read_atomic
represents how many read operations the initiator has said that we can
have outstanding that have the initiator's rdma device as the target.
Since this number represents how many outstanding reads we will send to
the initiator at once, we should use the qp_init_read_atomic value.
Change-Id: Iacc044e8321080de8accd9128ac3777bbb948afc
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442409
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
ftl_process_reloc should process free_queue in first place
(this will start read operations) and then process write queue.
Change-Id: I3a44b3651cc1526f8a024330472f94aa8d818193
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443403
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id44f9de4500ec2be45aa4203c5945b1501fbdb21
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443236
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This function gets used as a function pointer, which
seems to keep the compiler from trying to inline the
function. Stack manipulation was showing up in the
perf profile pointing to this. Marking the function
as inline gets it actually inlined in the hot I/O
path.
Improves bdevperf microbenchmark from 78M to 85M IO/s.
Cores are virtually identical - 11.4M on core 0 and
10.4-10.6M on remaining cores.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iadced071dfc07fc09db6da3571c930988b2dc3fd
Reviewed-on: https://review.gerrithub.io/c/443278
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This keeps the hottest structures at the head of the
cache and helps improve performance.
Improves microbenchmark (8 null bdevs on 8 lcores,
bdevperf seq read with qd=1) from 67M to 78M on my
Xeon E5-v3 system. Core 0 performance remains about
the same (10.7-10.8M) but others cores improve from
around 8.0M each to 9.4M.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia3ccf94ab39b6f911127f0bd1016e352027b11fc
Reviewed-on: https://review.gerrithub.io/c/443277
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2bad16b6649c279448a3c662ab7b035dbe0a4bfb
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443251
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The ocssd spec and buildtime-check already ensures
sizeof(struct spdk_ocssd_geometry_data) is 4096, so we can use
struct spdk_ftl_dev::geo as buffer directly.
Change-Id: Id7a52f978d80284fe941d9f5d7bc7219518871e8
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/443069
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
According to the current implementation, the functions that called by
bdev_ftl_init_bdev() will not call callback if they return errno.
Besides, the caller of bdev_ftl_init_bdev() (e.g.
spdk_rpc_construct_ftl_bdev()) don't expect callback be called if callee
return errno.
Change-Id: I5f36d5332ac66db65bb2090e9625a73b1107306b
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/443068
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is no need to hold g_ftl_bdev_lock when calling bdev_ftl_create.
Besides, the functions (e.g. bdev_ftl_add_ctrlr) that called by
bdev_ftl_create will lock g_ftl_bdev_lock again.
Change-Id: I74751822364e16c58a3065dc78f8a4dce157e925
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/443066
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Vhost external events no longer do any asynchronous
calls, they only lock the vhost mutex and directly
call the provided function. The mutex encapsulation
isn't worth the additional complexity of splitting
each vdev-handling code into multiple functions, so
we expose low-level APIs that should eventually
replace external events entirely.
Instead of:
```
static int do_something_cb(struct spdk_vhost_dev *vdev, void *arg)
{
struct my_data *ctx = arg;
/* access the vdev and ctx */
free(ctx);
}
struct my_data *ctx = calloc(...);
rc = spdk_vhost_call_external_event("my_vdev", do_something_cb, ctx);
if (rc != 0) { /* err handling */ }
```
We can now do just:
```
spdk_vhost_lock();
vdev = spdk_vhost_dev_find("my_vdev");
if (vdev == NULL) { /* err handling */ }
/* access the vdev any context data */
spdk_vhost_unlock();
```
Change-Id: I06e1e149d6dd006720b021d3bef8d9b7bfaeceaa
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440377
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
_allocate_bit_arrays() needs vol->backing_dev set which was being done
after the call.
Change-Id: Ic8c36c98aee94fbd8230273638011b948cd95675
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443048
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is only needed within the c file. It doesn't
need to be in the public header.
Change-Id: I0e072ea5eddc6edc84faecee9ef50fb2c20dbb24
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442426
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is a holdover from before poll groups were introduced.
We just need a per-thread context for a set of connections,
so now that a poll group exists we can use that instead.
Change-Id: I1a91abf52dac6e77ea8505741519332548595c57
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442430
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The READ and ATOMIC in the comment above are capitalized, so
make this all caps too.
Change-Id: I49fae2ceb826b22953d9b26d42b95f17e2dac617
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442427
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
request.c didn't have much code, so let's collapse
it into ctrlr.c and make that the place where all
software emulator of the NVMe controller, including
request handling, is done.
Change-Id: Id7c98010cb222a414a5aa0b78bfb299a0ffc418f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440592
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously, all I/O commands were implemented by simply
passing them to the bdev layer. Now, some I/O commands will
be emulated. Prepare for that by moving the code for this
function to ctrlr.c, where the emulation will occur.
Change-Id: Id34e5549e5ce216d602fb347b4506fbd324eed4e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440591
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This was previously very unmap specific. Make at least the top level
DSM call more general purpose by eliminating the unmap_ctx.
Change-Id: I9c044263e9b7e4ce7613badc36b51d00b6957d3a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440590
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These are left over from the removal of virtual mode over a year ago.
Change-Id: Ia797c4570bf9090346ff22ab9c7d719a78d023d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440589
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This was only used by the target, and it didn't actually need it.
Change-Id: Ibcef410165efdc16077da24419580ed51b087d70
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442440
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This type was actually two entirely different types for
the initiator and the target, so just make it void.
Change-Id: I15512d9d4efd790dce0fa4323b7230de66144bc6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
After passing the check of protective mbr, there is a high probability that
this bdev is in gpt format. If parsing primary table fails, read the secondary
table and try to get partition info from it. When parsing secondary table
successfully, add a warning log to notify users that primary table is broken.
Change-Id: I4f16edcdd57b9cde8d8cc74ec88ba95b97bd6b63
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/441201
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Modify existing code of parsing primary partition table to support parsing
the secondary.
Main difference of these two tables is that they have inverse buffer layout.
For primary table, header is in front of partition entries. And for secondary
table, header is after partition entries. So add helper functions to extract
header and partition entries buffer region from primary or secondary table
based on current parse phase.
Split the exported funtion spdk_gpt_parse into two functions spdk_gpt_parse_mbr
and spdk_gpt_parse_partition_table. So spdk_gpt_parse_partition_table could be
used to parse both primary and secondary table.
Change-Id: I7f7827e0ee7e3f1b2e88c56607ee5b702fb2490c
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/441200
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Unmap, discard, write zeros will be sent down from
higher stack. Remove these IOs for the QoS limit.
Change-Id: Ieb3cc19f31c43f8ddf8f8d2fd338f442ef48b679
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442673
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When a connection goes to close and has no I/O outstanding,
the current_recv_depth was being decremented beyond 0 and rolling over.
If the poll group then finds a successful receive completion on the next
poll (for a command that arrived prior to starting the disconnect but
hadn't been processed yet), it would trip the max queue depth check
added recently and start another disconnect process. If only one command
arrives in this window, everything actually works out ok.
However, if there are two receive completions sitting in the completion
queue after the disconnect process is started, the first one does the
double disconnect and the second one does another disconnect which ends
up dereferencing a null pointer.
Since there is always a special reserved slot for the dummy recv, don't
do decrements or increments of the current_recv_depth for the dummy
recv. This allows the code to still enforce the actual max_queue_depth
on recvs without underflowing or overflowing the counter.
Change-Id: I56c95b2424e956a3b007b25c50cbf47262245b8f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
trace_record is used to poll the spdk trace shm file
and store new entries from it to another specified trace file.
This could help retain the trace_entires from the overlay of
trace circular buffer
Note:
* trace_record reads the input tracefile into a process-local
memory and writes trace entries to the output file only at shutdown.
* trace_record can be shut down on SIGINT or SIGTERM signal.
A usage sample is:
./spdk_trace_record -s bdev_svc -p <spdk app pid> -f trace.tmp -q
Change-Id: If073a05022ec9c1b45923c38ba407a873be8741b
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/433385
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This RPC doesn't really work in some cases - for example,
trying to delete one NVMe namespace bdev from a controller
with multiple namespaces, or just one virtio SCSI device
from a virtio-scsi controller. We've previously kept it
and marked it as "debugging only" - but every bdev module
has its own RPC method now for deleting what it constructed,
so keeping the generic delete_bdev RPC is asking for
trouble in some of the cases mentioned above. We'll remove
it in the 19.04 release.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I639254b32a3e1c840a4e9ae2658c42f4f321b676
Reviewed-on: https://review.gerrithub.io/c/442616
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This was marked deprecated in the v18.10 release, so
remove it now before v19.01 is tagged.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I57673a5ab475b97c812bebcefd77ff90d9305d1c
Reviewed-on: https://review.gerrithub.io/c/442412
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This includes properly detecting when a key's name
extends past the end of the valid data.
Note that the unit tests were using sizeof() instead
of strlen() since some of the strings contain
NULL characters. This means that we should be
subtracting one to account for the implicit null
character at the end of the string. Note that the
iSCSI spec only says that the key/value pair has to
end with a null character - a key/value pair that
is split across two PDUs will not have a NULL character
at the end of the first PDU.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie95d6dd3b9ffa6a3902a31771ac4edb482418cce
Reviewed-on: https://review.gerrithub.io/c/442450
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since params are parsed directly from the PDU's data
buffer, we need to know the end of the valid data. Otherwise
previous PDUs that used this same data buffer may have left
non-zero characters just after the end of the text associated
with a LOGIN or TEXT PDU.
Found this bug while debugging an intermittent Calsoft test
failure. Added a unit test to reproduce the original issue,
which now verifies that it is fixed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic3706639ff6c4f8f344fd58c88ec11e247ea654c
Reviewed-on: https://review.gerrithub.io/c/442449
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For the number of trace entries, change strtoull to spdk_strtoll
because no issue will occur by the change.
Besides, getopt guarantees that if an argument is followed by a
semicolon, optstring of it is not NULL. spdk_app_parse_args()
had unnecessary NULL pointer check related with this. Hence
remove those NULL pointer checks too.
Change-Id: I33d0328205d1765f70f70fc734d0d8b4165fef5e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/441641
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Without this check valgrind complains that we are using
uninitialized variable.
Change-Id: I5cb73d10e167004f6e4df9e3621ec3b35ec2448d
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442519
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In SPDK, we will build isa-l with no shared option
and then integrate it into SPDK. And we do not need
to install isal in the system libaries.
Note: ocf build in autobuild.sh now needs to build
include/spdk/config.h before building the ocf library,
to ensure that header is available in a clean build
environment.
Change-Id: I3f0ce6932b386de17a77cf5bfdfd738b22417e2d
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Signed-off-by: paul luse <paul.e.luse@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441279
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Chunyang Hui <chunyang.hui@intel.com>
The io_device associated with the aio bdev was only
getting unregistered when the aio bdev was explicitly
deleted - not in the implicit deletion path at shutdown.
Move the io_device_unregister into the destruct_cb -
this makes sure the io_device is always unregistered, whether
the bdev is getting unregistered via an explicit RPC or
implicitly in the shutdown path.
Fixes#618.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I44b77f5c38339f4cf97b02c0ee4002bf5fcc9998
Reviewed-on: https://review.gerrithub.io/c/442119
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Verify that the namespace used is formatted with a supported LBA format
(4K block size).
Change-Id: I59e2ed71354e8530d9fa0e3f6b323ded83097afa
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441881
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Existing specific vhost socket messages for vhost-nvme target
will get some information from backend target before start_session
call, so we should iterate the associated nvme controller by vid
but not session.
Fix issue #628.
Change-Id: Ia400bf33895a0feee0058a870f26b0ff72b7556f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442498
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Marked as deprecated in 18.10.
Change-Id: I40d0e6103623aee6e6a0b9fa6e82f7b826ca1fe6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442420
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Error check of strtol is left to users of it. But some use cases
of strtol in SPDK do not have enough error check yet.
For example, strtol returns 0 if there were no digits at all.
It should be avoided for each use case to add enough error checking
for strtol.
Hence spdk_strtol and spdk_strtoll do additional error checking
according to the description of manual of strtol.
Besides, there is no use case of negative number now, and to keep
simplicity, spdk_trtol and spdk_strtoll allows only strings that
is positive number or zero.
As a result of this policy, callers of them only have to check if
the return value is not negative.
Subsequent patches will replace atoi to spdk_strtol because atoi
does not have error check.
Change-Id: If3d549970595e53b1141674e47710fe4dd062bc5
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/441626
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This was marked deprecated in 18.10
Change-Id: Id47e770b0388c935fe684aeef7a9824f24cef47f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442416
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add rte_pause to waiting while loop
This commit also adds spdk_pause as interface for rte_pause
Change-Id: I56e1023731e2e78febaa4f45808d6f07656d290f
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/436494
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Use mempool for allocating OCF requests with constant size
Previous method was using mallocs which is significantly slower
Change-Id: I539ff22efc18fbd353ceb2687ea211d2baaa7523
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439680
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
One of the messages we send on memory hotplug event is
SET_VRING_BASE, which tells vhost e.g. the position in
a vring it should start processing requests from. Sending
this message with any outstanding I/O could cause that
I/O to be never processed as it could be at a vring
position that won't be practically polled.
To fix the above, we don't send SET_VRING_BASE message
on memory hotplug event anymore since it's completely
unnecessary. It was sent together with a couple other
messages that would reinitialize the vring, but we know
vrings occupy a memory buffer that won't be hotremoved
during vring lifetime. We also know that vring GPAs will
never change. Hence we can initialize the vrings just
once on device start now.
We still need to send SET_VRING_ADDR after updating the
memory table, as rte_vhost depends on it to apply that
new memory table. Luckily, this single message doesn't
cause us any trouble.
Change-Id: I2125099f1cf3f8c76e8160ec819bd1a9a3e7823c
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439436
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We assumed the second descriptor in an I/O descriptor
chain will always point to a payload buffer, but in case
there is no payload, the second descriptor will point to
a response buffer. The vhost code doesn't provide proper
checks to handle such case, so to avoid various errors
down the stack, we just fail all requests with no
payload.
Change-Id: I6785c2843d6db4fc17e68e03562c2a1530bb469b
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437187
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dstepanov.src@gmail.com>
This ensures that SPDK will detect descriptor chains
that are too long.
The additional check in vhost block stands as an
optimization and makes us fail the corrupted I/O early.
Change-Id: Icceaa0dd938dca96a1872e5ee96bf6a151fdd9e7
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Signed-off-by: Dima Stepanov <dstepanov.src@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/433641
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
SPDK doesn't provide sufficient runtime checks to properly
handle clients with memory sizes that aren't 2MB multiples
and could potentially segfault during I/O processing.
That's why we'll reject such clients now.
Change-Id: I34e85be5b5c6df863371d0ad688f228ed44107ff
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/433640
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add new RPC method for OCF bdev: get_ocf_bdevs
It is useful in respect to not registered OCF bdevs
which do not appear in standard get_bdevs call
Change-Id: I8a5fc86a880b04c47d5f139aa5fa4d07ca39c853
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441655
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add basic handling of base devices hotremove
When either core or caching device gets unregistered,
the vbdev_ocf does so as well
Change-Id: I05769f714bf22cb320558fed86adc8c3d8a0a185
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/435729
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Since we have different requirements for submitting RDMA read and write
operations, we should track them separately so that we don't block
writes when the device does not have enough resources for read
operations.
Change-Id: I5d6424c0e26f2f5362866d1bb21eb46700c245da
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Before, the number of WRs and the number of RDMA requests were linked by
a constant multiple. This is no longer the case so we need to make sure
that we don't overshoot the limit of WRs for the qpair.
Change-Id: I0eac75e96c25d78d0656e4b22747f15902acdab7
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439573
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add OCF module based on OCF meta-library
Open CAS Framwework (OCF) is high performance block storage
caching meta-library
It is open-source, published at https://github.com/Open-CAS/ocf
With this patch OCF-enabled device is represented in SPDK
as virtual bdev having core and caching devices as its base devices
This patch includes implementation of:
* OCF top adapter (vbdev_ocf.c)
* OCF bottom adapter (dobj.c, data.c)
* Adaptation layer for OCF (env/)
* OCF context abstractions (ctx.c)
Adaptation layer and context abstractions are not dependent on SPDK bdev
OCF bdev supports reads and writes, configured at startup
Other features will be added with separate patches
Change-Id: Ic2dcab378c8238d16f1e4b64d4374bdf257565bc
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/435708
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
_spdk_bdev_io_submit uses the bdev_io->internal.in_submit_request
flag to ensure we unwind in cases where the I/O is completed
inline (i.e. malloc or null bdevs). But when an I/O gets queued
for QoS, and then we iterate through the queued I/O in
_spdk_bdev_qos_io_submit(), this flag was not getting set
when those I/O would get submitted to the underlying bdev. This
would allow for _spdk_bdev_qos_io_submit recursion, resulting
in all kinds of different types of memory corruption.
Fixes#613.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I29263f4e7b2ead60f08b60474d210defa803348c
Reviewed-on: https://review.gerrithub.io/c/442127
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
It is perfectly valid for a bdev to not support the
unmap command - there's no need to print an ERRLOG
when a SCSI INQUIRY 0xB2 (LOGICAL BLOCK PROVISIONING)
command is sent to query if the LUN supports it.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I18389df4d55a1ac186707d624ddea292a5470e80
Reviewed-on: https://review.gerrithub.io/c/442104
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Technically this check is correct, but the Linux kernel
target doesn't have it, and older versions of libiscsi
have a bug which result in stale ExpStatSN getting sent
resulting in terminated connections with the SPDK iSCSI
target at high queue depths.
Fixes#600.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I76eaf9dee2d733bfa3f8d43b86528de6b556cbd6
Reviewed-on: https://review.gerrithub.io/c/441981
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Those cases should never occur. Klocwork pointed out
possible dereference based on the returns later in
the functions.
Change-Id: I282a56f3f415f85c38e9c451cbb10bc80fc6176b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441546
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This gives us more realistic control over the number of requests we can
submit.
Change-Id: Ie717912685eaa56905c32d143c7887b636c1a9e9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441606
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
rw_depth was a misinterpretation of the spec. It is based on the value
of max_qp_rd_atom which only governs the number of read and atomic
operations. However, we were using rw_depth to block both read and write
operations which is an unnecessary restriction. write operations should
only be governed by the number of Work Requests posted to the send
queue. We currently guarantee that we will never overshoot the queue
depth for Work requests since they are embedded in the requests and
limited to a size of max_queue_depth.
Change-Id: Ib945ade4ef9a63420afce5af7e4852932345a460
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441165
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will be necessary later on when we need to throttle send and recv
requests in software.
Change-Id: Ifb25eaabd15e101fbfc2959a08a321f80857b280
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441604
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Both initiator and target are using the minium 10 seconds
timeout value, so set it in kas field when initializing
the controller.
Change-Id: Idda68bdfe27613ebaf706a0de497145d3f9ed766
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441995
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently, the code does not comply with the spec,
so remove such code for 19.01 and will add the code
which complies with the spec for 19.04
Change-Id: Icd3b2573fbc46dc2fa7a00c6672c23ea01ffe0ee
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/441985
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Although Vhost SCSI code is technically capable
of polling different sessions on different lcores,
the underlying SCSI API won't allow allocating
io_channels on more than one lcore.
That's why we will now let device backends assign
lcores by themselves.
The first Vhost SCSI session will now choose one
core from the available ones, and any subsequent
sessions will stick to the same one.
Change-Id: I616cd195a919960dff68508473cea236abf8d6a3
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441581
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If there is socket read error, we should directly disconnect
the socket instead of set the tqpair into RECV_ERROR state.
When it is in ERROR_RECV state, it does not mean that
we should close the socket immediately.
Change-Id: I975906653c13eb3fa5195799c517015435176785
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/441830
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Bump the log level for EAL to RTE_LOG_NOTICE.
Reading from rte_log.h:
```
RTE_LOG_NOTICE 6U /**< Normal but significant condition. */
RTE_LOG_INFO 7U /**< Informational. */
RTE_LOG_DEBUG 8U /**< Debug-level messages. */
```
We're doing this primarily for the NVMe hotplug poller,
which calls spdk_pci_enumerate() and constantly bloats
the output with logs describing which device is currently
iterated over. We don't want to see those.
Change-Id: I1a90e514fdf467bc95da910f786f1818757cfdcf
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441789
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In patch fbec702944 (bdev/crypto: Set QAT alignment
requirement) we added an alignment requirement for I/O
buffers, but the internally-allocated buffers for
encryption haven't respected it.
We now allocate those buffers with the crypto bdev's
required alignment. It is only required for QAT and we
do it unconditionally, but we don't want to strcmp the
driver name in the hot I/O path just for that - the
code is to be refactored anyway.
Change-Id: I2cbc04408ddc5574f212b63536a05eb73ceba104
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441908
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Assigned CQ size when creating CQ may run over due to
heavy workload with too many qpairs. Enlarge it dynamically
can prevent IBV_EVENT_CQ_ERR caused by CQ's runover.
This patch fixes issue #498:
https://github.com/spdk/spdk/issues/498
Change-Id: I6c2d7194d4147d812d49d4fe787fcba5c6bbede9
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440853
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This change was provided by GitHub user vikasbrcm to fix issue 562.
I am uploading his change to facilitate testing of the issues and
possibly get it merged before the 19.01 window closes.
Change-Id: I58fb1058f68c6c02006ceed6e577be627e6dbc09
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441611
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This patch adds FTL bdev. RPC scripts have been updated to allow for
creation and removal of FTL bdevs.
Change-Id: I82a5c5033b65bbeb67c238cae969a68cff767dcc
Signed-off-by: Jakub Radtke <jakub.radtke@intel.com>
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Signed-off-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/431329
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
With all the patches in place, we can finally
enable having more than one simultaneous sessions
to a single vhost device.
This patch adds a unique id to the session structure,
similar to the one in a vhost device and also fills in
the implementation holes in foreach_session().
Vhost-NVMe can support only one session per device
and now has an additional check that prevents it from
starting more than one at a time.
Vhost-SCSI also has the same check now since it needs
additional work on the lcore assignment policy. The
check will be removed once the required work is done.
Change-Id: I13a32c7a0eae808e9bec63a7b8c15ec0bc2e36ed
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Particular backends will now be responsible for sending
events to vsession->lcore. This was previously done by
the generic vhost layer, but since some backends will
need different lcore assignment policies soon, we need
to give them more power now.
Change-Id: I72cbbccb9d5a5b2358acca6d4b6bb882131937af
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441580
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
It's sessions that are tied with the lcores now.
This makes the vhost devices accessible by any
thread that only locks the global vhost mutex.
The mechanism used for external device events was
refactored to serve for foreach_session() API.
Additionally, since we don't want to handle cases
where the entire vhost device gets removed while
an asynchronous foreach_session chain is pending,
a new per-vdev counter of pending async operations
was added. We'll fail the device removal request
if there are any pending operations. Eventually
we would like the device removal to be asynchronous,
but that's a todo for later.
The external events are still there, although
they only lock the mutex and call the provided
function now.
Change-Id: I20618f9420a9bc04270373469deaad8fb2049c7c
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439323
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch adds RPC calls for histograms in bdev layer.
Following calls are added:
- enable_bdev_histogram - enable/disable histogram structures for specified bdev and each of its channels.
- get_bdev_histogram - merges histograms from all channnels and encodes histogram as base64
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ib423a919dc1cde7dd7d92247db5482cfb9d66956
Reviewed-on: https://review.gerrithub.io/c/433573
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The controller shall treat a Keep Alive Timeout in the same manner
as connection loss. If the Keep Alive feature is in use and the
timer expires, then the controller shall:
1, stop processing commands and set the Controller Fatal Status
(CSTS,CFS) bit to '1';
2, terminate the NVMe Transport connection;
3, break the host to controller association;
A timer poller is added to each subsystem to monitor timeout event.
Change-Id: I001afab8a6764f30c39df37fa96384180d117486
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439330
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Each Vhost SCSI session will now keep its local
SCSI targets state that can be accessed without the
global vhost mutex.
Hotplug will still add the SCSI target reference to
the device struct, but will also asynchronously tell
each active session to add it's session-local copy.
Hotremove, on the other hand, will now try to
asynchronously remove a SCSI target from all sessions
first and only afterwards it will remove it from the
device struct.
This allows us to safely hotplug and hotremove SCSI
targets into Vhost SCSI devices that have multiple
active sessions.
Each session will still use its management poller to
try to locally remove those targets that were scheduled
for hotremoval and the additional
spdk_vhost_dev_foreach_session() will now also try to
remove each one of them from the entire Vhost SCSI
device after making sure they were already removed from
all sessions.
Change-Id: Idd080b618768c71cd1cd564efeaf930bf79fb578
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439321
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Before we implement the support for multiple sessions
per device, we still need to make a few intermediate
changes that will require a counter of currently polled
sessions. So here it is.
Change-Id: I0a1d928eafa75efa1b5c2e6670a5ceb282c87fa4
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441734
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Fix case when remote is doing SHUT_WR but we still have requests in
progress. In this case we should finish requests, send response and then
close the connection.
Fixes#604
Change-Id: I009029c95e0557c7347a78c3a50d35b30fc8141e
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441718
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Some users require to do write zeroes operation when
erasing data on lvol. Currently the default method is
unmap. This patch adds flag to spdk_rpc_construct_lvol_bdev
call that changes default erase method. This is also a base
implementation for possible future function for erasing
data on lvol bdev.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I8964f170b13c2268fe3c18104f7956c32be96040
Reviewed-on: https://review.gerrithub.io/c/441527
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
The backend struct will get some new dependencies
soon, so move its definition further in a header
file.
Change-Id: I39c25027312777c7e570b12511dc9c5e9b9023d4
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439322
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Returning negative value from a `foreach` callback
will now break the entire chain. This is required
for refactoring spdk_vhost_dev_foreach_session() to
use the same mechanism as external events. Before
we actually do the refactor, we add the only feature
that external events were missing.
Change-Id: I70bda3df99748de51429e329a056c37a3bc7e348
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This patch implements unit tests for the following modules:
* band
* PPA (Physical Page Address) translations
* write buffer
Change-Id: Ia7292bd3027347e8a3da77dafe71cde2c016bf38
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/431328
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch adds the RPC support for the Read/Write separate
bandwidth limit controls. The basic usage as following:
usage:
rpc.py set_bdev_qos_limit [-h] [--rw_ios_per_sec RW_IOS_PER_SEC]
[--rw_mbytes_per_sec RW_MBYTES_PER_SEC]
[--r_mbytes_per_sec R_MBYTES_PER_SEC]
[--w_mbytes_per_sec W_MBYTES_PER_SEC]
name
positional arguments:
name Blockdev name to set QoS. Example: Malloc0
optional arguments:
-h, --help show this help message and exit
--rw_ios_per_sec RW_IOS_PER_SEC
R/W IOs per second limit (>=10000, example: 20000).
0 means unlimited.
--rw_mbytes_per_sec RW_MBYTES_PER_SEC
R/W megabytes per second limit (>=10, example: 100).
0 means unlimited.
--r_mbytes_per_sec R_MBYTES_PER_SEC
Read megabytes per second limit (>=10, example: 100).
0 means unlimited.
--w_mbytes_per_sec W_MBYTES_PER_SEC
Write megabytes per second limit (>=10, example: 100).
0 means unlimited.
Change-Id: I822ec4814d21adff9826ce03a6af3783b1b98f44
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/417650
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds the support of read and write separate
bandwidth rate limits control with the configuration file.
Below is the example (in MiB) for the configuration section:
[QoS]
Limit_Read_BPS Malloc0 100
Limit_Write_BPS Nvme0n1 200
Change-Id: I0221516ce70c3fbb07b9e80c1c814ed5ba271c88
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/416672
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We know how big this should be so no need to dynamically allocate it.
Change-Id: I00ae6cb7c7f525d946d213e0f58cc5fe05bcf932
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441128
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Everything need to be removed to get the remove callback called.
Otherwise we will end up with dangling devices and user callbacks
possibly not called.
Fixes#567
Change-Id: I37259f6cd97268060170a6b17a0c0df4d543a224
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440890
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If IO fail e.g. during hotremove error shouldn't be ignored as this will
trigger operations (like crc checking) that shouldn't be done. Also
false error messages are printed.
Change-Id: Ie023ddcd9bdba2378e69808302ff9978497c7852
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440889
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
In SPDK sock_ut, the port 3260 is tested. It is conflict with the SCSI
Target Daemon (tgtd). If the service is enabled, it makes sock_ut failure.
Suite: sock
Test: posix_sock .../home/ubuntu/spdk/lib/sock/posix/posix.c: 238:spdk_posix_sock_create: *ERROR*: bind() failed errno = 98
FAILED
The patch changes port 3260 to UT_PORT, and enhance the error log when
socket bind is failed.
Also enhance error log carrying port value in posix.c.
Change-Id: I31e47af49335cdf7eaffa44237860ebc140d2419
Signed-off-by: tone.zhang <tone.zhang@arm.com>
Reviewed-on: https://review.gerrithub.io/c/439983
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Partially decoded request need to be free even if
spdk_json_decode_object() fails.
Change-Id: Icd00f835537dbaf197cc4f05930be8c543a534a6
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439716
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This wasn't allowed at first to make debugging these
thread changes easier. Now they're mostly solid, so
let's allow this behavior.
Change-Id: Ida689c738ed550ec46c3cb6cab6788a651c6088b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441159
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This patch will solve the following two cases:
1 Free the pdu resources. Add the checkout of c2h_pdu_data_cnt of the qpair.
2 Do not recyecle the req accoriding to the pdu in the send_queue, but directly
recylcing the reqs in TCP_REQUEST_STATE_TRANSFERRING_CONTROLLER_TO_HOST state.
Change-Id: I5856c3421019ec49d576d3dae4c62fefbb3925ca
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/440847
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Fix potential bug. In _spdk_nvmf_subsystem_add_ctrlr(), befor free(
ctrlr) we should free ctrlr->qpair_mask. Because we set qpair->ctrlr
= NULL, when destroy qpair the qpair_mask is not released. For the same
reason, req->qpair->ctlr = ctrlr is placed at the bottom of the function.
Change-Id: I38e268b532ff3ce87721c02f15ac4f674856d103
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440858
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This change is related to enabling multi-sgl element support in
the NVMe-oF target.
For single SGL use cases, there is a 1:1 relationship between
rdma_requests and ibv_wrs used to transfer the data associated with
the request. In the ingle SGL case that ibv_wr is embedded inside of
the spdk_nvmf_rdma_request structure as part of an rdma_request_data
structure.
However, with Multi-SGL element support, we require multiple
ibv_wrs per rdma_request. Insted of embedding these
structures inside of the rdma_request and bloating up that object, I
opted to leave the first one embedded in the object and create a pool
that requests can pull from in the Multi-SGL path.
By leaving the first request_data object embedded in the rdma_request
structure, we avoid adding the latency of requesting a mempool object
in the basic cases.
Change-Id: I7282242f1e34a32eb59b55f326a6c331d455625e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/428561
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Purpose: To avoid the buffer contention among different
polling groups if there are multiple core configurations
for NVMe-oF tcp transport.
Change-Id: I1c1b0126f3aad28f339ec8bf9355282e08cfa8db
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/440444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Added set_read_only_lvol_bdev() RPC that marks
lvol bdev as read only.
This enables to create clones off of a base lvol,
without having to create additional lvol bdev with snapshot.
Change-Id: Ic20bbcd8fbebcef157acce44bfa1da035b0da459
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440534
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds supplementary tracing for OCSSD library to allow
for more efficient debugging and profiling of the user I/O path
as well as the background tasks (maintaining write pointer,
defragmentation, ANM event handling, etc.).
Change-Id: I741f1304f4ee0eba019e31bea7814af475c3296e
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/431327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The purpose this patch is to fix the following issue:
https://github.com/spdk/spdk/issues/568.
The root cause of issue is in nvme_rdma_fail_qpair
since we want to recycle all outstanding rdma_reqs.
There is an aer req, the callback of which is:
nvme_ctrlr_async_event_cb. In this function, we
will call nvme_ctrlr_construct_and_submit_aer again,
however the nvme controller is already in shutdown state.
(The ctrlr->vcprop.cc.bits.en is set to 0).
Change-Id: I422f0fe5faf472e9a1cb6bbd174e806e6405b95c
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440014
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This allows us to remove the requirement to install intel-ipsec-mb to
system directories.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I579655a98b515cf148b7cd17823a9bb541ea6ad7
Reviewed-on: https://review.gerrithub.io/c/440785
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It was possible to leak pollers if we had multiple devices in the
transport. The new err_exit path fixes this.
Change-Id: Iafd5643c67fae741113f10afe761af1988cb6a9b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439419
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is implemented at a generic level.
Change-Id: Ibf8167e828f8da27cc26cd04e611c3f3c084319a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440418
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This allows for proper exit when error was encountered
during scsi port create.
Change-Id: Ie46faf7e515823cdc7383c25c4482bd2855b6368
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440676
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For the backend block device which can support flush command,
vhost-blk should report such feature to Guest, and leave such
decision to Guest.
Change-Id: I6cd6fd94ed80256ffe268bc1bf2c1dd57a164825
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439605
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
In an event where spdk_reactors_init() fails the
error message which was getting printed was not
useful. This patch improves error message printed.
Change-Id: If1e8dea20d187d68414782fa59943f0f7963a471
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/441145
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Need to check user input and return status of parsing
to prevent app or target from crashing.
Input checking function will be added in the future.
Change-Id: I8167ac13306ae4f81e2cacb80edd9dcf9382c374
Signed-off-by: Chunyang Hui <chunyang.hui@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439479
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
For QAT, a single operation can be described at most by one IOV.
To assure that we don't get memory buffers that are sub-block sized,
set the alignment requirement to the block size of the underlying device.
Change-Id: I4071bb89e696fc40e010798bd76520e5fb765217
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440988
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This patch is used to dump the requests state if
the tqpair's resource is not freed.
Change-Id: Ic4780662558d73267d4f1ebabfc22780fafec4ec
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440846
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
RDMA transport will report SPDK_NVME_SC_ABORTED_POWER_LOSS code
when fail the admin queue, however, SPDK_NVME_SC_ABORTED_SQ_DELETION
makes more sense here, because we know we are going to shutdown
the controller.
Fix issue #568.
Change-Id: I31da095ec92c06079511d89cc2743654ba2c001b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440132
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch adds the ability to restore the state of the block device
from the SSD, by reading the metadata stored in each band and recover
the L2P (LBA -> PPA map). Only clean shutdown is supported as of yet.
Change-Id: I03e39510a902b098c52edfaedd2e61b43a297bda
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Signed-off-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/431325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This is shared between all currently valid transports. Just move it up
to the generic structure. This will make implementing more shared
features on top of this a lot easier.
Change-Id: Ia896edcb7555903ba97adf862bc8d44228df2d36
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440416
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch added two new function pointers (queue_io,
update_io) for related qos operations like iops and
bandwidth rate limits.
Change-Id: I2ffd67c5f1c421eab448fd5e95f809da55805fcd
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
SPDK NVMe driver and NVMe CLI define the parameter about DIF location as
follows:
If set to 1 and DIF is enabled, then DIF is transferred as the first 8
bytes of metadata. If set to 0 and DIF is enabled, then DIF is transferred
as the last 8 bytes of metadata. Defaults to 0.
This patch fixes the apparent inconsistency.
Change-Id: I481332bb9a6caa488664dbd1fe6039270b9f9663
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/440672
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This name more closely resembles pthread_exit, which is a
closer analogy to how the new threading library works.
Change-Id: I68b04509f3ff8e94b8688804a7e5661155a3ecd1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440597
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This mirrors pthread_create, which works more closely
to the new style where SPDK libraries can spawn their
own threads.
Change-Id: Ic524c4c35bcf7c1611e4f261ebb64b98ac5a5a1b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440596
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Instead of implicitly grabbing the thread from the thread
local variable, make it explicit.
Change-Id: I733fad06181439e12b1e71a4829b84e7b64e2468
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440595
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Nothing implements the callback just yet, but it will be used for
dynamic thread creation.
Change-Id: I088f2bc40e1405cd5b9973b9110608f49c8abd68
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440594
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The user can now spawn SPDK threads within the context of another
SPDK thread, opening up the possibility of spawning threads from
within SPDK libraries.
Change-Id: I308eebc3f1d6f51da7236621ef2f67817f04ce8b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440760
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
These are no longer used by anything.
Change-Id: I0db6bc88e4dc945ff4f64df2ac410e1d00a669c1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437601
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
During destructing a connection, if timeout of logout timer for the
connection occurs, spdk_iscsi_conn_destruct() will be called again
to the connection.
Segmentation fault by multiple calls of spdk_iscsi_conn_destruct()
have been observed when exiting one of the sesssions during IO tests
for multiple sessions.
Fix#574
Change-Id: I69873905486953bfb0cdb43b000b8e38ff9346ac
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/440466
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Both the scsi device and its state will need an
additional per-session copy, so we put those two
in a single struct now.
This also serves as a cleanup.
Change-Id: I01bc11db4070e9f258de15e6c44b6f4fcd1d51f2
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439320
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Prepared APIs to operate on a session parameter rather
than a device parameter. Some of those functions are
now ready to support more than one session per device.
Change-Id: Ie4a49cd1d1f8ee1b826952bc87e66aa0cabdd925
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439319
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Vhost SCSI session specific fields were moved from the
device struct into a new session struct. This is the
same change as Vhost Block had.
Most of the functions inside vhost scsi still accept
vhost device as a parameter and then get the session
object internally. Those functions will be refactored
to accept session object directly in a separate patch
since the amount of changes required is too big to be
done here.
Change-Id: I8b87ba1187413d471b463aa7067821928ac0303e
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439318
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Following the same change with SCSI target hotremove,
we'll now allow hotremoving an underlying bdev without
VIRTIO_SCSI_F_HOTPLUG negotiated. The hotremoval will
be still reported through SCSI sense codes.
Change-Id: I5b3dd09bd72456eda745f4225f76603fad999da6
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439317
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Failing the hotremove request will become more tricky once
we allow creating multiple sessions per device, so we try
to get rid of any unnecessary error checks.
VIRTIO_SCSI_F_HOTPLUG tells us just if the host is capable
of receiving hotplug events, but the scsi target can be
hotremoved even without them. The hotremoval will be still
reported through SCSI sense codes. All I/O to a hotremoved
target will be failed with sense key ILLEGAL REQUEST,
asc 0x25, ascq 0x00 (LOGICAL UNIT NOT SUPPORTED).
Change-Id: I2be4e0167eb06804112758a5825cd91745128408
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439316
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In function spdk_fs_file_stat_async, the
stat.size = f->append_pos >= f->length ? f->append_pos : f->length;
but in spdk_file_get_length, we return f->length.
So generally, it should all use the same method to return the file
length, and this patch will fix this issue.
Change-Id: Idb9aa9e737711fcd48ac0075c7f7ffed825fe3b0
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/439627
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This supports 4KiB required alignment in bdevs such as
AIO and some variants of the crypto bdev.
Change-Id: Ib6ab2e8b07fbbcc5fe3f76e41e9d3c5a7ae3fb4d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440767
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Let the application figure out how to close the
underlying block device after an init fails or
an initialized/loaded volume is closed/destroyed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id0b9b99c85ed13166ec9a229eaaa0b0eb04289fd
Reviewed-on: https://review.gerrithub.io/c/440573
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will remove the metadata from the associated backing
device and delete the pm_file associated with the reduce
volume.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I30ffd68e5ba69b31ee1be85418c5b8c592d82e70
Reviewed-on: https://review.gerrithub.io/c/437995
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This requires fixing up the unit test pmem emulation
code - we need to copy the 'persistent' buffer into
the newly 'mmaped' buffer in the pmem_map_file stub.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4e474f437922e652a57d0b45b6ef92fc713eb587
Reviewed-on: https://review.gerrithub.io/c/437888
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Implement spdk_crc32c_update() with ARM CRC32 intrinsics.
Change-Id: I1daf7f21012aab02290f88a65bbae619eedf5087
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.gerrithub.io/c/437218
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch series is geared at solving github issue 555.
Ultimately the goal of this series is to add a per-poll-group buffer
cache to prevent starvation.
Change-Id: I8ddaa47487665c2f9adce2109eb71b8fa71a7927
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439415
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This lets us remove the assumption of having only
a single session per device and brings us closer
towards supporting more.
Change-Id: Ibbb7b1ed789ff0690e62c00fb5ed39ce64245028
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438680
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Different Vhost Block sessions could be technically
polled on different threads, so we move the io_channel
from the device struct into the session struct.
Change-Id: I004cad8b6dc6d198844fca3bb11724e3f176dc9d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439315
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Prepared APIs to operate on a session parameter rather
than a device parameter. Some of those functions are
now ready to support more than one session per device.
Change-Id: Id55e70ae521039f5acc47e80ab8b0aa043679d95
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438679
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
With all the core vhost changes in place, we can refactor
the upper layers now. We start with the vhost block since
it's the easiest one.
Vhost Block session specific fields were moved from the
device struct into a new session struct. What's tricky,
is that the blk-specific struct directly contains the
generic session struct. This gives us handy access to
the generic session data from the blk-specific session,
and also allows us to directly upcast generic session
objects.
Most of the functions inside vhost block still accept
vhost device as a parameter and then get the session
object internally. Those functions will be refactored
to accept session object directly in a separate patch
since the amount of changes required is too big to be
done here.
Because of the above, some parts of this patch might
seem overcomplicated. Especially the to_blk_session()
funtion, which checks the device backend inside. The
ultimate goal is to receive session object through
start/stop callbacks - and for that the backend check
does make sense.
Change-Id: If222c31aec16a8cbe2d0cfb98c828e1ac75b91fc
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438678
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
kill a spdk process with aio bdev storage many times will get error messages:
bdev_aio.c: 244:bdev_aio_group_poll: *ERROR*: epoll_wait error(9): Bad file descriptor on ch=0x152a7f0
vhost.c:1010:_spdk_vhost_event_send: *ERROR*: Timeout waiting for event: stop device.
When spdk process exits, the connfd is closed by rte_vhost_driver_unregister, then
other function such as epoll_create1 in aio bdev may allocate the same fd,
but this fd is closed by vhost_user_read_cb again, so epoll_wait return -1
Signed-off-by: Honghui Wang <wanghonghui@ucloud.cn>
Change-Id: Ic3fd938892f004c18fb38d4594c006c40a01efaa
Reviewed-on: https://review.gerrithub.io/c/439851
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: Changpeng Liu <changpeng.liu@intel.com>
Sessions are allocated internally by the core vhost
library whenever DPDK accepts a new connection, so
the only reasonable way to store additional per-sesion
data is to tell the core vhost library how much extra
memory it needs to allocate. Hence, we add a new field
to the vhost device backend struct.
Change-Id: Id6c8285505b2e610e28e5d985aceb271ed232555
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437778
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Instead of calculating those settings once and storing
them in the device struct, we'll now recalculate them
whenever a device session is created. This lets us
remove 2 fields from the device struct.
Change-Id: I2cb2bdbc570a41ae78c0666490fb1462a00d0b6f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This is an attempt to clean up requests sititng in the
waiting_for_buffer state before destroying it for good.
Change-Id: I8ae047e4d7fd01f30419ae346e4da49355dc033d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440127
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This allows us to drain all of the pending requests from the qpairs
before we destroy them, preventing them from being picked up on
subsequent process_pending polls.
Change-Id: I149deff437b4c1764fabf542cdd25dd067a8713a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440428
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Necessary to avoid erroring out in the edge case where we have an SGL
request sent with two buffers that fit in the incapsule data size.
Change-Id: If51fb69c402482b564c737319584378cb03e7213
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/436062
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Due to qpair timeout handling refactoring,
we removed the qpair destroying related code.
And this patch is submitted to address this issue. With
this patch, we can detect sock close of the fd from
the initiator, and correctly free the qpair related resource
(e.g., pid) managed by nvmf layer.
Otherwise, the initatior thinks the qpair related source is
freed, however it is not freed in the target side.
Change-Id: Ia2de07bd849fa5d3bc0e0e0d4941464dfd16d266
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440242
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
dev->mem_table_fds init in vhost_user_set_mem_table
but dev->mem may init later in vhost_user_set_vring_addr,
so if qemu crash or lost conntion after vhost_user_set_mem_table
and before vhost_user_set_vring_addr,
it's hugepage memory is not being freed
Signed-off-by: Honghui Wang <wanghonghui@ucloud.cn>
Change-Id: I782c106078829ff6691ed3265a5d1718493de90c
Reviewed-on: https://review.gerrithub.io/c/440254
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
SPDK threads are no longer mapped 1:1 to system threads. They are
instead identified by the memory address of the spdk_thread object.
Change-Id: I417f8376cebd2ee94f624f4436e6394b51486063
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437999
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Check my_lba in gpt's primary header equals to LBA1.
Set correct my_lba in unit test to pass the test, and update
related header_crc32.
Change-Id: I124a7a883e9304bd3955ce3de6b589596c4195ea
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/439889
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Appending string by using sprintf with realloc will be generally
usable and add sprintf_append_realloc() and vsprintf_append_realloc()
to the utility.
These APIs follow realloc about buffer management, i.e., the original
buffer is left untouched if they fail.
Besides, the original buffer is NULL, they are equivalent to
sprintf_alloc() and vsprintf_alloc(), respectively.
Change-Id: I8b69d9640e86e1862ddd3917995bad6f59426b7e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Signed-off-by: Chunyang Hui <chunyang.hui@intel.com>
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/436913
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
The purpose of this patch is to remove the duplicated code
used in spdk_nvmf_rdma_request_free
Change-Id: I3f74466a7ec788000eff9c2a75c9ea2cacaf5cc2
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/439942
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The nsid field can be used for per namespace basis
reservation notification.
Change-Id: Ia7212020ec893ea367afe79933e1629895fe41b8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439930
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>