Commit Graph

4664 Commits

Author SHA1 Message Date
Jim Harris
c258d73feb ioat: add APIs to only build descriptors
Add spdk_ioat_build_copy and spdk_ioat_build_fill
which mirror the existing spdk_ioat_submit_copy
and spdk_ioat_submit_fill.  These new functions
*only* build the descriptors in the ring - they do
not write the doorbell.  This enables batching
which can significantly improve performance by
reducing the number of MMIO writes.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia3539f936924b7f833f4a7b963d06ffefa68379f

Reviewed-on: https://review.gerrithub.io/c/444973
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-18 07:44:17 +00:00
Jim Harris
ec95646a61 ioat: make ioat_flush a public spdk_ioat_flush API
This will enable batching of doorbell writes in
future commits.  For now, just make the API public.

This is the first in a series of patches that
drastically improves performance for high queue
depth CB-DMA workloads.  Some basic tests on
my Xeon E5-v3 platform shows about 4x improvement
for 512B transfers.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia8d28a63f5020ae8644c1efdec7f68740bb6920c

Reviewed-on: https://review.gerrithub.io/c/444972
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-18 07:44:17 +00:00
Ziye Yang
d4875ed89e nvme/tcp: add nvme_tcp_qpair_check_timeout function.
To enable the timeout function.

Change-Id: Id5c40848957743683b6a5c2d085e7f777f14497d
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444803
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 22:03:44 +00:00
Xiaodong Liu
36e8c20fe9 nbd: avoid impact to device setup by other task
Use NBD_SET_SOCK to check whether the nbd device is setup
by other process or whether nbd kernel module is ready
before other nbd ioctl operations. This can avoid bad
influence to the nbd device setup by other process.

Change-Id: Ic12acbfddb8c4388e25731c39159b1ce559b8f23
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444805
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 22:02:48 +00:00
Xiaodong Liu
d8f5fead29 nbd: avoid unlimited wait for device busy
The ioctl NBD_SET_SOCK can return EBUSY on conditions not
only the kernel module hasn't loaded entirely yet, but
also the nbd device is setup by another process, which will
lead the poller's infinite polling.
This patch will wait only 1 second if device is busy.

Change-Id: I8b1cfab725cba180f774a57ced3fa4ba81da2037
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444804
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 22:02:48 +00:00
wuzhouhui
0238b5c42a bdev/ftl: unlock g_ftl_bdev_lock before unregister ftl_bdev
There is no need to lock g_ftl_bdev_lock when unregister a ftl_bdev.
Besides, the destructor of ftl_bdev will lock it again.

Change-Id: I99870483183879d9422584dbac6e154f605daea8
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/444794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 21:42:58 +00:00
Wojciech Malikowski
b9e462e4a6 lib/ftl: Fix band's metadata inconsistency with L2P
Added check before write submission to indicate if
LBA was update in meantime. In such case don't set band's
metadata and rwb entry cache bit. Previous implementation
invalidates such address during write completion and could
cause that inconsistent lba map was stored into disk.

Change-Id: I4353d9f96c53132ca384aeca43caef8d11f07fa4
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444403
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:42:14 +00:00
Darek Stojaczyk
5d07cfad54 vhost/scsi: handle io_channel allocation failure
We assumed io_channel allocation always succeeds, but
that's not true. Doing I/O to any vhost session that
failed to allocate an io_channel would most likely
cause a crash.

We'll now detect io_channel allocation failure and
print a proper error message. The SCSI target for
which the channel allocation failed simply won't be
visible to the vhost master. All I/O to that target
will be rejected.

We should probably report the error to the upper
layer and either prevent the device from starting
or fail the SCSI target hotplug request. But for now
let's just prevent the crash.

Change-Id: I735dfb930d8905f70636a236b4fa94288d0aaf3a
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:41:00 +00:00
wuzhouhui
6b0d7b82c9 ocssd: hold lock when calling nvme_ctrlr_submit_admin_request
nvme_ctrlr_submit_admin_request() will access admin queue, and we
should hold ctrl->ctrlr_lock when access it.

Change-Id: Iff576fe5e14e854eb38dbc64d6c6d9ec1ba17056
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/444793
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:27:58 +00:00
kreuzerkrieg
64faa14d6e nvme: make the completion status string accessible from external applications
Signed-off-by: kreuzerkrieg <kreuzerkrieg@gmail.com>
Change-Id: Ifdcf7ab7ce7e7449a33d52f8308f537b0e26a238
Reviewed-on: https://review.gerrithub.io/c/444519
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:11:28 +00:00
Changpeng Liu
5a26346a71 nvme: move condition check into nvme_init_controllers()
Also use the same style condition check for secondary process
with PCIE type.

Change-Id: I93c83126145255887914ef5efea1a493c8f7f767
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444492
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-15 21:04:19 +00:00
Shuhei Matsumoto
97768b0773 iscsi: Replace helper function spdk_get_data_out_buffer_size() by macro constant
The helper function spdk_get_data_out_buffer_size() is a little
confusing because it does only returning macro constant
SPDK_ISCSI_MAX_RECV_DATA_SEGMENT_LENGTH.

The macro constant will be configurable and so the helper function
is not sustainable.

Replace the helper function simply by the macro constant.

Change-Id: I4ec300f61783da7bb712512603c2dd80987ec702
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:01:08 +00:00
Shuhei Matsumoto
80fd917004 iscsi: Rename macro constant MAX_FIRSTBURSTLENGTH by SPDK_ISCSI_MAX_FIRST_BURST_LENGTH
Change-Id: If3e2ded87dab5c6596ec499460f7d233bf63154e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444536
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:01:08 +00:00
Shuhei Matsumoto
d3526f523e iscsi: Replace duplicated macro constant DEFAULT_FIRSTBURSTLENGTH
Replace DEFAULT_FIRSTBURSTLENGTH by SPDK_ISCSI_FIRST_BURST_LENGTH.

Change-Id: Ia90ef714fab79dff4f9b9eb92ade8cfed3391450
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444535
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:01:08 +00:00
Shuhei Matsumoto
113db66a9a iscsi: Remove unused macro constant MAX_SESSIONS
Change-Id: I72fcb458d46f90b1036cfa346fbc675b8c21cf2f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444534
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:01:08 +00:00
Changpeng Liu
76126989bd bdev/nvme: don't attach user deleted controllers automaticlly
When hotplug feature is enabled by NVMe driver, users may
call delete_nvme_controller() RPC to delete one controller,
however, the hotplug monitor will probe this controller
automaticlly and attach it back to NVMe driver.  We added
a skip list, for those user deleted controllers so that
NVMe driver will not attach it again.

Fix issue #602.

Change-Id: Ibbe21ff8a021f968305271acdae86207e6228e20
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444323
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 20:57:03 +00:00
yidong0635
9d838d24ad rdma: add return to avoid address points to the zero page
Error logs in nvmf_rdma_dump_request lead to report error about
address points to the zero page, add judgement to return.
this issue occurs in heavy load fio testing.

Change-Id: I50302be88b3af53f718e3800aa16df7c506ca4e8
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441110
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-15 04:29:40 +00:00
Changpeng Liu
bad30d5366 nvme: add the asynchronous controllers probe/poll APIs
User can create a probe context to probe and attach controllers
asynchronously, the controllers will be added to the context list
for the first step, then users can poll the context until the list
becomes empty.

Change-Id: I3a96e2d8a9724332ff15542f78f9553fdab505e2
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442664
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 03:14:20 +00:00
Changpeng Liu
3306e49e24 nvme: introduce probe context data structure and API
Existing NVMe driver uses a global list g_nvme_init_ctrlrs
to track the controllers during initialization, and internal
function will start each controller in the list one by one
until the list is empty.  We introduce a probe context
and move the global list into the context, with the context
we can enable asynchronous probe API in the next patch, also
this can enable parallel probe feature.

Change-Id: I538537abe8c1a4a82fb168ca8055de42caa6e4f9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/426304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 03:14:20 +00:00
Changpeng Liu
207353960f nvme: broke up spdk_nvme_probe_internal() into two stages
Previously, function spdk_nvme_probe_internal() will probe
NVMe controllers and then bring up probed controllers
into the ready state after that.  Broke up original two parts
with probe and start stage, this will help us to introduce
a probe context in the next patch.

Change-Id: Ie0c55a6a5463fb437f84349b0b2b33a217ba63e0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/426303
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 03:14:20 +00:00
Ben Walker
12759ab569 event: The reactor now contains a list of threads
It iterates over the list and polls each one. However,
in practice the list still contains just one thread for
now.

Change-Id: I9bac7eb5ebf9b4edc6409caaf26747470b65e336
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440763
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-14 14:58:56 +00:00
Ben Walker
4e7bb83e78 thread: Add a function to get the thread from a context
This is the inverse of spdk_thread_get_ctx.

Change-Id: I81541ff1687cfea358cb7046caf69982c38f6a38
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444455
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-14 14:58:56 +00:00
Ben Walker
9810431431 thread: Allow for an extra region of memory allocated on each thread
Schedulers can use this region to store required information.

Change-Id: I93efb44f1a534596f6285bbe014579311fe011e7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444454
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-14 14:58:56 +00:00
Ben Walker
86a21aee4e event: Implement context switch monitor without a poller
This is much simpler and avoids the problems with requiring
it to run on a thread.

Change-Id: I811444c5a15d292356703beccc17e505d55d7678
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443645
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-14 14:58:56 +00:00
Ben Walker
3569ea154a event: Remove max_delay_us parameter
The thread scheduling mechanism is being rewritten and this
won't be used in the new system.

Change-Id: I829e8118ed0a10480bd86934b45e68fcb810931a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444453
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-14 14:58:56 +00:00
Changpeng Liu
7d4d22a846 nvme: add a wait for completion timeout API
Althrough SPDK already provides a API to users which
can process runtime timeout NVMe commands, but it's
nice to have another API here, SPDK NVMe driver can
use it to break the endless wait.  Also use the API
first in the initialization process, because we don't
want to add another initialization state with Intel
only supported log pages.

Change-Id: Ibe7cadbc59033a299a1fcf02a66e98fc4eca8100
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444353
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-14 03:47:13 +00:00
Changpeng Liu
2c026cf430 nvme: remove unused minimum period timeout value
Change-Id: I4277166ef5c1ffb5f1d1962ccc5b74d807ef637f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444352
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-14 03:47:13 +00:00
Changpeng Liu
d5b89466cc nvmf: add get/set features with reservation notification mask support
Change-Id: I93089c4b362930d1e2b3a847639e6cc18b15f217
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439933
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-14 01:28:43 +00:00
Ziye Yang
2d0ce5b48b nvmf/tcp: Implement correct behavior of timeout for C2Htermreq case
From TP8000 spec 7.4.7,

"In response to a C2HTermReq PDU, the host shall terminate the connection.
If the host does not terminate the connection in an implementation specific
period that does not exceed 30 seconds, the controller may terminate the
connection on its own".

It means that the timeout is designed for: when the target is
sending out C2hTermReq, if the host does not terminate the connection,
the target should terminate the connection.

PS: For detecting the malicous connection without sending response
(such as no response of R2T PDU) which should be another patch.

Change-Id: I586dbb235d99aeab5d748a19b9128cd8b0cef183
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/440831
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-13 18:20:28 +00:00
Ben Walker
dadb948585 bdev/aio: Reap completions from userspace if supported
Change-Id: I30d9cc619df2fddb870ed7bf187f14cd44376d19
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443468
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-13 18:14:53 +00:00
Nikos Dragazis
70c703bb49 vtophys: add vfio no-IOMMU mode support
Currently, SPDK does not support vfio in no-IOMMU mode. However, it
seems quite easy to extend the vtophys code to add support for this.

vfio in no-IOMMU mode does not support DMA remapping. This implies that
physical DMA addresses are used instead of IOVAs.

This patch checks whether the vfio no-IOMMU mode is enabled using
function rte_vfio_noiommu_is_enabled() from the DPDK RTE vfio interface.
In this case, physical addresses are used for the DMA mappings. This is
the same code path for the DMA translations as when the uio is used as a
kernel driver.

Change-Id: I6fb3c849a345c6f2f2b4141dddb8c17be2581495
Signed-off-by: Nikos Dragazis <ndragazis@arrikto.com>
Reviewed-on: https://review.gerrithub.io/c/441061
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-13 17:59:16 +00:00
Ben Walker
514358e5d3 event: Move thread lib init/fini into reactor.c
Keep all of the thread library interactions in one file.

Change-Id: Iecb20d3767190b5da105a29670ead9e192d03257
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440761
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-13 09:55:17 +00:00
Ben Walker
836356f2d5 thread: Run all pollers on each spdk_thread_poll call
This prevents issues where spdk_thread_poll may report
that it did not useful work (for the one poller it ran),
causing the system thread to go to sleep.

Change-Id: I7a4842d5e399758c19268aee343a001ccfc88a3a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440598
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-13 09:55:17 +00:00
Changpeng Liu
da30cda946 nvmf: add get/set features with reservation persistence support
The persistence feature can't support for now, but as the features
are mandatory for reservation, so add the two function here, and
we can enable it with future patches for power loss persist feature.

Change-Id: Ic358eda00058809bbfd6984b0861f8b6b5aabecd
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438213
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-13 06:10:53 +00:00
Seth Howell
bdc81134c2 nvmf: use io unit size in transport buffer pools
When this structure was brought up to the generic layer, the tcp
transport was using max_io_size and the rdma transport was using
io_unit_size. In the interest of conserving memory, we should use
io_unit_size instead of max_io_size.

Change-Id: I2633306fcbfd8c3d557445959c745cb2d9a0999e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442778
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 23:34:20 +00:00
Seth Howell
b7651b681c NVMe-oF: add asserts for SGE counts
We should never be going over these limits in the respective transports,
but add asserts to check this during testing.

Change-Id: Ifcaa82ccf58546a38020b31df54ee5d1d9822b8b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442777
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 23:34:20 +00:00
Seth Howell
c1e0bded59 nvmf_tgt: read ret from spdk_nvmf_poll_group_add.
It is possible for spdk_nvmf_poll_group_add to fail. In this case we
need to tear down the qpair in the same way that we do in the new_qpair
function.

Change-Id: I17abdec2646d2b7f9ed07c9b9b3e74d3d0991903
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443472
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 20:39:44 +00:00
Seth Howell
145485769e nvmf: remove qpair state activating.
This intermediate state is unused and meaningless. the qpair transitions
into this state right before calling a synchronous operation and then
transitions to active as soon as that operation completes successfully.
If the operation did not complete successfully, we were leaving qpairs
in this weird intermediate state when for all intents and purposes they
had reverted to an uninitialized state. Keeping qpairs in the
uninitialized state until they have been added to a poll group creates a
meaningful distinction between states that can be actionable from the
transport level.

Change-Id: I6de9bc424b393b6fff221aa2f4212aaa91488629
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443471
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 20:39:44 +00:00
Seth Howell
b952668186 rdma: destroy uninitialized qpairs immediately.
Connections in the uninitialized state haven't been added to a poll
group yet, so submitting dummy requests to them will be pointless since
they will never be polled. We need to reject the connection and destroy
the qpair immediately.

Change-Id: Id5dd711882e1ae7c13ae32c06da2285186b00a1b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443470
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 20:39:44 +00:00
Seth Howell
825cac2720 rdma.c: Create a single point of entry for qpair disconnect
Since there are multiple events/conditions that can trigger a qpair
disconnection, we need to funnel them to a single point of entry. If
more than one of these events occurs, we can ignore all but the first
since once a disconnect starts, it can't be stopped.

Change-Id: I749c9087a25779fcd5e3fe6685583a610ad983d3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 20:39:44 +00:00
Darek Stojaczyk
655d54f3f1 nvme: remaning changes related to nvme hooks
Change-Id: I07f3f403bef26a7c3e41b3c9f74e7ba4e378b2cc
Signed-off-by: zkhatami88 <z.khatami88@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/443650
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-12 19:43:02 +00:00
Seth Howell
b6b0a0ba59 rdma: adjust I/O unit based on device SGL support
For devices that support fewer SGE elements than our default values, we
need to adjust the I/O unit size so that we don't ever try to submit
more SGLs than we are allowed to.

Change-Id: I316d88459380f28009cc8a3d9357e9c67b08e871
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 18:46:57 +00:00
Seth Howell
e7beb0d1fd nvme_rdma: don't put req until both send and recv have completed
This prevents us from overrunning the send queue.

Change-Id: I6afbd9e2ba0ff266eb8fee2ae0361ac89fad7f81
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443476
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 18:45:11 +00:00
Seth Howell
92f5548a91 rdma: properly account num_outstanding_data_wr
This value was not being decremented when we got SEND completions for
write operations because we were using the recv send to indicate when we
had completed all writes associated with the request. I also erroneously
made the assumption that spdk_nvmf_rdma_request_parse_sgl would properly
reset this value to zero for all requests. However, for requests that
return SPDK_NVME_DATA_NONE rom spdk_nvmf_rdma_request_get_xfer, this
funxtion is skipped and the value is never reset. This can cause a
coherency issue on admin queues when we request multiple log files. When
the keep_alive request is resent, it can pick up an old rdma_req which
reports the wrong number of outstanding_wrs and it will permanently
increment the qpairs curr_send_depth.

This change decrements num_outstanding_data_wrs on writes, and also
resets that value when the request is freed to ensure that this problem
doesn't occur again.

Change-Id: I5866af97c946a0a58c30507499b43359fb6d0f64
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 18:43:44 +00:00
Shuhei Matsumoto
6574647096 bdev/nvme: Enable PI check and verify PI error for read I/O
Pass IO flags to NVMe write IO and verify PI error when PI error
occurs.

To know the location that caused PI error, checked read with disabling
PRCHK is necessary and is used in this patch.

Change-Id: Id90fb90c4b3ca95840785a4443ff98d637ceb247
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443189
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 17:57:25 +00:00
Shuhei Matsumoto
1b2f095865 bdev: Add spdk_bdev_io_get_io_channel API
Currently struct bdev_io holds io_channel that the I/O was submitted
on through bdev_io::bdev_channel, but bdev_io::bdev_channel is private
in bdev.c and cannot be referenced in other files.

Hence add an new API spdk_bdev_io_get_io_channel API to get io_channel
coniveniently.

Change-Id: Ic2e2fde845d324f7a1637e3c75080727a62de5ec
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443843
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-12 17:57:25 +00:00
Shuhei Matsumoto
39995cdadb bdev/nvme: Enable PI check and verify PI error for write I/O
Pass IO flags to NVMe write IO and verify PI error when PI error
is detectec.

For write I/O, PI error will be already contained in write data
buffer, and no extra I/O is necessary.

Change-Id: I2f2359c4201aded7abccb182c39c00b25ff0bd5f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443188
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 17:57:25 +00:00
Vitaliy Mysak
cab1ea1c05 OCF: add support of dump_info_json
Add some information to vbdev_ocf json config

Change-Id: I3b19e1187e833648fe68e8f56a1cddf8ade9fffb
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/435773
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 15:44:40 +00:00
Shuhei Matsumoto
5cefef8039 bdev/passthru: Use typedef in bdev_module.h for spdk_bdev_unregsister
Including bdev_module.h and using spdk_bdev_unregister_cb instead of
spdk_delete_passthru_complete will follow other bdev modules.
This patch doesn't change any behavior.

Change-Id: Ia236ea37ae22ed5c7740b02d1c5bd37491b9cf9a
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444166
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 15:42:06 +00:00
Shuhei Matsumoto
393ac02b56 bdev/passthru: Use vbdev's name as io_device's name instead of base bdev's name
Using passthru virtual bdev's name instead of base bdev's name as
io_device's name will be meaningful. This patch doesn't change any
behavior.

Change-Id: I33f7aa78c60cd1d9f6a7b36280441bc559f44857
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444165
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 15:42:06 +00:00
Shuhei Matsumoto
742b879691 bdev/nvme: Inline bdev_nvme_queue_cmd() into bdev_nvme_readv/writev()
Subsequent patches will implement PI verification when PI error occurs,
but PI verification will be different between read and write.

Subsequent patches will set IO flags for normal read and write but
will not set IO flags for checked read.

Current nesting stack,
 bdev_nvme_readv/writev
  -> bdev_nvme_queue_cmd
   -> spdk_nvme_ns_cmd_readv/writev
    -> bdev_nvme_queued_done
makes these changes difficult.

Hence this patch inlines bdev_nvme_queue_cmd into bdev_nvme_readv/writev,
adds separate completion function bdev_nvme_readv/writev_done, and
removes enum direction.

This patch doesn't cause any functional change.

Change-Id: I2f97ff21245539c690490d0fc4134d2e0049eddd
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443187
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 09:14:31 +00:00
Shuhei Matsumoto
50d7ad0676 bdev/nvme: Document prchk flags is not set to hot added NVMe controllers
PI check flags is not set to NVMe controllers created by hot plug
handler automatically. Document this behavior for clarification.

Change-Id: I9590d0cb7f53a24c33afd706e222065893d23cb4
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444012
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-12 09:14:31 +00:00
Shuhei Matsumoto
0143ff6b13 bdev/nvme: Set per-controller prchk options by .INI config file
Add "prchk:reftag|guard" to the 3rd item of the TransportID row
in [Nvme] section.

apptag is not supported yet as same as JSON RPC.

These two patches cannot control hot added NVMe controllers, but
we should not set prchk options to hot added NVMe controllers
automatically. Hence the next patch will document this behavior
explicitly.

Change-Id: I74a73ac52779aa50c5b45e20ffb61002e95f33ef
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443835
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-12 09:14:31 +00:00
Shuhei Matsumoto
9562a5c7c1 nvme: Add parse and output strings of prchk flags
The next patch will use the string "prchk:reftag|apptag" as
per-controller prchk options for .INI config file.

Hence add helper functions for them beforehand.

Change-Id: I58c225cc36cc84bf594f108e611028996b5eedb9
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443834
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-12 09:14:31 +00:00
Shuhei Matsumoto
260f9a77c3 bdev/nvme: Set per-controller PRCHK options by JSON RPC
Add prchk_reftag and prchk_guard to construct_nvme_bdev RPC.
In spdk_rpc_construct_nvme_bdev, create prchk_flags based on them
and pass it to spdk_bdev_nvme_create, and in spdk_bdev_nvme_create,
pass it to create_ctrlr.

A single option enable_prchk may be enough but add separate options
for reftag and guard to clarify that apptag is not supported yet.

The next patch will make per-controller PRCHK options configurable
by .INI config file.

Change-Id: I370ebbe984ee83d133b7f50bdc648ea746c8d42d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443833
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-12 09:14:31 +00:00
Shuhei Matsumoto
a827a91e6d bdev/nvme: Add per-controller PRCHK values
Add prchk_flags in struct nvme_ctrlr and set it at creating of
the corresponding controller, and copy it to each bdev of the
controller.

Change-Id: Ie971a0c1539b5419de9e5168ed47ac0e579be2c5
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443186
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 09:14:31 +00:00
Shuhei Matsumoto
6afc800d93 bdev/nvme: Return error when creating NVMe bdev with separate metadata
Bdev don't support APIs that passes metadata not interleaved with
logical block data. So, return error explicitly when creating NVMe
bdev with separate metadata for now.

Change-Id: I0776e72232c8e7758ad11b405e7e4914e779d131
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444011
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-12 09:14:31 +00:00
Shuhei Matsumoto
8736e00f0d bdev/nvme: Set variables about metadata and DIF at NVMe bdev initialization
Metadata location and DIF type are set only if there is metadata, and
DIF location is set only if DIF is enabled.

Change-Id: Ib684b54332820446ff1a0b609f5b4e0b3d42f2f9
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443344
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 09:14:31 +00:00
Ziye Yang
55be9a57a6 nvme/tcp: fix the lvol creation failure issue
The patch is used to fix issue:
https://github.com/spdk/spdk/issues/638

Reason: For supporting sgl, the implementation of
function nvme_tcp_pdu_set_data_buf is not correct.
The translation is not correct for incapsule data
when using SGL. In order not to do the translation
via calling sgl function again, we use a variable
to store the buf.

Change-Id: I580d266d85a1a805b5f168271acac25e5fd60190
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444066
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-12 03:52:48 +00:00
gila
df6b55fd8c bdev: make spdk_bdev_register_module_xxx function names predictable
Currently, the SPDK_BDEV_REGISTER_MODULE() macro uses __LINE__
to generate functions like spdk_bdev_module_register_187().

Typically, this is not a problem as these functions are not called directly
rather, they are only used as constructor functions to load the bdevs during
system startup.

There are languages however, (e.g rust) that require these functions to be
referenced explicitly to prevent them from being removed during the linking phase.

In order to reference them, having the names predictable (and potentially
changed per commit) makes things easier.

Change-Id: I15947ed9136912cfe2368db7e5bba833f1d94b15
Signed-off-by: gila <jeffry.molanus@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/443536
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-11 23:56:53 +00:00
Ben Walker
8abfb06e31 thread: Optionally allow the current time to be passed to
spdk_thread_poll()

This is an optimization if the calling function already knows the
current time.

Change-Id: I1645e08e7475ba6345a44e0f9d4b297a79f6c3c2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443634
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-11 22:58:45 +00:00
Ben Walker
15d3631064 thread: Move stats to thread
Change-Id: I351626355c45d0a3e66fec2688191429781e5952
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443633
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-11 22:58:45 +00:00
paul luse
8a1acca65d bdev/raid: Add strip_size_kb rpc param for create
strip_size as an rpc param is now deprecated and can be removed
in a future release.  Either strip_size or strip_size_kb can be
used but only one of them or the rpc will fail.

Internally we maintain both fields because strip size always
comes in as KB but we convert it to blocks so having both elements
makes it clear for developers what they're looking at.

JSON output includes both strip_size and strip_size_kb.

Fixes #550

Change-Id: I5dc51e8af22eae3d56af8f8d37a564dbaae228fa
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437873
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-11 22:46:33 +00:00
Darek Stojaczyk
1a150069e1 bdev/crypto: use rte_mempool for g_session_mp
DPDK 19.02 requires this mempool to be allocated via
crypto-specific function which returns rte_mempool.
To keep the amount of #ifs minimal, we'll use rte_mempool
unconditionally.

Change-Id: I3a09de41e237e168580bb92b574854e291e68a74
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443785
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-11 19:44:51 +00:00
Darek Stojaczyk
40240f7e9d bdev/crypto: release qpairs on module finish
We setup the qpairs on module init but never
released them. Some memory was leaked, although since
it was allocated with rte_malloc() it couldn't be
picked up by ASAN.

rte_cryptodev API offers rte_cryptodev_queue_pair_setup()
to setup a qpair, but there's no equivalent function to
release it. We have to access the rte_cryptodev structure
directly and call a qpair release function ptr that's
stored inside. It seems very very hacky, but the entire
rte_cryptodev structure is a part of the public API and
the global array of all such devices is an exported
symbol.

Change-Id: I17ac73d1098ca9a92d2dfd52e0f905e2c2b5488f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443561
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-11 19:43:03 +00:00
Seth Howell
ceb32abbd8 nvmf: don't set qpair->group to NULL.
The typical rdma qpair disconnect function goes through the function
_nvmf_rdma_disconnect_retry. When this function was introduced, it was
discovered that we could receive a qpair disconnect event for a given
qpair before that qpair had been assigned to a poll group. In order to
ensure that the disconnect procedure completed properly, we waited on
the current thread in _nvmf_rdma_disconnect_retry for the qpair to be
assigned a poll group before we finally disconnected. see rdma.c:2250.
Since _nvmf_rdma_disconnect_retry was not necessarily called from the
poll group's thread, we relied upon the assumption that the group
variable would never be set back to NULL. See the comment on rdma.c:
2243.

However, in _spdk_nvmf_qpair_destroy we were setting the group back to
NULL. This operation can result in the following set of operations
across multiple threads that prevent a qpair from ever being fully
destroyed.
1. thread 1: receive a disconnect event - call nvmf_rdma_disconnect
2. thread 1: from nvmf_rdma_disconnect call
spdk_nvmf_rdma_qpair_inc_refcnt - setting rqpair->refcnt to 1.
3. thread 2: call spdk_nvmf_rdma_poller_poll.
4. thread 2: in spdk_nvmf_rdma_poller_poll reap a completion with an
error status which causes us to call spdk_nvmf_qpair_disconnect -
rdma:2846
5. thread 2: spdk_nvmf_qpair_disconnect calls _spdk_nvmf_qpair_destroy which sets
qpair->group = NULL
6. thread 1: from nvmf_rdma_disconnect we call
_nvmf_rdma_disconnect_retry which checks if qpair->group == NULL. If
that is the case, we assume that the qpair has not been assigned a group
yet and send ourself a message to call _nvmf_rdma_disconnect_retry again. see rdma.c:2253
7. thread 2: from _spdk_nvmf_qpair_destroy we call
spdk_nvmf_transport_qpair_fini which results in a call to
spdk_nvmf_rdma_close_qpair. which sends dummy send and recvs to the
qpair.
8. thread 2: we call poller_poll and get completions for both the send
and recv dummy requests. This results in a call to
spdk_nvmf_rdma_qpair_destroy.
9. thread 2: spdk_nvmf_rdma_qpair_destroy checks rqpair->refcnt and when
it sees that it does not = 0 (see step 2 above) it returns without
freeing the resources. see rdma.c:629
10. thread 1: we keep churning in _nvmf_rdma_disconnect_retry sending
ourselves messages because rqpair->group is going to be null. Thread 1
never reaches line 2257 where it sends a message to call
_nvmf_rdma_qpair_disconnect. _nvmf_rdma_qpair_disconnect is the function
that decreases the rqpair->refcnt and allows us to make forward progress
on destroying the qpair.

I encountered this issue while trying to disconnect from our target
using the kernel initiator with an x722 NIC. I think the timing on this
bug comes out with that specific configuration because come of the calls
in the disconnect path on thread 1 fail causing it to take longer giving
a chance to the second thread to delete the qpair.

There are really two issues at play here. We don't have a single point
of entry for disconnecting RDMA qpairs, and we rely on the qpair->group
variable never being set back to NULL. This patch addresses the second
issue, and the next patch in the series addresses the first.

Change-Id: I65395d0bbb67edfa7bad2ddc70906606c3d83781
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-11 19:25:51 +00:00
paul luse
d9d4e40dd2 bdev/compress: Add configure option and build dependencies
Includes the required DPDK dependencies for SPDK block Reduce aka
Compression.

Change-Id: Ic1ea3cbeb9373a7700f6f0c2a3194d65d6a34a41
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/429523
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-11 19:23:17 +00:00
Shuhei Matsumoto
f4e58a003a lib/bdev: Expose enabled DIF check types of bdev.
This patch is for DIF check types.

Add enum spdk_dif_check_type to DIF library.
Add a field dif_check_flags to struct spdk_bdev and add
spdk_bdev_is_dif_check_enabled to bdev APIs.

Added enum is intended to improve usability. If no enum, the
caller will have to get raw data of flags and mask each bit.

Change-Id: Ia46a37a9684dc968dcc51963674f0a9963e0cd4d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443339
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-08 23:37:13 +00:00
Shuhei Matsumoto
7bb007d206 lib/bdev: Expose DIF type and location of bdev.
This patch is for DIF settings.

Add fields dif_type and dif_is_head_of_md to struct spdk_bdev and
add APIs spdk_bdev_get_dif_type and spdk_bdev_is_dif_head_of_md to
bdev APIs.

The fields dif_type and dif_is_head_of_md are added to the JSON
information dump.

Change-Id: I15db10cb170a76e77fc44a36a68224917d633160
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443184
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-08 23:37:13 +00:00
Shuhei Matsumoto
96f29261d4 dif: Rename bitmask macros from SPDK_DIF_*_CHECK to SPDK_DIF_FLAGS_*_CHECK
Next patch will introduce enum spdk_dif_check_type for user to
know easily if checking DIF field is enabled or not.

This patch renames bitmask macros from SPDK_DIF_*_CHECK to
SPDK_DIF_FLAGS_*_CHECK to avoid mis-interpretation .

Using FLAGS was derived from SPDK_NVME_IO_FLAGS_PRCHK_* in
include/spdk/nvme_spec.h.

Change-Id: I89e155d047352f54091c14b9251464cd3a72a162
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443338
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-08 23:37:13 +00:00
Shuhei Matsumoto
139da44c43 lib/bdev: Expose metadata size and setting of bdev
To support DIF, bdev will need to expose the following information:
- Metadata format
 - Block size
 - Metadata size
 - Metadata setting (interleave or separate)
- DIF settings
 - DIF type 1, 2, or 3
 - DIF location
- DIF check types
 - Guard check
 - Reference tag check
 - Application tag check

This patch is for the metadata format. Subsequent patches will do for the DIF
setting and DIF check types.

Add fields, md_len and md_interleave, to struct spdk_bdev and add APIs,
spdk_bdev_get_md_size and spdk_bdev_is_md_interleaved, to bdev APIs.

The fields, md_len and md_interleave, are added to the bdev JSON infomation dump.

DIF will be used only in the NVMe bdev module and the upcoming virtual
DIF bdev module first. But additional required storage by md_len and md_interleave
will be very small and they are simple. Hence add them to struct spdk_bdev simply.

Change-Id: I4109f6a63e6f0576efe424feb0305a9a17b9b2e8
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/443183
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-08 23:37:13 +00:00
Ben Walker
a1cea6f48f bdev/aio: Set minimum events to 0 in io_getevents
The timeout is set to 0, so it never waits anyway. But
this should be 0.

Change-Id: I8b4058017a91b647ea9324f1474a732921c389f0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443647
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-08 16:36:18 +00:00
Wojciech Malikowski
5f959d5f0c bdev/ftl: write_config_json support
Change-Id: Ifbd2b61ef38b216a8c7071f1206c0370dbe496e6
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442980
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2019-02-08 16:35:34 +00:00
Wojciech Malikowski
8cda50fd96 lib/ftl: Remove NULL pointer checks in external APIs
Change-Id: Ia2d89e7bf350544ced9a8ae40821634dedb3a741
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443384
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-08 16:35:34 +00:00
Ben Walker
7a4d6af182 nvmf/tcp: Stay in AWAIT_PDU_READY state until atleast 1 byte arrives
This doesn't fix any bug, but it makes more sense to leave the qpair
in the NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY state until it
receives at least one byte.

Change-Id: Ic5f34a733a80b58f65a1334fae7e07dbded2b3d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-08 16:35:12 +00:00
Darek Stojaczyk
f22de50bee env/dpdk: remove rte_pci_bus extern declaration
It simply isn't used anywhere.

Change-Id: Ie055c15d86563be710d25e502660f79efcb67a23
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443510
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-06 20:22:16 +00:00
Darek Stojaczyk
95200f5bac iscsi: remove unused mobj fields
The `len` field wasn't used at all and `reserved` is
no longer needed after we removed the paddr in the
previous patch.

This effectively cuts down spdk_mobj struct size by half.

Change-Id: Ica39f3a30e14ec1275a87d827dc41df5df9cf623
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443483
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-06 20:21:48 +00:00
Darek Stojaczyk
473d5a8f14 iscsi: don't store paddr in pdu buffers
The physical addresses in iSCSI are completely unused
as iSCSI does not perform any DMA on its own.

Change-Id: I350037b708a9f36f423e6ca6f7c822d8b6b95116
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443482
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-06 20:21:48 +00:00
Darek Stojaczyk
1dc9e7024b vhost/rpc: remove unnecessary if in the add_vhost_scsi_lun RPC
We explicitly checked for one of the strings in the
parsed RPC request even though it's required for the
entire request to parse successfully. The extra check
is now removed.

Change-Id: I19c446786e4ac88b88f14e18dc5258f31b1a87f1
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443317
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-06 19:04:21 +00:00
Darek Stojaczyk
244a619fa9 vhost/rpc: remove dynamic memory allocations for RPC requests contexts
Since we no longer use external events and we access
all vhost devices synchronously, we no longer need
to dynamically allocate our RPC request contexts. They
can be put just on the stack.

Change-Id: Ie887607b67451aba4f3404c4b9551e6424335beb
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440380
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
2019-02-06 19:04:21 +00:00
Darek Stojaczyk
29944a2f33 vhost: remove vhost external events
Removed their various usages inside the core vhost code
together with the external events themselves. External
events were completely replaced by spdk_vhost_lock()
and spdk_vhost_dev_find().

Change-Id: I1f9d0268c27a06e2eecab9e7d179b1fd54d4223d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440379
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-06 19:04:21 +00:00
Darek Stojaczyk
dd4f161b7e vhost/rpc: remove any usages of the external events
Replaced them with inline code that performs exactly
the same but is shorter and easier to follow. External
events were replaced by spdk_vhost_lock() and
spdk_vhost_dev_find().

Change-Id: Id46a619c592c20a573664b54efc097489e9bb893
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440378
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-06 19:04:21 +00:00
lorneli
4b6621d08e nvme/pcie: mark infrequent cases as unlikely in submission path
Currently infrequent cases in request completion path are marked as
unlikely. This patch applies that to submission path.

These cases are infrequent and marked using unlikely marco:
a. The sq tail reaches the end of queue.
b. The sq tail equals to sq head. (never happen if FW runs correctly)
c. The qpair is admin queue.

Change-Id: I8b873a18615788f2efbf7c683aad710c7007a082
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/443451
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-06 18:37:40 +00:00
Ben Walker
63de221bf6 nvmf/tcp: Eliminate management channel in favor of poll group
The management channel was used in the RDMA transport prior
to the introduction of poll groups and made its way over to
the TCP transport when it was written. Eliminate it in favor
of just using the poll group.

Change-Id: Icde631dd97a6a29190c4a4a6a10a0cb7c4f07a0e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442432
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
2019-02-06 16:02:43 +00:00
Ben Walker
993c4a0799 nvme: Add a function to query controller memory buffer support
Change-Id: Id539f4eaabe2038d4925eaa140864c0abd9b2649
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442635
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
2019-02-06 16:01:56 +00:00
Ben Walker
4ea3e63291 bdev/aio: Remove list of channels on channel group
This was only temporarily required for polling. With
a per-group aio ctx, it isn't needed anymore.

Change-Id: Ie59b50a4700f0f99dea470f857d187ac656dd229
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443467
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-06 15:15:22 +00:00
Ben Walker
6bb59dff48 bdev/aio: Move aio context onto group channel
We only need one aio context for the entire set of channels
sharing a thread.

Change-Id: I1143247901586efe50530b28323ddb923bc6b242
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443314
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-06 15:15:22 +00:00
Ben Walker
318bd602ac bdev/aio: Hold pointer to bdev_aio_group_channel in bdev_aio_io_channel
This is marginally more convenient.

Change-Id: I9989d687b80051ccb2e07edc5e1efdbca75e8716
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2019-02-06 15:15:22 +00:00
Ben Walker
0ed2a120b3 bdev/aio: Keep a pointer to the channel on the aio_task
This will be used later.

Change-Id: I12b07756a13d03a34c9705306d720c1db7ecb15c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443312
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2019-02-06 15:15:22 +00:00
Ben Walker
7f534fca6a bdev/aio: Remove epoll support
This wasn't actually necessary. The next patch in this series will
change the way aio is used such that only one aio context is
polled for the entire group of channels on a single thread.

Change-Id: I05c4d824d9c63a51c8a2d608d84c184f249f66d7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443311
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-06 15:15:22 +00:00
Ben Walker
7ffbf85dab bdev/aio: Store channels in a list in the group channel
This isn't used just yet, but will be necessary temporarily
during this patch series.

Change-Id: I7f04426c27e3fe0417e2f60bac28217fa44c0cb2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443310
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-06 15:15:22 +00:00
Ben Walker
6ced440ad6 bdev/aio: Move definition of bdev_aio_group_channel up
Move it next to the other channel definition.

Change-Id: I9ec33c135836d3dc326abe4ce7588e7a2eff77d4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443309
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-06 15:15:22 +00:00
Ben Walker
669f1ea74a bdev/aio: Move structure definitions into bdev_aio.c
These didn't need to be visible.

Change-Id: I337a02802cac4431b4abd9a922408d4147801565
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443308
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-06 15:15:22 +00:00
Ben Walker
d92e0a403b bdev/aio: Eliminate bdev_aio_initialize_channel
Small static function only called from one place, so
just inline it.

Change-Id: Ibc54f790da55dd1635d81181208b1d506550ca9c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443307
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-06 15:15:22 +00:00
Ben Walker
71889403e2 bdev/aio: Move epoll include inside bdev_aio.c
It does not need to be in the header file.

Change-Id: I5c489de81e48b11d02b66cbdd6d9ac05eae16429
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443306
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-06 15:15:22 +00:00
Seth Howell
41cd5ff4fb rdma: fix max_read_depth_definition.
max_read_depth should be based on max_qp_init_read_atomic, or the
maximum number of read values that the initiator will accept as
outstanding.

The device attributes object contains values for both the initiator
(remote side) and the target (local side). All attributes with the name
init in them are meant to correspond to the initiator. The
qp_read_atomic value represents the number of reads and atomic
operations that can have this device as the target. qp_init_read_atomic
represents how many read operations the initiator has said that we can
have outstanding that have the initiator's rdma device as the target.

Since this number represents how many outstanding reads we will send to
the initiator at once, we should use the qp_init_read_atomic value.

Change-Id: Iacc044e8321080de8accd9128ac3777bbb948afc
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442409
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-05 18:04:04 +00:00
Wojciech Malikowski
68b49203a7 lib/ftl: Change order of relocation queues processing
ftl_process_reloc should process free_queue in first place
(this will start read operations) and then process write queue.

Change-Id: I3a44b3651cc1526f8a024330472f94aa8d818193
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443403
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-05 18:02:15 +00:00
Wojciech Malikowski
7f21e8c571 lib/ftl: Free IO after lba map read during relocation
Change-Id: Id44f9de4500ec2be45aa4203c5945b1501fbdb21
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443236
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-05 18:02:15 +00:00
Jim Harris
ff182630dc bdev: explicitly mark _spdk_bdev_io_submit as inline
This function gets used as a function pointer, which
seems to keep the compiler from trying to inline the
function.  Stack manipulation was showing up in the
perf profile pointing to this.  Marking the function
as inline gets it actually inlined in the hot I/O
path.

Improves bdevperf microbenchmark from 78M to 85M IO/s.
Cores are virtually identical - 11.4M on core 0 and
10.4-10.6M on remaining cores.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iadced071dfc07fc09db6da3571c930988b2dc3fd

Reviewed-on: https://review.gerrithub.io/c/443278
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-05 17:25:31 +00:00
Jim Harris
ab0a454dc6 bdev: insert freed spdk_bdev_io to the head of the cache
This keeps the hottest structures at the head of the
cache and helps improve performance.

Improves microbenchmark (8 null bdevs on 8 lcores,
bdevperf seq read with qd=1) from 67M to 78M on my
Xeon E5-v3 system.  Core 0 performance remains about
the same (10.7-10.8M) but others cores improve from
around 8.0M each to 9.4M.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia3ccf94ab39b6f911127f0bd1016e352027b11fc

Reviewed-on: https://review.gerrithub.io/c/443277
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-05 17:25:31 +00:00