Commit Graph

4653 Commits

Author SHA1 Message Date
Wojciech Malikowski
0f12c406d1 lib/ftl: Propagate ENOMEM error during read to upper layer
ENOMEM is expected when nvme_qpair will be out of resources.
In such a case ENOMEM shall be propagated to allow upper (bdev)
layer proper handling.

Change-Id: Ie647c2d3efff24a8de949a22ac42a31dfd0e78b7
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445580
Reviewed-by: Jakub Radtke <jakub.radtke@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-25 22:37:13 +00:00
Jim Harris
6ff6f6d6f8 blob: pass NULL or SPDK_BLOBID_INVALID when bserrno != 0
When an operation fails, we shouldn't pass a handle or
a 'valid' blob ID to the caller's completion function.
The caller *should* ignore it when bserrno != 0, but
it's best to not take that chance.

Fixes #685.

Note: #685 seems to have a broader issue related to
a possibly locked NVMe SSD in the submitter's system.
This only fixes the assert() that was hit.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3fb3368ccfe0580f0c505285d4b1e9aca797b6a6

Reviewed-on: https://review.gerrithub.io/c/445941
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-25 07:06:04 +00:00
GangCao
120825c91c QoS: enable rate limit when opening the bdev
There are some cases that virtual bdev open and close
the device and QoS will be disabled at the last close.
In this case, when a new bdev open operation comes again,
the QoS needs to be enabled again.

Change-Id: I792e610f4592bad1cac55c6c55261d4946c6b3e2
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442953
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-25 04:35:14 +00:00
Zahra Khatami
a55b2109bb nvmf: remaning changes related to nvmf hooks
Change-Id: I6780fa43cebd9f48d1ae0ea6fbeb92a95c4dfa15
Signed-off-by: zkhatami88 <z.khatami88@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/443653
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-22 21:16:36 +00:00
Wojciech Malikowski
1442b5f28a lib/ftl: Fix size of write buffer submission queue
SPDK ring size used for write buffer submission queue
must be increased if required number of batches is a
power of two.

Change-Id: I9b9f885064cf6f0f5fe94b0ed4f9d49a4e5c0cd0
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445721
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-22 19:13:40 +00:00
Changpeng Liu
de3eb36a61 bdev/nvme: exit controller removal callback if the destruction was started
For real PCIe drives, if we removed one drive, existing hotplug
monitor will trigger the remove callback twice, there is one
workaround for vfio-attached device hot remove detection which
will also trigger the hot removal callback.  For now we add
the check in the bdev_nvme layer so that coredump will not happen.

Fix issue #606.

Change-Id: I0605fbdf391fed20c4aa9a2d54b4f059f29dc483
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-22 18:32:17 +00:00
Darek Stojaczyk
5c1c946c7a bdev/crypto: compile with DPDK 19.02
It seems like DPDK 19.02 has split the "session mempool"
into two separate mempools but this isn't really described
in the DPDK release notes, so this patch only makes our
crypto code behave just like DPDK crypto examples.

rte_cryptodev_queue_pair_setup() no longer accepts
a separate mempool parameter but instead requires it
to be passed through a new field in struct
rte_cryptodev_qp_conf, which is also passed as a param
to rte_cryptodev_queue_pair_setup(). It's referred to as
"session private mempool" instead of "session mempool",
which makes some sense since we already use
rte_cryptodev_sym_get_private_session_size() (with the
word "private" in name) to calculate its size.

The other mempool - "session mempool" - now has to be
allocated with rte_cryptodev_sym_session_pool_create()
instead of regular rte_mempool_create().

Change-Id: I3bc6185855988b864ca59bc1972beaf4f7ea8925
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443738
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-22 18:31:52 +00:00
Seth Howell
b38e3a60c6 rdma: change the logic of rdma_qpair_process_pending
I think this simplifies the process a little bit.

Change-Id: Icc87a59c9f6fd965ef35531975b7036d85c4bc95
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445916
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-22 18:31:02 +00:00
Seth Howell
80eecdd881 rdma: use an stailq for incoming_queue
Change-Id: Ib1e59db4c5dffc9bc21f26461dabeff0d171ad22
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445344
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-22 18:31:02 +00:00
Seth Howell
bfdc957c75 rdma: remove the state_cntr variable.
We were only using one value from this array to tell us if the qpair was
idle or not. Remove this array and all of the functions that are no
longer needed after it is removed.

This series is aimed at reverting
fdec444aa8 which has been tied to
performance decreases on master.

Change-Id: Ia3627c1abd15baee8b16d07e436923d222e17ffe
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445336
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-22 18:31:02 +00:00
Seth Howell
04ebc6ea28 RDMA: Remove the state_queues
Since we no longer rely on the state queues for draining qpairs, we can
get rid of most of them. We cn keep just a few, and since we don't ever
remove arbitrary elements, we can use stailqs to perform those
operations. Operations on Stailqs carry about half the overhead as
operations on tailqs

Change-Id: I8f184e6269db853619a3581d387d97a795034798
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445332
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-22 18:31:02 +00:00
Changpeng Liu
30bbf3d944 nvme: move probe context as a internal data structure
Users should not access the internal probe context fields when
using the asynchronous probe API, so change spdk_nvme_probe_async()
to let it can only return the probe context pointer.

Change-Id: I0413c2d8db6cbe4539ad80919ed34dd621a9df70
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445870
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-22 18:13:39 +00:00
Shuhei Matsumoto
8696cd4288 dif: Add seed value for guard to avoid 0 in case of all zero data.
Allow user to add seed value for guard compuation to DIF context.
This will avoid the guard being zero in case of all zero data.

NVMe controller doesn't support seed value for guard computation
explicitly, and hence if we want to use such a seed value in
NVMe controller, we have to format metadata more than 8 byte,
and add seed value into the reserved metadata field.

But some popular iSCSI/FC HBAs and SAS controllers have supported
seed value for guard computation, and so supporting seed value
in the SPDK DIF library is very helpful for some use cases.

Hence this patch makes the DIF library possible to specify seed
value for those use cases.

Change-Id: I7e9e87cb441bf263e64605c7820409fdc22dd977
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444334
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
2019-02-22 17:52:51 +00:00
Jim Harris
7739a1f338 vhost: use mmap_size to check for 2MB hugepage multiple
Older versions of QEMU (<= 2.11) expose the VGA BIOS
hole (0xA0000-0xBFFFF) by specifying two separate memory
regions - one before and one after the hole.  This results
in the "size" not being a 2MB multiple.  But the underlying
memory is still mmaped at a 2MB multiple - so that's what
we should be checking to ensure the memory is hugepage backed.

Fixes #673.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1644bb6d8a8fb1fd51a548ae7a17da061c18c669

Reviewed-on: https://review.gerrithub.io/c/445764
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-22 10:24:16 +00:00
Darek Stojaczyk
b42cf6eaff env/dpdk: allow changing DPDK loglevels
spdk_env_opts->env_context may now contain a DPDK-specific
string that will be appended directly into rte_eal_init().
It can be used to e.g. override the default EAL loglevel,
which was hardcoded to RTE_LOG_NOTICE so far.

This is primarily meant to be used during development.

As a test for this feature, the vtophys test app will now
set the highest possible EAL loglevel which will give us
a ton of additional debug logs.

Note: the opts->env_context field is implementation-specific
and hence the vtophys app needs to check if it's run with
our env_dpdk. As SPDK_CONFIG_ENV is a raw text not even
surrounded with quotation marks, the vtophys app needs to
do a bit of #define magic to make it a string.

Change-Id: I0b2196770e5b59a6c33d0170337c34f9f8b8466e
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445111
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-22 08:49:45 +00:00
Darek Stojaczyk
9858408b55 env/dpdk: fix potential memleak on init failure
When we were trying to push a newly allocated string
into the arg array and the array realloc() failed,
the string we were about to insert was leaked.

Change-Id: I31ccd5a09956d5407b2938792ecc9b482b2419d1
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445149
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-22 08:49:45 +00:00
Shuhei Matsumoto
df99e28158 nvmf: Expose bdev's PI setting to NVMe-oF Initiator
This patch expose backend's bdev's PI setting to the corresponding
NVMe-oF Initiator by Ideintify command, and removes the check if
block size is 512 multiple.

These change enables NVMe-oF Initiator to send extended LBA payload.

Change-Id: Ia7aa8332d36f056872a515b6da90c83112edb909
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/445056
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-22 00:36:55 +00:00
lorneli
815f82b17b nvme: mv submit_tick assignments to generic qpair code
Move req->submit_tick assignments from specific transports to generic
qpair code.

Check whether submit_tick has been assigned before doing the actual
assignment, because a request may be submitted several times and the
original submit_tick shouldn't be covered.

Change-Id: I2de8018dc21763eb5a19bb9d48dfbdef764b036e
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/444702
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-21 20:29:59 +00:00
Shuhei Matsumoto
95fc5d3581 iscsi: Remove SPDK_ISCSI_MAX_SEND_DATA_SEGMENT_LENGTH
In iSCSI, SPDK_ISCSI_MAX_SEND_DATA_SEGMENT_LENGTH was an alias
of SPDK_BDEV_LARGE_BUF_MAX_SIZE.

iSCSI had used both interchangeably.

SPDK_BDEV_LARGE_BUF_MAX_SIZE means the buffer size of the large
buffer pool in generic bdev layer, and will be changed to be
configurable.

SPDK_ISCSI_MAX_SEND_DATA_SEGMENT_LENGTH had been used to negotiate
MaxRecvDataSegmentLength with iSCSI initiator and to split large
read data, but both are determined by not iSCSI target but generic
bdev layer.

Hence this patch replaces SPDK_ISCSI_MAX_SEND_DATA_SEGMENT_LENGTH
by SPDK_BDEV_LARGE_BUF_MAX_SIZE.

Change-Id: I822a5203a5092fe8b2d1ca3f93423f1acbfc782e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444539
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-21 18:38:07 +00:00
Shuhei Matsumoto
d0efddd281 iscsi: Move macro constant DEFAULT_MAX_QUEUE_DEPTH to the appropriate location
This macro constant is not related with data size and should be moved
to the separate location.

Change-Id: I73b337f5750c39d1f87591c2e372664019e50b95
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444545
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-21 18:38:07 +00:00
Ziye Yang
2da86de69f nvmf/tcp: fix error message printing in spdk_nvmf_tcp_qpair_set_recv_state
If the current recv_state of qpair is same with the state to be set,
we will print error message. And checked the current code,
we should add a check to avoid this.

Change-Id: I49334f637c48e565e785d1fe6d0f000e18b2048a
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445653
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-21 18:04:10 +00:00
heyang
7cd3a6f5e0 nvme: add memory barrier in completion path for arm64
Add a memory barrier for arm64 to prevent possible reordering
of tracker and cpl access,
because arm64 has less strict memory ordering behavior than x86.

Change-Id: I0a8716f7bfeffb0bbce27ee3174e214c8e4566b4
Signed-off-by: heyang <heyang18@huawei.com>
Reviewed-on: https://review.gerrithub.io/c/442964
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-21 18:02:31 +00:00
Changpeng Liu
f8dfbc5e9f bdev/nvme: set hotplug poller with default period value
If users didn't set the "HotplugPollRate" field, the value
will be set to NVME_HOTPLUG_POLL_PERIOD_MAX, which isn't
aligned with our design purpose.

Change-Id: I9795d7a16a1cc44ed4de7c40f376c563d977b455
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445077
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-21 09:05:55 +00:00
Robert Bałdyga
ec50de0957 bdev/ocf: Add missing error handling in bottom adapter
Signed-off-by: Robert Bałdyga <r.baldyga@hackerion.com>
Change-Id: Iffa18e578511ad656cc4aae097f0066c0a2709eb
Reviewed-on: https://review.gerrithub.io/c/445032
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-21 07:39:59 +00:00
Ziye Yang
a1c5442d16 nvmf/tcp: remove the tqpair->group = NULL statement
Purpose: solve the coredump issue for the buffer
return later in spdk_nvmf_tcp_request_free_buffers.

If keep this statement, we cannot return the buffer
to the polling group.

Change-Id: Ib5c95ba54b37540950e654110fe6317cab507076
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/445435
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-21 03:37:47 +00:00
Ziye Yang
3a486ab6be nvme/tcp: remove the unnecessary active_r2t_reqs
Change-Id: I3ce4c8cfce5f3e7c2e05b4fa11322805a08ec688
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/445240
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-20 21:47:02 +00:00
Ziye Yang
14e1d0c747 nvme/tcp: call nvme_ctrlr_add_process in construct function.
Purpose: to make the timeout work for NVMe TCP transport,
we miss this for TCP transport.

Change-Id: Iab4af988cc4796b4d6d98430453f3dbce1fcf313
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/445117
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-20 20:27:25 +00:00
paul luse
ba82b412cb bdev/crypto: fix error path memory leak in driver init
This patch refactors driver init and in doing so eliminates the mem
leak described in the GitHub issue.  Also it is now consistent with
how the pending compression driver does init.

Fixes #633

Change-Id: Ia2d55d9e98fb9470ff8f9b34aeb4ee9f3d0478f5
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442896
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-20 20:22:16 +00:00
Ziye Yang
73c5108684 bdev/nvme: Enable the timeout function if timeout value is provided
We should not add addtional check since we already have this
option in timeout_cb function, the addtional check is unnecessary.

Change-Id: I77c89303155e0c14072a1838994f9e76a0ffc0f4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/445319
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-20 20:20:21 +00:00
Ziye Yang
7bf5e1dee3 nvme/tcp: Implement nvme_tcp_qpair_fail function.
This patch is used to implement this function.
Since we need to call nvme_tcp_req_complete in this
function, so we need to adjust the location of the
nvme_tcp_rep_complete funtion.

Change-Id: I5fc3693aec8dc166ac1eb03babcd2d73d7b00e63
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444489
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
2019-02-20 20:18:46 +00:00
Shuhei Matsumoto
e7dc23696b scsi: Inline spdk_bdev_scsi_read/write into spdk_bdev_scsi_read_write
In this patch series, spdk_bdev_scsi_read and spdk_bdev_scsi_write
became almost identical. Hence squash them into spdk_bdev_scsi_read_write.

Change-Id: Ibbaddf74c1bf2dac37a0133eac27086af650a061
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444780
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-20 20:17:56 +00:00
Shuhei Matsumoto
07e9a00b60 scsi: Use spdk_bdev_writev_blocks instead of spdk_bdev_writev
This is in a effort to consolidate SCSI read and write I/O
for the upcoming transparent DIF support.

Previously conversion of bytes and blocks are done both in
SCSI layer and BDEV layer. After the patch series, conversion is
consolidated into SCSI layer.

Change-Id: Ib964a41ec22757f2a09cea22f398903f78d0781f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444779
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-20 20:17:56 +00:00
Shuhei Matsumoto
56e12b0071 scsi: Use spdk_bdev_readv_blocks instead of spdk_bdev_readv
This is in a effort to consolidate SCSI read and write I/O
for the upcoming transparent DIF support.

Previously conversion of bytes and blocks are done both in
SCSI layer and BDEV layer. After the patch series, conversion is
consolidated into SCSI layer.

About conversion from bytes to blocks, we don't expose bdev API
spdk_bdev_bytes_to_blocks and but create private helper function
_bytes_to_blocks because we will use not block size but data
block size when we support transparent DIF feature.

Change-Id: I37169c673479c92e027e2507a0e54a1e414b43e1
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444778
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-20 20:17:56 +00:00
Shuhei Matsumoto
05f30ca396 scsi: Remove the param xfer_len and refactor spdk_bdev_scsi_read/write
The last parameter xfer_len of spdk_bdev_scsi_read is not used,
and of spdk_bdev_scsi_write is used only to check task->transfer_len.

Hence remove the last parameter xfer_len from spdk_bdev_scsi_read/write
and extract the check operation from spdk_bdev_scsi_write and insert
it into spdk_bdev_scsi_read_write.

Additionally, remove a debug log because xfer_len is not passed to
spdk_bdev_scsi_write anymore. Hopufully, this will not degrade any
maintainability.

On top of this, factoring out the operation to convert byte to
block in spdk_bdev_scsi_read/write be done.

Change-Id: I35faca269a9c4a7f15d27e8e61b6a1b809a36b3f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-02-20 20:17:56 +00:00
Jim Harris
c7598147ff bdev/iscsi: remove unused master_ch
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibef9c3aebea5aa4516e05bf48254c915cb4c8a34

Reviewed-on: https://review.gerrithub.io/c/445352
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-20 17:02:02 +00:00
Jim Harris
608a2d5875 env/memory: add inline tag to spdk_mem_map_translate
This helps ensure it gets inlined in the spdk_vtophys
code path, now that spdk_vtophys is defined in the same
compilation module.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0d0d9bba4295f0d9a7c0657834aa5d39f3b682d8

Reviewed-on: https://review.gerrithub.io/c/445354
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-20 16:41:46 +00:00
Jim Harris
f74d069ec0 env_dpdk: move vtophys.c contents to memory.c
CPU profiling on workloads with intensive vtophys
operations (i.e. very small CB-DMA transfers) exposed
overhead introduce by spdk_vtophys having to call
spdk_mem_map_translate in a different compilation
unit.  Let's just move the vtophys.c contents into
memory.c so that spdk_vtophys can inline
spdk_mem_map_translate and avoid this extra overhead.

This of course breaks the memory and vtophys unit
tests, so some additional changes are needed there
to keep everything linking correctly.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I295ed5f441d3eec7abdbc9d881c49d2174ec9f48

Reviewed-on: https://review.gerrithub.io/c/444975
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-20 16:41:46 +00:00
Ziye Yang
8bbf0391f6 event: Change the base to 0 when calling strtol
Previously, we can -p + hex value(e.g., 0x1) to assign the master core
and start the NVMe-oF or iSCSI target app.

However now it is not supported and prints error. I checked
the code, it only supports transformation with Decimal format,
so chaning the base to 0 to make it supporting other formats.

Change-Id: I82510ba0cef47b5593484b4fd3490f85c93cf6a5
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444830
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-18 07:55:36 +00:00
Jim Harris
b01a393382 ioat: reduce completion writebacks
Do not set the completion_update bit except on
the last descriptor built before the dmacount doorbell
is written.  This allows much better batching of
completions (to match batching of the submissions).

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idd0281fb2e9e1ad2eb0f65f097c54fc051dfd935

Reviewed-on: https://review.gerrithub.io/c/444974
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-18 07:44:17 +00:00
Jim Harris
c258d73feb ioat: add APIs to only build descriptors
Add spdk_ioat_build_copy and spdk_ioat_build_fill
which mirror the existing spdk_ioat_submit_copy
and spdk_ioat_submit_fill.  These new functions
*only* build the descriptors in the ring - they do
not write the doorbell.  This enables batching
which can significantly improve performance by
reducing the number of MMIO writes.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia3539f936924b7f833f4a7b963d06ffefa68379f

Reviewed-on: https://review.gerrithub.io/c/444973
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-18 07:44:17 +00:00
Jim Harris
ec95646a61 ioat: make ioat_flush a public spdk_ioat_flush API
This will enable batching of doorbell writes in
future commits.  For now, just make the API public.

This is the first in a series of patches that
drastically improves performance for high queue
depth CB-DMA workloads.  Some basic tests on
my Xeon E5-v3 platform shows about 4x improvement
for 512B transfers.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia8d28a63f5020ae8644c1efdec7f68740bb6920c

Reviewed-on: https://review.gerrithub.io/c/444972
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-18 07:44:17 +00:00
Ziye Yang
d4875ed89e nvme/tcp: add nvme_tcp_qpair_check_timeout function.
To enable the timeout function.

Change-Id: Id5c40848957743683b6a5c2d085e7f777f14497d
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/444803
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 22:03:44 +00:00
Xiaodong Liu
36e8c20fe9 nbd: avoid impact to device setup by other task
Use NBD_SET_SOCK to check whether the nbd device is setup
by other process or whether nbd kernel module is ready
before other nbd ioctl operations. This can avoid bad
influence to the nbd device setup by other process.

Change-Id: Ic12acbfddb8c4388e25731c39159b1ce559b8f23
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444805
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 22:02:48 +00:00
Xiaodong Liu
d8f5fead29 nbd: avoid unlimited wait for device busy
The ioctl NBD_SET_SOCK can return EBUSY on conditions not
only the kernel module hasn't loaded entirely yet, but
also the nbd device is setup by another process, which will
lead the poller's infinite polling.
This patch will wait only 1 second if device is busy.

Change-Id: I8b1cfab725cba180f774a57ced3fa4ba81da2037
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444804
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 22:02:48 +00:00
wuzhouhui
0238b5c42a bdev/ftl: unlock g_ftl_bdev_lock before unregister ftl_bdev
There is no need to lock g_ftl_bdev_lock when unregister a ftl_bdev.
Besides, the destructor of ftl_bdev will lock it again.

Change-Id: I99870483183879d9422584dbac6e154f605daea8
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/444794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 21:42:58 +00:00
Wojciech Malikowski
b9e462e4a6 lib/ftl: Fix band's metadata inconsistency with L2P
Added check before write submission to indicate if
LBA was update in meantime. In such case don't set band's
metadata and rwb entry cache bit. Previous implementation
invalidates such address during write completion and could
cause that inconsistent lba map was stored into disk.

Change-Id: I4353d9f96c53132ca384aeca43caef8d11f07fa4
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444403
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:42:14 +00:00
Darek Stojaczyk
5d07cfad54 vhost/scsi: handle io_channel allocation failure
We assumed io_channel allocation always succeeds, but
that's not true. Doing I/O to any vhost session that
failed to allocate an io_channel would most likely
cause a crash.

We'll now detect io_channel allocation failure and
print a proper error message. The SCSI target for
which the channel allocation failed simply won't be
visible to the vhost master. All I/O to that target
will be rejected.

We should probably report the error to the upper
layer and either prevent the device from starting
or fail the SCSI target hotplug request. But for now
let's just prevent the crash.

Change-Id: I735dfb930d8905f70636a236b4fa94288d0aaf3a
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:41:00 +00:00
wuzhouhui
6b0d7b82c9 ocssd: hold lock when calling nvme_ctrlr_submit_admin_request
nvme_ctrlr_submit_admin_request() will access admin queue, and we
should hold ctrl->ctrlr_lock when access it.

Change-Id: Iff576fe5e14e854eb38dbc64d6c6d9ec1ba17056
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/444793
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:27:58 +00:00
kreuzerkrieg
64faa14d6e nvme: make the completion status string accessible from external applications
Signed-off-by: kreuzerkrieg <kreuzerkrieg@gmail.com>
Change-Id: Ifdcf7ab7ce7e7449a33d52f8308f537b0e26a238
Reviewed-on: https://review.gerrithub.io/c/444519
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-15 21:11:28 +00:00
Changpeng Liu
5a26346a71 nvme: move condition check into nvme_init_controllers()
Also use the same style condition check for secondary process
with PCIE type.

Change-Id: I93c83126145255887914ef5efea1a493c8f7f767
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444492
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-15 21:04:19 +00:00