343 Commits

Author SHA1 Message Date
wuzhouhui
d1399f4410 bdev: remove unnecessary if when destroy shared_resource
When a channel is created, the fields checked by
_spdk_bdev_channel_destroy_resource() are always set to non-null values.
Removing these unnecessary if statements exposes problems more easily
if something goes wrong.

Change-Id: I2d505c87176d4d49eb1528a258e4bea6477e0fe6
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/438799
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-01-07 06:03:52 +00:00
lorneli
02be32d482 bdev: optimize conversion of bytes to blocks
Since division is more expensive than a right shift, use a right
shift instead of division in spdk_bdev_bytes_to_blocks when the
blocklen of the bdev is a power of two.

Change-Id: Ib3dbc792e86582bba30b3dc028efbd12c69075ba
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/438318
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-01-02 08:31:59 +00:00
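A minimal sketch of the idea in 02be32d482, not the SPDK implementation: when the block length is a power of two, the division reduces to a right shift. The names toy_is_pow2, toy_log2_pow2, and toy_bytes_to_blocks are hypothetical, and a real implementation would presumably compute the shift once per bdev rather than on every call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when x is a non-zero power of two. */
static bool toy_is_pow2(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: log2 of x, valid only when x is a power of two. */
static uint32_t toy_log2_pow2(uint32_t x)
{
    uint32_t shift = 0;

    while (x > 1) {
        x >>= 1;
        shift++;
    }
    return shift;
}

/* Convert a byte count to whole blocks, shifting when blocklen allows it. */
static uint64_t toy_bytes_to_blocks(uint64_t bytes, uint32_t blocklen)
{
    assert(blocklen != 0);
    if (toy_is_pow2(blocklen)) {
        return bytes >> toy_log2_pow2(blocklen);
    }
    /* Fall back to division for block lengths such as 520. */
    return bytes / blocklen;
}
```

Both paths round down to whole blocks, so callers see the same result whichever branch is taken.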
GangCao
70a3488657 bdev/qos: enable and disable when the QoS thread is not set
If the QoS thread has not been properly initialized yet, the
regular QoS enabling process is still needed to notify all the
channels, and QoS must also be disabled properly. The channel
and poller related stuff also needs to be handled together
with the thread.

Change-Id: Ifc2b2cdfb1181aa6418ad1d43ae5905c0c317549
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/437519
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-12-19 18:17:02 +00:00
Liu Xiaodong
0e7ca66922 lib/trace: show specific usage of trace mask
Previously, the only way to find out which mask bit is used for a
specific trace group was to check the source code. Now the usage
message lists each trace group with its tpoint group mask bit.

Change-Id: I7a85fe9c0885f1919f6ffbdc97dab81f1986fb07
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/435448
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-11-30 14:52:28 +00:00
Piotr Pelplinski
676717e4da bdev: calculate tsc_diff in bdev_io_complete
This will be required in following histogram patches.

Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I2eee6629243b7a4838a80dc1de33ae485c58081e

Reviewed-on: https://review.gerrithub.io/433874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-11-29 03:59:32 +00:00
Tomasz Zawadzki
6177d220fb bdev/qos: assert io channel when acquiring new reference
Change-Id: Ib546f14be158404ec14067e0f4f1ecb60def0f02
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/434321
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-11-26 18:19:26 +00:00
Andrey Kuzmin
af034b6de4 bdev: unregister bdevs top-down during shutdown.
There are some use cases such as multipath and RAID expansion where a
vbdev could have been registered before one of its base bdevs.

Currently we unregister bdevs at shutdown in reverse order of their
registration.  Continue to do that in general, but skip any bdev that
is still claimed.  Any bdevs skipped in this way will eventually be
unregistered once any bdevs that have claimed it have completed
unregistration.

Change-Id: Iafde9558430bc5ce56e8608ef50bcb2b5fbfbf71
Signed-off-by: Andrey Kuzmin <akuzmin@jetstreamsoft.com>
Reviewed-on: https://review.gerrithub.io/432136
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-11-20 22:49:23 +00:00
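The shutdown ordering described in af034b6de4 can be sketched as a small simulation. This is an illustration under stated assumptions, not SPDK code: struct toy_bdev, its claimed_by index, and toy_shutdown_order are hypothetical, and each bdev is assumed to have at most one claimer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a bdev in registration order. */
struct toy_bdev {
    const char *name;
    int claimed_by;   /* index of the bdev that claimed this one, or -1 */
    bool registered;
};

/*
 * Walk the list in reverse registration order and unregister every bdev
 * that is not claimed by a still-registered bdev. Repeat until nothing
 * changes, so skipped bdevs are picked up once their claimer is gone.
 * Writes the unregistration order into `order` and returns its length.
 */
static int toy_shutdown_order(struct toy_bdev *bdevs, int n, int *order)
{
    int count = 0;
    bool progress = true;

    assert(n >= 0);
    while (progress) {
        progress = false;
        for (int i = n - 1; i >= 0; i--) {
            int claimer = bdevs[i].claimed_by;

            if (!bdevs[i].registered) {
                continue;
            }
            if (claimer >= 0 && bdevs[claimer].registered) {
                continue; /* still claimed: skip for now */
            }
            bdevs[i].registered = false;
            order[count++] = i;
            progress = true;
        }
    }
    return count;
}
```

With a vbdev registered before its base bdev, the reverse walk reaches the base first, skips it because it is still claimed, unregisters the vbdev, and only then unregisters the base.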
Piotr Pelplinski
c1f1a876aa bdev: double buffering for unaligned buffers
Now that _spdk_bdev_io_get_buf can allocate aligned buffers,
add the ability to store the original buffer and replace it with
an aligned one for the duration of the I/O.

Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: If0ed306175631613c0f9310dccaae6615364fb49

Reviewed-on: https://review.gerrithub.io/429754
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-11-08 23:11:17 +00:00
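The double-buffering idea in c1f1a876aa can be illustrated with a bounce-buffer sketch. Everything here is hypothetical (struct toy_io, toy_io_prepare, toy_io_complete), and it uses C11 aligned_alloc rather than SPDK's buffer pools: an unaligned caller buffer is swapped for an aligned one while the I/O is in flight and restored on completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical I/O descriptor with a single data buffer. */
struct toy_io {
    void *buf;       /* buffer handed to the device */
    void *orig_buf;  /* caller's buffer while a bounce buffer is in use */
    size_t len;
    bool is_read;
};

/* Swap in an aligned bounce buffer if the caller's buffer is unaligned. */
static int toy_io_prepare(struct toy_io *io, size_t alignment)
{
    size_t padded;
    void *bounce;

    assert(io->len > 0);
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    io->orig_buf = NULL;
    if (((uintptr_t)io->buf & (alignment - 1)) == 0) {
        return 0; /* already aligned: nothing to do */
    }
    /* aligned_alloc expects a size that is a multiple of the alignment. */
    padded = (io->len + alignment - 1) & ~(alignment - 1);
    bounce = aligned_alloc(alignment, padded);
    if (bounce == NULL) {
        return -1;
    }
    if (!io->is_read) {
        memcpy(bounce, io->buf, io->len); /* writes carry data to the device */
    }
    io->orig_buf = io->buf;
    io->buf = bounce;
    return 0;
}

/* Restore the caller's buffer, copying data back for reads. */
static void toy_io_complete(struct toy_io *io)
{
    if (io->orig_buf == NULL) {
        return;
    }
    if (io->is_read) {
        memcpy(io->orig_buf, io->buf, io->len);
    }
    free(io->buf);
    io->buf = io->orig_buf;
    io->orig_buf = NULL;
}
```

The copy direction depends on the I/O type: writes are copied into the bounce buffer before submission, reads are copied out after completion.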
Piotr Pelplinski
f06d3d3bbf bdev: add possibility to allocate aligned buffer in _spdk_bdev_io_get_buf
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I275de0d7d216b15924034d08a6d5c727e3764c82

Reviewed-on: https://review.gerrithub.io/429187
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2018-11-05 22:32:58 +00:00
Piotr Pelplinski
092de1460a bdev: set optimal io boundary to size of large buf pool
This is a requirement for the following patch. Requests that
need a bounce buffer can only allocate a buffer of limited size.

Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I850b614305d66065733381ceb7bd67d4b1cad6b3

Reviewed-on: https://review.gerrithub.io/430783
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-11-05 22:32:58 +00:00
Piotr Pelplinski
85b805f72e bdev: rename need_aligned_buffer to required_alignment
This patch changes the name of the field. Following patches
will introduce logic that guarantees that buffers
provided to a bdev module are aligned to the value
specified in this field.

Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I5329b9fe26ef2417bc7beae86518cc643b263f97

Reviewed-on: https://review.gerrithub.io/430782
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-11-05 22:32:58 +00:00
GangCao
91420344f4 Bdev/QoS: return actual submitted IO count
SPDK poller functions now return different values for
different statuses. Update the QoS poller accordingly.

Change-Id: Ia384bf00a23713df663c317b1997ead441c5adcb
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/428573
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-11-02 18:02:37 +00:00
shahar salzman
1ca4b47252 lib/bdev: reset bdev internal properties
Initialize fields that are later assumed to be NULL

Change-Id: I61e054dd275c6c04fb3f826adc445e56f0add331
Signed-off-by: shahar salzman <shahar.salzman@kaminario.com>
Reviewed-on: https://review.gerrithub.io/428304
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-10-12 22:50:02 +00:00
shahar salzman
3e868ad401 lib: reset globals to allow re-init
Change-Id: I96b5410a92f176aef11e00829fdebd36910ac2d4
Signed-off-by: shahar salzman <shahar.salzman@kaminario.com>
Reviewed-on: https://review.gerrithub.io/428302
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-10-12 22:50:02 +00:00
Shuhei Matsumoto
5616c1ed9c bdev: Change split IOV submission from sequential to batch
Large read I/O will be typical in some use cases such as
web stream services. On the other hand, large write I/O
may not be typical but will be sufficiently probable.

Currently, when a large I/O is submitted to the RAID bdev,
the I/O is divided by the bdev's strip size and the
divided I/Os are submitted sequentially.

This patch tries to improve the performance of the RAID bdev
for large I/Os. Besides, when the RAID bdev supports higher
levels of RAID (such as RAID5), it should issue multiple
I/Os to multiple base bdevs in batch fashion during the parity
update. Having experience with batched I/O will be helpful
in that future case too.

In this patch, submit split I/Os by batch until all child IOVs
are consumed or all data are submitted. If all child IOVs are
consumed before all data are submitted, wait until all batched
split I/Os complete and then submit again.

In this patch, test code is added too.

Change-Id: If6cd81cc0c306e3875a93c39dbe4288723b78937
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/424770
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-10-10 17:19:32 +00:00
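The batching rule in 5616c1ed9c (submit children until the child IOVs run out, wait for the batch to complete, then continue) can be modelled with a toy counter. This is a sketch under strong assumptions: every child I/O is taken to need the same number of IOVs, and toy_batch_rounds and its parameters are hypothetical names, not SPDK symbols.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many submit-then-wait rounds are needed to send num_children
 * child I/Os when each child consumes iovs_per_child entries from a pool
 * of iov_pool child IOVs that is refilled once a whole batch completes.
 */
static int toy_batch_rounds(uint32_t num_children, uint32_t iovs_per_child,
                            uint32_t iov_pool)
{
    uint32_t submitted = 0;
    int rounds = 0;

    assert(iovs_per_child > 0 && iovs_per_child <= iov_pool);
    while (submitted < num_children) {
        uint32_t free_iovs = iov_pool;

        /* Submit children until the pool or the data is exhausted. */
        while (submitted < num_children && free_iovs >= iovs_per_child) {
            free_iovs -= iovs_per_child;
            submitted++;
        }
        rounds++; /* wait here for the whole batch to complete */
    }
    return rounds;
}
```

With a pool of 32 IOVs and 4 IOVs per child, up to 8 children go out per batch, compared with one child per round under sequential submission.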
wuzhouhui
3785a4d83b bdev/qos: fix a heap-use-after-free error
When destroying QoS, spdk_bdev_qos_destroy() allocates a new qos and swaps
it with the old one. After spdk_bdev_unregister() frees the new qos, the old
qos poller may still reference the new qos via bdev->internal.qos. Fix this
error by using the old qos in _spdk_bdev_qos_io_submit().

Reported in 72aac51430.1539054028/ubuntu16.04/build.log

Change-Id: Id1bce6c8b1cefae604dd2c69e8f3482ec34b1b54
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/428444
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-10-09 22:17:03 +00:00
GangCao
868c28cd13 QoS/Bdev: add the RPC support for the bandwidth rate limit
This patch adds RPC support to enable, adjust, and disable
the bandwidth rate limit on the bdev.

And it can work together with the existing IOPS rate limit.

The RPC method has been consolidated to support both IOPS
and bandwidth rate limits as below:

usage:
rpc.py set_bdev_qos_limit [-h]
                          [--rw_ios_per_sec RW_IOS_PER_SEC]
                          [--rw_mbytes_per_sec RW_MBYTES_PER_SEC]
                          name

positional arguments:
  name       Blockdev name to set QoS. Example: Malloc0

optional arguments:
  -h, --help show this help message and exit
  --rw_ios_per_sec RW_IOS_PER_SEC
             R/W IOs per second limit (>=10000, example: 20000).
             0 means unlimited.
  --rw_mbytes_per_sec RW_MBYTES_PER_SEC
             R/W megabytes per second limit (>=10, example: 100).
             0 means unlimited.

Change-Id: I9c03cd635280add01801a81c6a6c02f0cf85bee1
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/416511
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-10-09 18:26:44 +00:00
Shuhei Matsumoto
8a295129e1 bdev: Fix scan build error by checking the limit of parent iovec
scan-build requests to check the size of parent iovec by using
artificially large LBA in unit tests.

Fix the error by tracking a position instead of a pointer and checking
that the position is less than the parent iovec count.

Change-Id: I74c4f6d1b68ecfca93e9247acc5ac6bd5412a960
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427965
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-10-05 21:57:34 +00:00
Pawel Wodkowski
c4fee1e970 mk: don't use '-include spdk/config.h'
Each file that needs to check SPDK_CONFIG_* options needs to include
spdk/config.h explicitly.

Change-Id: If9f2a91ac4c2b1a300dcf88ec3e2a12714ad344a
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/427221
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-10-02 23:13:32 +00:00
GangCao
7d030ef7fc QoS/Bdev: add the QoS related structure and enumeration
This patch is to introduce the specific QoS related structure
and the enumeration for types of QoS rate limits. Later new
types of QoS rate limits can be supported easily.

Change-Id: Idb8d2e7627fd145bf2b0ddb296c968b6b068f48c
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/424459
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-10-02 22:10:13 +00:00
Shuhei Matsumoto
6fa7e38667 bdev: Avoid assert and factor out queue IO operation in _bdev_write_zero_buffer_next
Currently, when a write to the bdev fails in _spdk_bdev_write_zero_buffer_next
for a reason other than -ENOMEM, assert is called.

The RAID bdev that uses this feature is generally available now, so it is
OK to remove this assert and return an error instead.

Additionally, applying the factored function to _bdev_write_zero_buffer_next
will improve readability slightly.

These two changes are done in this patch.

Change-Id: I462630a71e57e2e5146b085b215d62a378ea9402
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427186
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-09-29 03:12:24 +00:00
Shuhei Matsumoto
9d4708f35e bdev: Factor out queueing IO operation in _spdk_bdev_io_split
This patch factors out that operation into a function and adds
error handling to that operation to improve readability slightly.

Change-Id: Ic24df0c0a9abbebc38d30fc17779dc5a5f6138a6
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2018-09-29 03:12:24 +00:00
Shuhei Matsumoto
9872b99206 bdev: Avoid assert when read/write to bdev fails in _spdk_bdev_io_split
Currently, when a read/write to the bdev fails in _spdk_bdev_io_split_with_payload
for a reason other than -ENOMEM, assert is called.

The RAID bdev that utilizes the split IO feature is generally available now,
so it is OK to remove this assert and return an error instead.

Change-Id: I6ea6fd45b94bff0ea84e498e0c4dfd1dd31e0260
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427025
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-09-29 03:12:24 +00:00
Shuhei Matsumoto
0df515a842 bdev: Remove limitation of child iov size in bdev_io_split_with_payload()
When a bdev IO is split, if the iovec count in a strip is more than 32,
the IO will fail.

Remove the limitation by splitting the split IO further.

Change-Id: I962ad86dfe63ea1fcd86ffa52ead7452fb80e53d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/425876
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2018-09-29 03:12:24 +00:00
Dariusz Stojaczyk
1e4f9974a7 bdev: do not finish uninitialized modules
To achieve its goal, this patch changes the order
in which bdev modules are finished. All modules
that examine bdevs (e.g. lvol,split,...) will be now
finished last. It should not cause any issues though,
since all bdevs are already removed at the time when
any module finish is called

Fixes #387

Change-Id: Id60c375eb5c3d7306b69cdce86bded77354868d8
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/421158
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-09-26 21:11:51 +00:00
Shuhei Matsumoto
bc3dfe3043 bdev: Fix the second parameter success of spdk_bdev_io_completion_cb
The type of the second parameter `success' of spdk_bdev_io_completion_cb
is bool. Hence change the code to pass a bool value for success or failure.

Change-Id: I9e93f4ccbb085e8e184f209e706915dcd34aa966
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/426648
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-09-26 09:03:15 +00:00
Jim Harris
01e7c02e15 bdev: call spdk_bdev_io_get_buf before splitting
We cannot split an iov if a buffer hasn't been
allocated yet.  So always call spdk_bdev_io_get_buf
on reads before trying to split.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2c26efb9dc6cb2c7c3e3b7ae5bab2c37844b9113

Reviewed-on: https://review.gerrithub.io/424879
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-09-07 14:59:21 +00:00
Jim Harris
de4d961996 bdev: set iovs on correct bdev_io in spdk_bdev_io_put_buf
spdk_bdev_io_put_buf() is responsible for reclaiming
bdev-allocated buffers from a bdev_io.  If there are
bdev_ios waiting for one of these buffers, it calls
spdk_bdev_io_set_buf() on the next bdev_io in the queue.
This will set the iov_base and iov_len on the bdev_io
to point to the bdev-allocated buffer.

But spdk_bdev_io_put_buf() was calling spdk_bdev_io_set_buf()
on the just completed bdev_io, not the next bdev_io in the
queue.  So fix that.

Fixes: 844aedf8 ("bdev: Simplify get/set/put buf functions")
Reported-by: Alan Tu
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibbcad6e35a3db6991bd7deb3516229572f021638

Reviewed-on: https://review.gerrithub.io/424880
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-09-06 23:42:32 +00:00
wuzhouhui
f118de60af bdev: fix race condition between spdk_bdev_close and _remove_notify
When a new bdev is created, spdk_bdev_module::examine_disk() may open
and close the bdev. On the other hand, if something goes wrong, the
creation procedure may unregister the newly created bdev, so a race
condition appears between _remove_notify() and spdk_bdev_close().

Add the new fields "closed" and "remove_notified" to struct spdk_bdev_desc,
so that _remove_notify() and spdk_bdev_close() know how to deal with this
situation.

Change-Id: Ibfe915a4d76096796b039a13a4f49f26669eba2c
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/423369
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-09-06 20:59:05 +00:00
Ben Walker
898739fbac bdev: Enforce that spdk_bdev_close() is called on same thread as open()
spdk_bdev_close() must be called on the same thread as
spdk_bdev_open(). Further, the remove callback on the
descriptor will also be run on the same thread as
spdk_bdev_open().

Change-Id: I949d6dd67de1e63d39f06944d473e4aa7134111b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424738
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2018-09-06 20:59:05 +00:00
Jim Harris
73c8b61cac bdev: free bdev_io in write_zeroes emulation
When emulating write_zeroes commands on devices that
don't natively support it, we submit a write with
a zeroed buffer.  We used to just reuse the original
bdev_io, but that was recently changed due to other
splitting code added for iovs.  But when making those
changes, we forgot to free the bdev_io for the
write that was sent down to the device.

Fixes: 183f37e8 (bdev: do not reuse bdev_io when...)

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If08782c65f6305c0a9f9d15d74fd8823e1158e9b

Reviewed-on: https://review.gerrithub.io/424733
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-09-06 20:42:47 +00:00
Jim Harris
ba38785b7e bdev: remove get_bdevs_config RPC
This RPC does not work for a lot of bdev types.  For
example, NVMe namespaces and virtio scsi LUNs are not
explicitly constructed by an RPC - they are indirectly
constructed by an RPC associated with an NVMe controller
or virtio-scsi controller.

While here, remove spdk_bdev_config_json.  It was
only created to facilitate this get_bdevs_config RPC.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I537166d8f91ab458bd2000859d74f7254bfc9c0a

Reviewed-on: https://review.gerrithub.io/424584
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-09-05 19:46:17 +00:00
Jim Harris
30dc6a1893 bdev: don't output "name" when write_config_json not specified
This isn't valid RPC so it needs to be removed.  Bdev modules were
working around this issue by defining empty write_config_json
methods.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3c4c20249eadfcfb4103430f5801190b14897249

Reviewed-on: https://review.gerrithub.io/424582
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2018-09-05 19:46:17 +00:00
Jim Harris
82c3c30f44 trace: remove alias concept
This was added a long time back for tracking an rte_mbuf
whose buffer was a different rte_mbuf - all related to
a userspace TCP stack that is no longer in development.
The concept isn't useful now, so remove it to reduce
the complexity of the tracing code.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I310e492eba7f55df242bb29d82fb19f6daee1f51

Reviewed-on: https://review.gerrithub.io/424565
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-09-05 18:03:43 +00:00
Ben Walker
c94020001a thread: Add a name parameter to spdk_register_io_device
This is a string name used for debugging only.

Change-Id: I9827f0e6c83be7bc13951c7b5f0951ce6c2a1ece
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424127
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-09-05 16:00:54 +00:00
Jim Harris
afaabcce23 bdev: add trace points
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I37e7f2fb19fecfe5933b4815d24240954b74b62b

Reviewed-on: https://review.gerrithub.io/424278
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-09-04 17:09:25 +00:00
Jim Harris
86d77e2eb6 bdev: account for missed qos timeslice timeouts
There could be cases (especially in virtualized and/or test
environments) where we could accumulate significant skew in
the timeslice frequency.  Rather than depend on the application
framework to try to guarantee the rate of timeslice poller
callbacks, keep track internally of the last time the poller
was invoked.  If/when we accumulate and detect skew equivalent
to one or more timeslices, increase the allowed IO and bandwidth
of the next timeslice to accommodate.

Since bdev poller now calls spdk_get_ticks() to do accounting,
this patch also fixes up the increment_time() unit test function
and the test env layer to properly increment the fake TSC.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iba301ddc0fb3d02042106a8bf6e4a6a9a84dc263

Reviewed-on: https://review.gerrithub.io/423580
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
2018-09-02 23:59:27 +00:00
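A minimal sketch of the skew accounting described in 86d77e2eb6, combined with the count-down style of tracking from 5443e0aed3. The struct and function names are hypothetical, only the IO quota is modelled (not bandwidth), and the rule that unused quota is dropped while an overrun carries over is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-bdev QoS state; times are in timestamp-counter ticks. */
struct toy_qos {
    uint64_t timeslice_ticks;        /* length of one timeslice */
    uint64_t last_timeslice;         /* tick at which the current timeslice began */
    int64_t  max_ios_per_timeslice;  /* quota granted per timeslice */
    int64_t  ios_remaining;          /* counts down as I/O is submitted */
};

/* Poller body: `now` would come from a tick source such as spdk_get_ticks(). */
static void toy_qos_poll(struct toy_qos *qos, uint64_t now)
{
    assert(qos->timeslice_ticks > 0);
    if (now < qos->last_timeslice + qos->timeslice_ticks) {
        return; /* still inside the current timeslice */
    }
    /* Assumption: unused quota is dropped, an overrun (negative) is kept. */
    if (qos->ios_remaining > 0) {
        qos->ios_remaining = 0;
    }
    /* Grant one quota per elapsed timeslice, including any missed ones. */
    while (now >= qos->last_timeslice + qos->timeslice_ticks) {
        qos->last_timeslice += qos->timeslice_ticks;
        qos->ios_remaining += qos->max_ios_per_timeslice;
    }
}
```

If the poller is late by three timeslices, the next timeslice is granted three quotas, so the long-run rate stays close to the configured limit.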
Jim Harris
4f860d7e40 bdev: only apply split_on_optimal_io_boundary to R/W
Splitting a 1TB unmap into individual 64KB unmap commands
(for a RAID volume with 64KB strip size) would be awful -
the RAID module can be much smarter about this.

So back out the changes for splitting I/O without payload.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I24fe6d911f4e3c9db4b2cb5d66c7236a5596e0d9

Reviewed-on: https://review.gerrithub.io/424103
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2018-08-31 17:53:42 +00:00
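The rule in 4f860d7e40 (split on the optimal I/O boundary only for reads and writes) can be sketched as follows. The enum, struct, and function names are hypothetical, offsets and lengths are in blocks, and the caller supplies the output array; the real bdev layer tracks split state inside the bdev_io instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_io_type { TOY_IO_READ, TOY_IO_WRITE, TOY_IO_UNMAP };

struct toy_child {
    uint64_t offset_blocks;
    uint64_t num_blocks;
};

/* Only reads and writes that cross a boundary are candidates for splitting. */
static bool toy_should_split(enum toy_io_type type, uint64_t offset,
                             uint64_t num, uint32_t boundary)
{
    if (type != TOY_IO_READ && type != TOY_IO_WRITE) {
        return false; /* e.g. unmap is left for the module to handle */
    }
    if (boundary == 0 || num == 0) {
        return false;
    }
    return offset / boundary != (offset + num - 1) / boundary;
}

/* Cut [offset, offset + num) into children that never cross a boundary. */
static int toy_split(uint64_t offset, uint64_t num, uint32_t boundary,
                     struct toy_child *out, int max_children)
{
    int n = 0;

    assert(boundary != 0);
    while (num > 0 && n < max_children) {
        uint64_t to_boundary = boundary - (offset % boundary);
        uint64_t len = num < to_boundary ? num : to_boundary;

        out[n].offset_blocks = offset;
        out[n].num_blocks = len;
        n++;
        offset += len;
        num -= len;
    }
    return n;
}
```

For a 64KB strip on 512-byte blocks the boundary is 128 blocks, so a read of 60 blocks starting at block 100 becomes two children: 28 blocks at 100 and 32 blocks at 128.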
Jim Harris
183f37e8ad bdev: do not reuse bdev_io when splitting write_zeroes
Now that we split on I/O boundaries, that code needs to
be able to use the bdev_io split* members to track
what is left to submit.  This means that the write_zeroes
code cannot submit the parent bdev_io as the child bdev_io,
since the I/O boundary code will overwrite the write_zeroes
split accounting.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9316b59267508f60799766fc4f1ea05a4b3e5d9e

Reviewed-on: https://review.gerrithub.io/423404
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-08-29 16:42:53 +00:00
Jim Harris
5443e0aed3 bdev: count down for qos tracking
This will simplify some future patches which will
account for missed timeslice timers by allowing
additional IO/BW in the following timeslice.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9dd46a768c98ce267c733a9f9719a2d3d2c3c915

Reviewed-on: https://review.gerrithub.io/423579
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-08-29 16:42:39 +00:00
Ben Walker
e6bbe23277 util: Move definition of SPDK_SEC_TO_USEC to util.h
This was defined in two places, so consolidate
the definitions.

Change-Id: I0bbb262b97e90d1064bcc50ee201928f6ca9518a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/423182
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-08-27 18:44:51 +00:00
Chen Wang
6fa48bbf62 lib: fix typos in the lib directory
Change-Id: Idcb60b79d2902bb316facc6f60e0a81e5cf847ed
Signed-off-by: Chen Wang <chenx.wang@intel.com>
Reviewed-on: https://review.gerrithub.io/423372
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2018-08-24 17:15:12 +00:00
Jim Harris
4bd9762165 bdev: add split_on_optimal_io_boundary
A number of modules (RAID, logical volumes) have logical
"stripes" that require splitting an I/O into several
child I/O.  For example, on a RAID-0 with 128KB strip size,
an I/O that spans a 128KB boundary will require sending
one I/O for the portion that comes before the boundary to
one member disk, and another I/O for the portion that comes
after the boundary to another member disk.  Logical volumes
are similar - data is allocated in clusters, so an I/O that
spans a cluster boundary may need to be split since the
clusters may not be contiguous on disk.
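
The boundary arithmetic described above can be sketched as follows.
This is an illustration only, not the bdev layer's implementation:
child_len and split_io are hypothetical helpers, offsets and lengths
are in blocks, and the child ranges are simply recorded in
caller-provided arrays instead of being submitted.

```c
#include <assert.h>
#include <stdint.h>

/* Blocks from offset up to the next boundary, capped at what remains. */
static uint64_t
child_len(uint64_t offset, uint64_t remaining, uint64_t boundary)
{
	uint64_t to_boundary;

	assert(boundary > 0);
	to_boundary = boundary - (offset % boundary);
	return remaining < to_boundary ? remaining : to_boundary;
}

/*
 * Split [offset, offset + num_blocks) so that no child crosses a multiple
 * of boundary. Returns the number of children, or -1 if max_children is
 * too small to hold them all.
 */
static int
split_io(uint64_t offset, uint64_t num_blocks, uint64_t boundary,
	 uint64_t *child_off, uint64_t *child_cnt, int max_children)
{
	int n = 0;

	while (num_blocks > 0) {
		uint64_t len;

		if (n == max_children) {
			return -1;
		}
		len = child_len(offset, num_blocks, boundary);
		child_off[n] = offset;
		child_cnt[n] = len;
		n++;
		offset += len;
		num_blocks -= len;
	}
	return n;
}
```

With 512-byte blocks, a 128KB strip is 256 blocks, so an I/O of 100
blocks starting at block 200 becomes two children: 56 blocks at offset
200 and 44 blocks at offset 256.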

Putting the splitting logic in the common bdev layer ensures
bdev module authors don't have to always do this themselves.
This is especially helpful for cases like splitting an I/O
described by many iovs - we can simplify this a lot by
handling it in the common bdev layer.

Note that currently we will only submit one child I/O
at a time.  This could be improved later to submit multiple
child I/O in parallel, but that would also add considerable
complexity to the iov splitting code.

Note: Some Intel NVMe SSDs have a similar characteristic.
We will not use this bdev stripe feature for NVMe though -
we want to primarily use the splitting functionality inside
of the NVMe driver itself to ensure it remains fully
functional.  Many SPDK users use the NVMe driver without
the bdev layer.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ife804ecc56f6b2b55345a0d0ae9fda9e68632b3b

Reviewed-on: https://review.gerrithub.io/423024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-08-23 16:08:21 +00:00
Seth Howell
b7d9caf2e6 bdev: increment io_time if queue depth > 0
This value is used to calculate the disk utilization of a given bdev.
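
As a rough sketch of how such a counter can yield utilization (names
and the periodic sampling model are assumptions, not the SPDK code):
io_time grows by one period whenever the queue is non-empty at the
sample point, and utilization is the busy time over the elapsed time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-bdev statistics; not the real SPDK structure. */
struct bdev_stat_sketch {
	uint64_t io_time_us;  /* accumulated time with I/O outstanding */
	uint64_t queue_depth; /* I/O outstanding at the sample point */
};

/* Called once per sampling period. */
static void
sample_period(struct bdev_stat_sketch *stat, uint64_t period_us)
{
	assert(stat != NULL);
	if (stat->queue_depth > 0) {
		stat->io_time_us += period_us;
	}
}

/* Fraction of elapsed time during which the bdev had I/O outstanding. */
static double
utilization(const struct bdev_stat_sketch *stat, uint64_t elapsed_us)
{
	return elapsed_us ? (double)stat->io_time_us / (double)elapsed_us : 0.0;
}
```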

Change-Id: I4bf101c524b92bdd21573941e17f61db59c5c6b8
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/423017
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-08-22 20:34:09 +00:00
Jim Harris
9f2dd0c4f8 bdev: save the bdev_desc specified when submitting the I/O
This will be needed for using this same descriptor when
splitting an I/O.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idec759df7ab27f8de567d3c8a4214e25dbe173f5

Reviewed-on: https://review.gerrithub.io/423022
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-08-22 16:29:58 +00:00
wuzhouhui
6deac3e660 bdev/lvol: using spdk_bdev_alias_del_all() to delete all alias on destroy
This way we don't need to allocate memory (which may fail) just to free
other memory.

Change-Id: I2c83f6acc2aa6ed79455bff90f952a2e70b44d59
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/422203
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2018-08-21 00:53:31 +00:00
Jim Harris
671b77e5cd bdev: remove vbdevs and base_bdevs arrays
The intent of these arrays was to keep track in the
bdev layer of all base<->virtual bdev relationships -
i.e. which member disk bdevs make up a RAID bdev,
which logical volume bdevs are associated with a
bdev that contains an lvolstore, etc.

Currently none of this is used however.  And trying
to keep track in the bdev layer instead of asking
the bdev modules for the relationships has a number
of complications.  Early on, we tried to do this
with TAILQs - but that doesn't work since this can't
be done with a single TAILQ_ENTRY in the bdev
structures.  So we moved to arrays - that works a bit
better, but then the pointer arrays have to be
realloc'd which isn't ideal.

The biggest problem though with these arrays is that
they held bdev pointers - not bdev descriptor pointers.
It's not really valid to access bdevs without a
descriptor - the descriptors are what make sure active
references are accounted for when a bdev is hotplugged.
Of course the bdev layer knows when a bdev is getting
removed and could go and do the updates to these
arrays separately - but that just seems very convoluted.

So for now just remove these arrays completely.  If
there is a future need for the bdev layer to
understand relationships between bdevs, we can add
module APIs so that the generic layer can ask
the modules about the relationships.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I99ef1068240bff1262f64f234260cf2fb44df51d
Reviewed-on: https://review.gerrithub.io/420932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
2018-08-17 00:11:03 +00:00
Jim Harris
01035cd49f bdev: cleanup registered bdevs in reverse order
This ensures (for example) that a RAID volume is
unregistered before its member disks.
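
A small illustration of why reverse order helps, assuming bdevs are
registered bottom-up (member disks first, then the volume built on
them). This sketch walks a plain array rather than the bdev layer's
actual list, and unregister_all_reverse is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Walk the registration order backwards and record the order in which
 * bdevs would be unregistered. Returns the number of entries written.
 */
static size_t
unregister_all_reverse(const char *const *registered, size_t count,
		       const char **order_out)
{
	size_t i;

	assert(registered != NULL || count == 0);
	for (i = 0; i < count; i++) {
		/* The most recently registered bdev goes first. */
		order_out[i] = registered[count - 1 - i];
	}
	return count;
}
```

For a registration order of disk0, disk1, raid0, the resulting order
is raid0, disk1, disk0, so the volume is torn down while its members
still exist.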

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7a7c16acc351f2d5d4218b64b370e2c77c6e2b5e
Reviewed-on: https://review.gerrithub.io/420812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-08-17 00:11:03 +00:00
Jim Harris
c899854d03 bdev: add new fini_start notification callback for modules
When an SPDK application shuts down, the bdev layer will
automatically unregister all of the bdevs to ensure they
are properly quiesced and cleaned up.

Some modules may want to perform different operations when
a bdev is destructed during normal runtime vs. shutdown.
For example, for lvol, when the last lvol is cleaned up,
it should unload the lvolstore, release and close the bdev
that contains the lvolstore.  You never want to do this
during normal runtime though - it is perfectly valid to
have an lvolstore that contains no lvols.  RAID and future
bdev modules such as multipath have similar use cases.

So add a new bdev module callback named "fini_start".
If a module specifies a function pointer for this callback,
the bdev layer will call it before it starts the bdev
unregistrations.
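
The optional-callback pattern described above can be sketched like
this. The struct and function names are hypothetical stand-ins, not
the real bdev module interface; the only part taken from this message
is that a module may leave fini_start unset and that the callback runs
before any unregistration begins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bdev module descriptor. */
struct bdev_module_sketch {
	const char *name;
	void (*fini_start)(void); /* optional; NULL if the module has no use for it */
};

/* Example module state: lets lvol tell shutdown apart from normal runtime. */
static int g_lvol_in_shutdown;

static void
lvol_fini_start(void)
{
	g_lvol_in_shutdown = 1;
}

/*
 * Notify every module that shutdown is starting. Meant to run before the
 * generic layer unregisters any bdev. Returns how many callbacks ran.
 */
static int
notify_fini_start(const struct bdev_module_sketch *modules, size_t count)
{
	int called = 0;
	size_t i;

	assert(modules != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (modules[i].fini_start != NULL) {
			modules[i].fini_start();
			called++;
		}
	}
	return called;
}
```

A module's destruct path could then check a flag such as
g_lvol_in_shutdown to decide whether removing the last lvol should
also unload the lvolstore.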

This enables some future patches to the bdev layer such
that it will always unregister block devices that are not
claimed (e.g. logical volumes) before block devices that
are claimed (e.g. the bdev containing an lvolstore).

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6e87f5c2b27f16731ea5def858f26e882a29495a

Reviewed-on: https://review.gerrithub.io/421175
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-08-15 23:32:04 +00:00
Changpeng Liu
ff458be850 nvmf: claim each bdev when constructing new Namespace
Claim the block device when adding it to a new Namespace,
and prevent the block device from being added twice by other
modules and Namespaces.  Also remove the test that uses the
same block device across different Namespaces.

Fix issue #371.

Change-Id: Ib7ce18e9fde4a15c0f19ce9e28e69145e54570e0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/420472
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-08-06 21:14:37 +00:00