Commit Graph

88 Commits

Author SHA1 Message Date
Daniel Verkamp
7fd3a82561 bdev: rearrange struct spdk_bdev_io
Move the commonly-accessed fields to the front so they end up in the
same cache line where possible.

Also tweak the types of type, status, error.nvme.sct, error.nvme.sc,
error.scsi.sc, and error.scsi.sk (they can fit in 8 bits), and move the
Write Zeroes splitting variables into u.bdev.

This reduces sizeof(struct spdk_bdev_io) from 272 to 224, in addition to
the better cache line usage.

Change-Id: I4a91fd07f252e7add4a2db179df9c53268672198
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/404053
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-03-16 14:06:36 -04:00
Daniel Verkamp
d4ef1338b0 bdev: make QoS channel management thread safe
We must hold bdev->mutex around all QoS channel manipulations, not just
channel_count; otherwise, there are race conditions.

Change-Id: I6183aef83f4d5789bded426a1832e3faaa688363
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/403367
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-03-13 14:06:23 -04:00
Daniel Verkamp
19100ed580 bdev: rename spdk_bdev_module_if -> spdk_bdev_module
This better matches the style in the rest of SPDK.

No functional change - this is a pure find/replace of
spdk_bdev_module_if to spdk_bdev_module.  Instances of this struct will
be renamed in another patch.

Change-Id: I3f6933c8a366e625fc3a1b6401aee26ee03ba69c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/403368
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-03-13 00:55:12 -04:00
GangCao
a2142f3a33 bdev/qos: add the QoS rate limiting support on bdev
This patch is to add the basic support of QoS on bdev.

Including two major functionalities:

1. The QoS rate limiting algorithm:
	a. New IO will be always queued first also under
	   the no memory condition
	b. Start the QoS IO operation based on the limit
	c. A poller started in each millisecond to reset
	   the rate limit and send new IOs down
	d. The rate limit is based on the millisecond and
	   converted from user configurable IOsPerSecond

2. The Master Thread management:
	a. Add a per bdev channel_count
	b. Whenever QoS is enabled on bdev, if QoS bdev
	   channel is not created, create the QoS bdev
	   channel and assign the QoS thread
	c. When new IOs coming from different channels
	   (threads), pass the IOs to the QoS bdev channel
	   through the thread event
	d. When the IOs are completed from the QoS bdev
	   channel, pass the IOs back to its orignal
	   channel(thread)
	e. Destroy the QoS bdev channel when it is the
	   last bdev channel for this bdev. Defer the
	   destruction if current thread is not QoS thread

Change-Id: Ie4444551d7c3c7de52f6513c9db926628796adb4
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/393136
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-03-09 13:48:21 -05:00
Pawel Wodkowski
4d36735401 bdev: rework bdev module registration
Currently SPDK_BDEV_MODULE_REGISTER() take many parameters. Extending it
(eg for incoming JSON configuration dump/load) is quite challenging and
error prone. As we are already here in next patches, rework this macro
to take one parameter - the pointer to struct spdk_bdev_module_if.

This patch also remove following macros:
SPDK_GET_BDEV_MODULE - this is not really needed, to find module outside
module translation unit use spdk_bdev_module_list_find()

SPDK_BDEV_MODULE_ASYNC_INIT and SPDK_BDEV_MODULE_ASYNC_FINI - replaced
by bool fields in spdk_bdev_module_if struct.

Change-Id: Ief88e023fbbaee7d5402c838dbecbdffd4dfb259
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/402883
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2018-03-09 12:07:35 -05:00
Daniel Verkamp
364d4fdfe0 bdev: add spdk_bdev_get_uuid() function
Add a generic way to get a UUID from a bdev.

For now, malloc and null bdevs generate random UUIDs, and no other bdev
types report a UUID.

Change-Id: Id9608c8c1b3ce3f1783e7f74bef96d44cd5d98a7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/402177
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2018-03-08 10:49:51 -05:00
Daniel Verkamp
8a6ba58cb4 scripts/check_format: check for spaces before tabs
Automatically detect more whitespace errors.

All existing cases are fixed; only whitespace change (verify with
diff -w) except for one comment style fixup in include/spdk/nvme.h.

Change-Id: If750e54b9c8e3421ea6feda5f20184a31431631e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/402360
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2018-03-05 11:09:13 -05:00
Pawel Wodkowski
2939b715d0 bdev: rename 'dump_config_json' to 'dump_info_json'
Unfortunatly not all bdevs produce its configuration in responce to
get_bdevs RPC call (eg nvme is producing tons of additional
informations). To not breake any existing scripts rename
'dump_config_json' to 'dump_info_json' instead of reworking those
callbacks. Next patches will introduce real 'dump_config_json' handlers
and API

Change-Id: If9c1a4ab864791b24a5f7d022e970cd65990ffc0
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/401216
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2018-02-26 15:49:51 -05:00
Isaac Otsiabah
a4f3920d51 bdev: Added latency to channel statistics
Modified include/spdk/bdev.h and include/spdk_internal/bdev.h
add data members to capture statistics information. Modified
lib/bdev/bdev.c to calculate read/write latency.

Change-Id: Idcd55dd2e88c4b308e016f16ced53720256c79e3
Signed-off-by: Isaac Otsiabah <iotsiabah@us.fujitsu.com>
Reviewed-on: https://review.gerrithub.io/390654
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2018-01-30 12:41:41 -05:00
Slawomir Mrozowicz
d4822a7db5 bdev: Add bdev resize function
Add api and unit test functions for
change number of blocks for provided block device.

Change-Id: I55d67c99375cb88bdaa79ce1a36d4298223beddc
Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Reviewed-on: https://review.gerrithub.io/390802
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-01-26 14:45:08 -05:00
Tomasz Zawadzki
b6aaba0852 bdev: remove vbdevs during spdk_bdev_unregister()
spdk_vbdev_unregister() is part of internal bdev API,
yet bdev module that uses spdk_vbdev_register() directly
will not be removed correctly when using delete_bdev RPC.
spdk_vbdev_unregister() is now consolidated with
spdk_bdev_unregister().

This comes up when deleting lvol bdev, as it does not use
spdk_bdev_part_* functions.
base_bdev->vbdevs entry was not removed for bdev that lvs
is created on.

Additionally patch expands test to create lvol bdev,
after removing it using delete_bdev RPC.
With ASAN enabled this would report accessing
already freed memory previously.

Change-Id: I9547e83862e2daa50355d56a1c9f453aaa6cfdb8
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/395711
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-01-24 15:50:04 -05:00
Jim Harris
b5f4a259bd bdev: save mgmt_ch that spdk_bdev_io was allocated from
This avoids having to dereference the spdk_bdev_io's
channel in the spdk_bdev_free_io() path.

Cleaning up after hotplug events should ensure that
all associated bdev_ios have been freed (not just
completed) before the bdev channels have been freed,
but this patch gives us some more wiggle room in this
area.

Results in a small (1-2%) performance degradation on
a bdevperf microbenchmark, but should result in no
noticeable difference on any real world workload.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id8b88300fc53e8c0b83309a738a4c3bd2aeaff52

Reviewed-on: https://review.gerrithub.io/394399
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
2018-01-12 13:27:20 -05:00
Jim Harris
cd55b3886e bdev: change spdk_bdev_io buf_link to STAILQ
This is more efficient and buf_link users don't
need the extra flexibility that TAILQ provides.

link could also be changed to an STAILQ, but its usage
is not in the performance path so not touching that
as part of this commit.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7df2cc24a3784add8370db859003783e92cbfc21

Reviewed-on: https://review.gerrithub.io/393834
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2018-01-10 09:51:31 -05:00
Sebastian Basierski
3f41a8e506 bdev: Return aliases list through get_bdevs
Change-Id: Ic0cdcf088ebd5053f2e69ad2e607ee825d96fcb6
Signed-off-by: Sebastian Basierski <sebastianx.basierski@intel.com>
Reviewed-on: https://review.gerrithub.io/390202
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2018-01-04 13:15:09 -05:00
Sebastian Basierski
b9afa3c732 bdev: Added bdev aliases list.
Added aliases list to bdev struct.
Added 2 API calls to add and remove aliases.
Added test for adding and removing aliases.

Change-Id: I1815aec8c02cfa398b2d1de41577197315665fdc
Signed-off-by: Sebastian Basierski <sebastianx.basierski@intel.com>
Reviewed-on: https://review.gerrithub.io/390200
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2018-01-04 13:15:09 -05:00
Daniel Verkamp
453f5ae9f6 bdev: unregister all bdevs in spdk_bdev_finish()
Instead of requiring each bdev module to track its own bdevs and clean
them up during its fini callback, we can walk the list of registered bdevs
during spdk_bdev_finish() and call spdk_bdev_unregister() on each one of
them before cleaning up the bdev modules.

Change-Id: I01816707c9100f66f542bfd73b90bcb0e0fb0c0c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/389878
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-21 13:29:29 -05:00
Jim Harris
f1f14e5583 bdev: pass mgmt_channel to spdk_bdev_get_io()
This prepares for some upcoming changes which will
add a per-thread bdev_io cache.

While here, remove spdk_bdev_get_io() from the
internal bdev API.  This function is not meant
to be called outside of bdev.c.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f764a88a079fac936931c46d615999454013732

Reviewed-on: https://review.gerrithub.io/392529
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 10:42:47 -05:00
Ziye Yang
6802fe99f4 bdev: handling duplicated bdev name
Change the return type of spdk_bdev_register related
functions and try to handle the duplicated name
issue.

Change-Id: I23af11583cf2050579d1624508306a35394bffde
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/388178
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-11-28 16:20:30 -05:00
Ben Walker
44770c2a11 bdev: Remove poller abstraction
Use the new one from io_channel.h.

Change-Id: I7bf6729caf6eeebcb58450a36119601957ad5da4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/388290
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-11-28 15:29:35 -05:00
Young Tack Jin
d0a4c8e0a9 bdev/nvme: support meta data on vendor specific commands
spdk_bdev_nvme_io_passthru_md() is verified on QEMU NVMe
and will be verified on Cosmos mini OpenSSD.

Change-Id: Ib759b6b6095deaa4ae7cf746f3a241f678295605
Signed-off-by: Young Tack Jin <youngtack.jin@circuitblvd.com>
Reviewed-on: https://review.gerrithub.io/387114
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-11-20 19:26:48 -05:00
Tomasz Zawadzki
6c54c13cd4 event/subsystem/bdev: asynchronous SPDK finish
First this change moves spdk_subsystem_fini() to trigger on
spdk_app_stop(). This ensures that spdk_subsystem_fini() is called
before reactors are stopped in spdk_reactors_stop().

Finish paths for subsystems, bdevs and copy engine is now
asynchronous.
Each of those three mentioned have to make sure they are
asynchronous as well.

Only bdev that currently has requirement for asynchronous finish
are logical volume.
Thus the change in vbdev_lvol.c making it move to next bdev module
only after all lvol stores were unloaded.

Fio_plugin finish of bdev and copy_engine was removed for now.
Next patch in series adds it back with async support.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I80ee2d084f3d82c50bf1329e08996604ae61b1b3
Reviewed-on: https://review.gerrithub.io/381536
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-10-27 13:03:55 -04:00
Maciej Szwed
2baeea7dd4 bdev: add callback to spdk bdev unregister and bdev destruct
Currently deleting bdev does not support asynchronous delete
operations. Because of that results are returned before device
is actually deleted and some operation can be peformed on that
device after removal of this device started.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I305c302d8abd5d7c2c0f947fca70c58396872132
Reviewed-on: https://review.gerrithub.io/383732
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-10-26 17:23:58 -04:00
Seth Howell
3fd7f28dc9 bdev: add fallback from write_zeroes to writev
if write_zeroes is not supported by the block device, we can get the
same behavior by simply writing a buffer full of zeroes to the blocks
we want to erase. I also incorporate splitting into the bdev layer to
accomodate large i/o.

Change-Id: I8fa1bfaaf22d7bfc6e3afb6e89d22fa9f7767e55
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/373829
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-10-23 11:50:16 -04:00
Daniel Verkamp
8eef5183d3 bdev: remove spdk_bdev_poller_start() lcore option
Always start bdev pollers on the calling core.

This removes the lcore concept from the bdev poller abstraction and
simplifies the job of spdk_bdev_initialize() callers providing their own
poller and event implementations.

All callers except the NVMe bdev hotplug poller already used the current
core as the parameter.  The NVMe HotplugPollCore option was undocumented
and unused in any of the tests or example configuration files, so it
should be safe to remove.

Change-Id: I93b466e1e58901b8785c40cbe296fa46c157850f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/382857
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-10-18 20:28:29 -04:00
Dariusz Stojaczyk
13017493af bdev: added spdk_bdev_io_get_thread()
This is required for virtio-initiator,
where multiple io_channels have to
share the same virtio queue. A single
poller will receive responses from
a virtio queue and send the completions
to the thread that submitted bdev_io.

Change-Id: I951c7655aaa17d41a680d437661afff27d2c3077
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/382200
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-10-17 15:21:09 -04:00
Maciej Szwed
e46404f9cb bdev: remove bdev_opened field from spdk_bdev structure
bdev_opened field in spdk_bdev structure is no longer required.


Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ie4a368425b11b1c2e1a3a48b5858857b3935498b
Reviewed-on: https://review.gerrithub.io/381375
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-10-11 12:13:38 -04:00
Jim Harris
5fb8722899 bdev: add callback to free part_base
This puts responsibility on the caller to free the buffer
and do any other cleanup associated with the partition base
(for example, GPT buffer).

While here, also clean up a bunch of places where on
various failures during initialization, it would just
free the buffer instead of calling spdk_bdev_part_base_free().
The latter is required to make sure the bdev descriptor
gets closed as well.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic000339459eca4a4a1d103da2e1f3feffe7e764f
Reviewed-on: https://review.gerrithub.io/378653
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2017-10-01 21:57:00 -04:00
Jim Harris
94bc8cfdba bdev: add ENOMEM handling
At very high queue depths, bdev modules may not have enough
internal resources to track all of the incoming I/O.  For example,
we allocate a finite number of nvme_request objects per allocated
queue pair.  Currently if these resources are exhausted, the
bdev module will return failure (with no indication why) which
gets propagated all the way back to the application.

So instead, add SPDK_BDEV_IO_STATUS_NOMEM to allow bdev modules
to indicate this type of failure.  Also add handling for this
status type in the generic bdev layer, involving queuing these
I/O for later retry after other I/O on the failing channel have
completed.

This does place an expectation on the bdev module that these
internal resources are allocated per io_channel.  Otherwise we
cannot guarantee forward progress solely on reception of
completions.  For example, without this guarantee, a bdev
module could theoretically return ENOMEM even if there were
no I/O oustanding for that io_channel.  nvme, aio, rbd,
virtio and null drivers comply with this expectation already.
malloc only complies though when not using copy offload.

This patch will fix malloc w/ copy engine to at least
return ENOMEM when no copy descriptors are available.  If the
condition above occurs, I/O waiting for resources will get
failed as part of a subsequent reset which matches the
behavior it has today.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iea7cd51a611af8abe882794d0b2361fdbb74e84e

Reviewed-on: https://review.gerrithub.io/378853
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2017-10-01 21:57:00 -04:00
Dariusz Stojaczyk
04124278b9 bdev: assert against bdev_io buffer overflow
Also updated docs.

Change-Id: I2e53c31b2c9c575d8adea23ed92f113e69f324ed
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/380490
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-09-28 17:08:34 -04:00
Dariusz Stojaczyk
3595fc5f40 bdev: allow allocating custom-length buffers for bdev_io
This is required for upcoming UNMAP
implementation in bdev_virtio.

While here, also added documentation for
spdk_bdev_io_get_buf().

Change-Id: Ia769ee9b8b132f31208ae66598b29a1c9ed37312
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/379721
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-09-28 17:08:34 -04:00
Jim Harris
1f935c7a9b bdev: properly handle aborted resets
Previously, we naively assumed that a completed
reset was the reset in progress, and would
unilaterally set reset_in_progres to false.

So change reset_in_progress to a bdev_io pointer
instead.  If this is not NULL, a reset is not in
progress.  Then when a reset completes, we only
set the reset_in_progress pointer to NULL if we
are completing the reset that is in progress.

We also were not aborting queued resets when
destroying a channel so that is fixed here too.

The added unit test covers both fixes above - it will
submit two resets on a different channels, then destroy
the second channel.  This will abort the second reset
and check that the bdev still sees the first reset as in
progress.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I61df677cfa272c589ca03cb81753f71b0807a182

Reviewed-on: https://review.gerrithub.io/378199
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-09-22 15:38:44 -04:00
Jim Harris
ab29d2ce5d bdev: improve handling of channel deletion with queued resets
Upper layers are not supposed to put an I/O channel if there
are still I/O outstanding.  This should apply to resets as well.

To better detect this case, do not remove the reset from
the channel's queued_reset list until it is ready to be
submitted to the bdev module.  This ensures:

1) We can detect if a channel is put with a reset outstanding.
2) We do not access freed memory, when the channel is destroyed
   before the reset message can submit the reset I/O.
3) Abort the queued reset if a channel is destroyed.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0c03eee8b3642155c19c2996e25955baac22d406

Reviewed-on: https://review.gerrithub.io/378198
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-09-22 15:38:44 -04:00
Dariusz Stojaczyk
dcb7db965d bdev: unify all direct bdev_io I/O types
Since all direct bdev_io types have the same layout,
there is no need to keep them differentiated.

Change-Id: If8bb85e43c9922c0ebfc39837e3a45006e508b56
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/377686
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-09-22 13:44:25 -04:00
Dariusz Stojaczyk
9663f84395 bdev: added iovs to bdev_io->unmap & flush
This is a mid-step towards unifying all direct
bdev_io types.

Change-Id: Ie4da108f2710891e503eb0863148d8fa182d44ee
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/379291
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-09-21 14:06:16 -04:00
Dariusz Stojaczyk
c644fb069f bdev: reorder bdev_io->flush fields
num_blocks and offset_blocks are present in most
other bdev_io types with num_blocks being
before the offset_blocks. bdev_io->flush was
outstanding here.

The fields has been reordered in preparation
to unifying many bdev_io types. See next patches
for details.

Change-Id: I7f12bfdc88a87ab263107d0d0d1b30b848702bcc
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/379290
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-09-21 14:06:16 -04:00
Maciej Szwed
c783caabc7 bdev: remove bdev_opened_for_write restriction
This restriction causes bdevs examination/discovery to fail
when there are asynchronous operations. We should remove this
limitation for now. Future plan is to implement approach
similar to the one that is present in kernel.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ibe5572297672022412d25a4a835dc9527ce97f3e
Reviewed-on: https://review.gerrithub.io/378758
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-09-20 14:43:32 -04:00
Tomasz Zawadzki
27f44662ac lvol: Logical volume implementation
Change-Id: Ia96ae78ff9530d953181ac5f7255a38f3c8ec430
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Reviewed-on: https://review.gerrithub.io/375392
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-09-14 19:30:54 -04:00
Jim Harris
d8cdbb8931 bdev: queue resets per channel instead of per bdev
This moves towards the ability to queue ios while
a reset is in progress.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibb8efbcec3931f966eb43030ea91945dfaaa5a77

Reviewed-on: https://review.gerrithub.io/377827
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-09-12 11:16:50 -04:00
Jim Harris
70a0bc3096 bdev: add bdev_io optimizations
First, improve packing of spdk_bdev_io.  Move any
fields which must be zeroed into the first cacheline.
Also change type and status from enums (needing 4 bytes)
to int16_t which is still a far bigger range than
needed.

Next, modify spdk_bdev_get_io to only zero the
first cacheline (actually a bit less).

SPDK_BDEV_IO_TYPE_INVALID is also added, making it
explicit that 0 is an invalid value for the IO type.
Previously this was somewhat inferred.

There are still additional improvements that can
be made to this area - primarily combining
spdk_bdev_get_io() and spdk_bdev_io_init() into
a single function.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1916d6d5db02c93622b9725ec1095148e3f384d8

Reviewed-on: https://review.gerrithub.io/377799
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-09-12 11:16:50 -04:00
Jim Harris
ff9cadadb3 bdev: remove spdk_bdev_resubmit_io
We no longer support resubmitting an existing I/O to
a lower bdev.  This enables accurate tracking of I/O
at each bdev layer for purposes of reset handling and
QoS.

__submit_request() in bdev.c is now only called
from spdk_bdev_io_submit(), so just collapse these
two functions together while we are here.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1b46df42b1a6476f9dbd8c61dd3380dbd145315d

Reviewed-on: https://review.gerrithub.io/377626
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-09-12 11:16:50 -04:00
Jim Harris
85cc223946 bdev: add common partition helper functions
split, gpt and error all have a lot of similar code, and
are based on the presumption of one or more "partition" bdevs
on a "base" bdev.  This results in quite a bit of duplicated
code between these three bdev modules.  The error bdev
module does follow this same model - it just always has only
a one-to-one mapping between the error bdev and its base bdev.

As all of the modules move to allocating their own bdev_io
rather than allowing for adjusting an existing bdev_io and
resubmitting it, there will be even more duplicated code
between these modules.

So this patch adds a set of helper functions in the common
bdev library to eliminate all of this duplicated code.

This patch also moves the split module to use it - future
patches will also convert gpt and error.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id8032331b46c32da9fca18a077580ccb274c6204

Reviewed-on: https://review.gerrithub.io/376423
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-09-12 11:16:50 -04:00
Daniel Verkamp
7394f69054 bdev: convert bdev module APIs to use blocks
The bdev modules now take all read, write, unmap, and flush requests in
terms of blocks rather than bytes.

The public bdev APIs still accept offset and length in bytes for now.

Change-Id: I57f0955d52272f57755f0ff4dbc56721fdc2ef51
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/376037
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
2017-09-06 13:01:13 -04:00
Dariusz Stojaczyk
c5ff3d7d26 bdev: make module init asynchronous again
bdev_virtio requires us to do SCSI inquiry for
each device and wait for a response. It currently
polls inline for this response, taking up entire
reactor exclusively. Making it async would allow
other pollers to run while bdev_virtio still has
to wait.

Change-Id: Iefc88e0a2efb5b791ffe23b2e623b7c50d759084
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/375573
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-09-01 18:16:46 -04:00
Jim Harris
2a75143441 bdev: change bdev_io flush "length" to "len"
This makes it consistent it with other parts of this structure.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie36488813b53ce20663c50a5c9f049c4c9723d3a

Reviewed-on: https://review.gerrithub.io/375494
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-08-28 13:06:37 -04:00
Jim Harris
f054577df1 bdev: remove gencnt
gencnt is no longer really used since there is no more
differentiation between "soft" and "hard" reset.  The
original idea was that a hard reset would complete I/O
known to the bdev layer even if the underlying bdev
module had not actually completed it yet.  We do not
actually support that concept, and it was a bit flawed
anyways.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifa2c85bb474c7dd55eb7386d6cad5079f5edbccc

Reviewed-on: https://review.gerrithub.io/375484
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-08-25 13:12:04 -04:00
Daniel Verkamp
45d4b1973a bdev: add API to get optimal I/O boundary
Allow passing the NVMe namespace optimal I/O boundary through the bdev
layer.

Change-Id: I27a2d5498df56775d3330e40c31bd7c23bbc77a5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/374532
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-08-17 17:56:18 -04:00
Ben Walker
60c38d4022 bdev: Change unmap to use offset/len instead of descriptors
This is far simpler, although it does limit the bdev
layer to unmapped just one range per command. In practice,
all of our code reports limits of just one range per command
anyway.

Change-Id: I99247ab349fe85b9925769e965833b06708d0d70
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/370382
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-08-04 20:03:37 -04:00
Roman
5b53b568e4 add VTune support: spin-wait stat
Change-Id: Ie25a87c4b3f781299fa744fdcff6c9a63d473935
Signed-off-by: Roman <roman.sudarikov@intel.com>
Reviewed-on: https://review.gerrithub.io/365723
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-07-18 15:33:56 -04:00
Jim Harris
be9a3b9f69 bdev: pass descriptors for I/O operations
This enables checking permissions - for example,
spdk_bdev_write will fail if the descriptor was not
created with write permissions.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I68b65a560f471f2e0f71a7f42cfa6689b911110f

Reviewed-on: https://review.gerrithub.io/369493
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-07-14 13:31:30 -04:00
Jim Harris
f71447e80d bdev: remove differentiation between bdev and vbdev modules
We still will sort the bdev_module list so that modules
with an examine() callback are initialized first.  This ensures
they have a chance to initialize before later modules start
registering physical block devices.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I792cfb41b0abe030fe2486a2c872cbf329735932

Reviewed-on: https://review.gerrithub.io/369486
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-07-14 13:31:30 -04:00