1) Change each split_disk to just point to a
split_base structure and drop the extra base_bdev
member. We can get the bdev from the split_base.
2) Simplify the names a bit for some of the members:
split_base::base_bdev to split_base::bdev and
split_disk::split_base to split_disk::base.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia8621b19ad09324d939ed43d79900fecb18291e6
Reviewed-on: https://review.gerrithub.io/375493
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
gencnt is no longer really used since there is no more
differentiation between "soft" and "hard" reset. The
original idea was that a hard reset would complete I/O
known to the bdev layer even if the underlying bdev
module had not actually completed it yet. We do not
actually support that concept, and it was a bit flawed
anyway.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifa2c85bb474c7dd55eb7386d6cad5079f5edbccc
Reviewed-on: https://review.gerrithub.io/375484
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
I/O requests are now allocated from bdev_io memory.
virtio_req->iov now points to the raw payload; request and response
iovectors are available as separate fields. This solution
should apply to both vhost-scsi and blk.
Change-Id: I588fbdd7fc5442329aadbcb3e31b2f4a7118ec8f
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/375264
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Fixed various minor bugs and removed unused code.
Change-Id: I24d3f10a494b9f9c69f45e888c7e1511adc268bc
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/375004
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Fixes a GCC 7 warning:
rte_virtio/virtio_user/vhost_user.c: In function ‘vhost_user_setup’:
rte_virtio/virtio_user/vhost_user.c:439:46: error: ‘%s’ directive output
may be truncated writing up to 4095 bytes into a region of size 108
[-Werror=format-truncation=]
snprintf(un.sun_path, sizeof(un.sun_path), "%s", dev->path);
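One way to handle this class of warning is to check the snprintf() return
value, so that truncation becomes an explicit error. The sketch below is only
an illustration and may differ from the change actually made in this patch;
copy_sun_path() is a hypothetical helper, not an existing function.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper (not part of the patch): copy a socket path into
 * un->sun_path and report truncation instead of silently shortening it.
 * Returns 0 on success, -1 if the path does not fit.
 */
static int
copy_sun_path(struct sockaddr_un *un, const char *path)
{
    int rc;

    rc = snprintf(un->sun_path, sizeof(un->sun_path), "%s", path);
    if (rc < 0 || (size_t)rc >= sizeof(un->sun_path)) {
        return -1;
    }
    return 0;
}
```

A caller would fail device setup when the helper returns -1 rather than
connecting to a truncated path.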
Change-Id: I147c9efe93cc6ce9370da6443f181f916457e3e6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/375198
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Supports both PCI mode (for usage in guest VMs) and
vhost-user mode (for usage in host processes). The rte_virtio
subdirectory contains a lot of code lifted from the DPDK
virtio-net driver. Most of the PCI and vhost-user code is
reused almost exactly as-is, but the virtio code is drastically
rewritten as the DPDK code was very network specific.
Has been lightly tested with both the bdevio and bdevperf
applications in both PCI and vhost-user modes.
Still quite a bit of work needed - a list of todo
items is included in a README in the module's directory.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I85989d3de9ea89a87b719ececdb6d2ac16b77f53
Reviewed-on: https://review.gerrithub.io/374519
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Allow passing the NVMe namespace optimal I/O boundary through the bdev
layer.
Change-Id: I27a2d5498df56775d3330e40c31bd7c23bbc77a5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/374532
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When users don't enable the hotplug option in their configuration
section, SPDK will enable it by default. DPDK will then print probing
messages continuously for NVMe devices which don't belong to SPDK.
Change-Id: I8c43335a282ecba206b4b5305bd881d2bd07836e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/374486
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Report the Ceph pool and RBD name in the get_bdevs output for RBD bdevs.
Change-Id: I0e9be0b540e90503ce052c968f979b5887673c24
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373416
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In this iteration, we only support write_zeroes in the form of a
deallocate call that returns all zeroes.
Change-Id: Ica837ce70672174df63012719de60463fdb799cf
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/372005
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add functionality to the bdev layer to handle the nvme write_zeroes
function.
Change-Id: I0dadad273b28c16db5a2275f7d8d57e98253a8d3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/372171
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change-Id: I0e8c34237acc88fb51bac56c2f99df52ec199f9b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373835
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The construct_aio_bdev RPC still accepts "fname" for backwards
compatibility.
Change-Id: Ibf44f5f3667c6de4b827f7f3f8787aff0a6c4fc9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373834
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ac636c18d7e71e0184d4f67ec54a36217a11db0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373833
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Disk size in bytes is only used within create_aio_disk(), so we don't
need to save it in the struct file_disk context structure.
Change-Id: I63d230448a67c2b49c57eac2b4d44dce27c303cf
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373832
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add a file-backed AIO bdev to test it out.
Change-Id: Ifdf206bbdf6cae9379fdc02c80755e96a7198bce
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373673
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia2202eadfb8140b7a51dd64d4241e85aceff361c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373408
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Remove the redundant sector size check; the generic bdev code already
checks for this.
Also use the bdev blocklen field for both offset and size calculations.
The bdev blocklen is the same as the namespace sector size.
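A minimal sketch of the byte-to-block conversion described above, assuming
the generic bdev layer has already rejected unaligned requests. The struct
and function names are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: starting LBA and number of blocks. */
struct lba_range {
    uint64_t lba;
    uint64_t lba_count;
};

/*
 * Convert a byte offset and length into block units using the bdev
 * block length. Alignment is a precondition here, since the caller is
 * assumed to have validated it already.
 */
static struct lba_range
bytes_to_lba_range(uint32_t blocklen, uint64_t offset, uint64_t nbytes)
{
    struct lba_range r;

    assert(blocklen != 0);
    assert(offset % blocklen == 0);
    assert(nbytes % blocklen == 0);

    r.lba = offset / blocklen;
    r.lba_count = nbytes / blocklen;
    return r;
}
```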
Change-Id: Ia8061eb4cfc229d4b6fbe2caabf2dd81656bc697
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/372862
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I33506d6b9ff09c45c057326f7339d742eebc45b4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/372861
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Swap the meaning of the return value to match the name of the function.
Change-Id: I89dc09e3b309a06586adf2ab750092f06077ffd9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/372859
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is far simpler, although it does limit the bdev
layer to unmapping just one range per command. In practice,
all of our code reports limits of just one range per command
anyway.
Change-Id: I99247ab349fe85b9925769e965833b06708d0d70
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/370382
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Factor out the iSCSI and GPT CRC32 functions into generic library
functions.
Change-Id: I1f1a5f3968a983b663a51bd984500492eeb12605
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/370765
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id004e06eea8dfb5d7be24282bfe4d31069ff6573
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/371025
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
An I/O completion callback may delete the channel, so
restructure the poller to avoid touching channel memory
while triggering completions.
Change-Id: I612f10ff172481084386c9a3056fdd5e2f19e854
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/370526
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change the poller to tolerate channel deletion within
an I/O completion callback. This won't happen today
because channel deletion is always deferred, but
prepare for that case.
It turns out this is simpler anyway.
Change-Id: Ibff23d84fe14247849e95cebc3c80369812bdd6c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/370525
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The protective MBR entry may not be the first one. This is already
handled correctly when comparing the total size field, but the start LBA
field was always looking at the first entry.
Change-Id: Ie54e424b2e9cb546b1ed04192662936e04e08b6b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/370747
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: If3cd7ee8251b5e311d5e1e9210085f8c2cbc83e1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/370746
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Remove unnecessary headers and order the remaining #includes correctly.
Change-Id: I6b331aa1514e55e7bf56a07be23f630e4ae3fcdb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/370731
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The message printed when start_lba doesn't match the expected value
would print byte-swapped values on big-endian architectures. On
little-endian architectures, the problem would not be noticeable, since
the conversion doesn't do anything.
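For illustration, an on-disk little-endian field can be decoded byte by
byte so the result is correct on any host. le32_to_host() is a hypothetical
name used only for this sketch; the patch itself presumably uses the
project's existing endian helpers.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian on-disk field into host byte order. */
static uint32_t
le32_to_host(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

Printing the decoded value, rather than the raw field, gives the same
message on big-endian and little-endian hosts.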
Change-Id: I9e8d4485b5710f4333d04bb006bc204416c689cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/370730
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
A bdev could be unregistered multiple times when all its descriptors have
been instantly closed via its remove_cb without any deferred
event/poller.
Change-Id: I128716077b0512c6334bdd113220684f8cfcbecb
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/370949
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Patch afe860ae deferred freeing the io_device. However, for nvme, the
io_device context (spdk_nvme_ctrlr) is still being destructed before
io_channels are destroyed, causing segfaults on hotremove.
This patch defers io_device context destruction and fixes nvme
hotremove.
Fixes: afe860aeb1 ("channel: Correctly defer unregisters if channels exist")
Fixes: 5533c3d208 ("util: defer put_io_channel")
Change-Id: I7af699174cac0c6c6a6faa2cc65418c47347eb9a
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/370459
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The hotremove event was detected on the base bdev, but wasn't propagated to
vbdevs. This patch makes the base bdev destroy all its children once
hotremove is triggered. This fixes hotremove segfaults and adds full
support for split/gpt hotremove.
Change-Id: I7f8b0b109ef237783b6b2e33a18f68c59a8bbe72
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/367824
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add a new struct spdk_nvme_io_qpair_opts to allow the user to override
controller options on a per-I/O qpair basis.
Existing callers with qprio == 0 can be updated to:
    ... = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, NULL, 0);
Callers that need to specify a non-default qprio should be updated to:
    struct spdk_nvme_io_qpair_opts opts;
    spdk_nvme_ctrlr_get_default_io_qpair_opts(ctrlr, &opts, sizeof(opts));
    opts.qprio = SPDK_NVME_QPRIO_...;
    ... = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, &opts, sizeof(opts));
Change-Id: I8ac3ea369535cfde759abbe75e1d974b6450a800
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/369676
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch also releases bdev module and closes claiming descriptor on
regular split base destruction.
These problems must've been overlooked in patches 26d6770f and be9a3b9f.
Fixes: 26d6770f1c ("GPT: add GPT bdev support")
Fixes: be9a3b9f69 ("bdev: pass descriptors for I/O operations")
Change-Id: Ib47e2c4d3b99c6d3f2dbe7ef01be81ca3dd97341
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/370181
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This was probably overlooked in patch be9a3b9f.
Fixes: be9a3b9f69 ("bdev: pass descriptors for I/O operations")
Change-Id: If29ad65ac168f3dbf7e1602f26f939dfbf17599a
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/370180
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ie25a87c4b3f781299fa744fdcff6c9a63d473935
Signed-off-by: Roman <roman.sudarikov@intel.com>
Reviewed-on: https://review.gerrithub.io/365723
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Otherwise we'd fail sanity checks when closing the claiming descriptor.
Current approach works because we never close the claiming descriptor.
Fixes: 4fc7e66614 ("bdev: add vbdev claim/release semantics")
Change-Id: I1c1f0c11450e749419726df460334ab97b43b584
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/370179
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
When a device has more than one open descriptor, the remove_cbs of particular
descriptors could be called more than once.
Consider the following scenario:
bdev X with 2 open descriptors A and B.
X is removed (hotremoved for instance)
bdev_unregister(X) is called
* bdev->status = REMOVING
* A->remove_cb is called
* some poller is started
* B->remove_cb is called
* another poller started
* poller from A->remove_cb finishes its work and closes the desc:
* bdev_close(A)
* A is removed from bdev->open_descs list
* bdev->status is REMOVING, so bdev_unregister(X) is called again!
* B->remove_cb is called again!
* another poller starts? segfault?
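The scenario above can be modelled with a small mock that notifies each
descriptor only once. This is a sketch of one possible guard, not
necessarily how this patch fixes the problem; all mock_* names and the
remove_notified flag are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_DESCS 4

struct mock_desc {
    bool open;
    int  remove_cb_calls;   /* how often remove_cb fired for this desc */
};

struct mock_bdev {
    bool removing;          /* mirrors bdev->status == REMOVING */
    bool remove_notified;   /* hypothetical guard: remove_cbs already sent */
    bool destroyed;
    struct mock_desc descs[MOCK_MAX_DESCS];
};

static void
mock_bdev_unregister(struct mock_bdev *bdev)
{
    bool any_open = false;
    size_t i;

    bdev->removing = true;
    for (i = 0; i < MOCK_MAX_DESCS; i++) {
        if (!bdev->descs[i].open) {
            continue;
        }
        any_open = true;
        if (!bdev->remove_notified) {
            /* Stand-in for calling desc->remove_cb(). */
            bdev->descs[i].remove_cb_calls++;
        }
    }
    if (any_open) {
        /* Wait for the remaining descriptors to be closed. */
        bdev->remove_notified = true;
        return;
    }
    bdev->destroyed = true;
}

static void
mock_bdev_close(struct mock_bdev *bdev, struct mock_desc *desc)
{
    desc->open = false;
    if (bdev->removing) {
        /* Re-entering unregister is now harmless for other descs. */
        mock_bdev_unregister(bdev);
    }
}
```

With two open descriptors, closing A after the first unregister no longer
triggers a second remove_cb for B; the bdev is destroyed once B is closed.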
Fixes: 57d174ff67 ("bdev: add spdk_bdev_open/close")
Change-Id: I0a898ec0aee521d0b2a1168fe7d469cc41a8ef4f
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/369727
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
GPT uses the standard IEEE CRC-32 polynomial, not CRC-32C.
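The two checksums differ only in the polynomial. A bitwise reference
sketch (slow, for illustration only; the function and macro names are
made up for this example and are not the library's API):

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected polynomial constants. */
#define CRC32_IEEE_POLY_REFLECTED 0xEDB88320u /* IEEE 802.3, used by GPT */
#define CRC32C_POLY_REFLECTED     0x82F63B78u /* Castagnoli, used by iSCSI */

/* Bit-at-a-time reflected CRC-32 with initial value and final XOR of ~0. */
static uint32_t
crc32_bitwise(uint32_t poly_reflected, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= p[i];
        for (bit = 0; bit < 8; bit++) {
            /* Apply the polynomial only when the low bit is set. */
            crc = (crc >> 1) ^ (poly_reflected & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

For the standard check string "123456789", the IEEE polynomial yields
0xCBF43926 and CRC-32C yields 0xE3069283, so a GPT header checked with the
wrong polynomial never matches.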
Change-Id: I94dae01ec7b31cb3c3efc735d9dfa4e0cfea9ce4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/369306
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This enables checking permissions - for example,
spdk_bdev_write will fail if the descriptor was not
created with write permissions.
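A mock of the permission check, for illustration only. The mock_* names
and the -EPERM return value are assumptions made for this sketch; the real
error code may differ.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bdev descriptor. */
struct mock_desc {
    bool write;   /* true if the descriptor was opened for writing */
};

static int
mock_bdev_write(const struct mock_desc *desc)
{
    if (!desc->write) {
        return -EPERM;   /* assumed error code for a read-only descriptor */
    }
    /* A real implementation would build and submit the I/O here. */
    return 0;
}
```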
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I68b65a560f471f2e0f71a7f42cfa6689b911110f
Reviewed-on: https://review.gerrithub.io/369493
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
We will still sort the bdev_module list so that modules
with an examine() callback are initialized first. This ensures
they have a chance to initialize before later modules start
registering physical block devices.
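The ordering can be pictured as a stable partition that moves modules with
an examine() callback to the front. This is a sketch over a plain array
with hypothetical names; the actual code may instead choose the insertion
point when each module is added to its list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified module record. */
struct mock_module {
    const char *name;
    bool has_examine;   /* stands in for a non-NULL examine() callback */
};

/*
 * Stable partition: modules with examine() first, relative order within
 * each group preserved.
 */
static void
order_modules(struct mock_module *mods, size_t n)
{
    size_t next = 0;
    size_t i, j;

    for (i = 0; i < n; i++) {
        if (mods[i].has_examine) {
            struct mock_module tmp = mods[i];

            /* Shift the non-examine modules one slot to the right. */
            for (j = i; j > next; j--) {
                mods[j] = mods[j - 1];
            }
            mods[next++] = tmp;
        }
    }
}
```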
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I792cfb41b0abe030fe2486a2c872cbf329735932
Reviewed-on: https://review.gerrithub.io/369486
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Module initialization was made asynchronous recently
to support bdev modules like gpt which need to do
asynchronous I/O. But all modules now do any
asynchronous I/O in their examine() routines, and
init functions only do very basic operations to be
ready to handle examine() callbacks.
So simplify the bdev code and modules to go back to
a synchronous init procedure.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idb16156796ad7511d00f465d7a2db9acda6315b6
Reviewed-on: https://review.gerrithub.io/369485
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This includes file names, functions, #defines, etc.
There are still a few uses of "blockdev" outside of
include/ and lib/ - these can be handled later.
This preps for a future patch to consolidate vbdev
modules and bdev modules into just bdev modules.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I70e575709ae1b0a116b08515fd38ae793de05377
Reviewed-on: https://review.gerrithub.io/369325
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
vbdev modules still open/close bdevs as normal, but
should open bdevs read only when tasting (i.e. reading
the GPT to see if there are SPDK partitions). When
a vbdev module is ready to claim the bdev for purposes
of creating virtual bdevs on top of it, it calls
spdk_vbdev_module_claim_bdev(). It can pass its
open descriptor as well to have it promoted to
write access (required for future vbdev modules like
logical volumes).
Note: error vbdev was changed to copy the base bdev
parameters one-by-one instead of a blind memcpy - we
do not want to copy the base bdev's vbdev_claim_module
into the new bdev!
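To illustrate why the blind memcpy is a problem, here is a sketch with a
reduced, hypothetical bdev structure. Only vbdev_claim_module is taken from
the description above; the other fields and the function name are made up
for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reduced mock of a bdev; real bdevs have many more fields. */
struct mock_bdev {
    char        name[32];
    uint32_t    blocklen;
    uint64_t    blockcnt;
    const void *vbdev_claim_module;   /* set when a vbdev module claims it */
};

/*
 * Initialize a virtual bdev from its base by copying only the parameters
 * that should be inherited. The claim pointer stays NULL: the new bdev is
 * not claimed just because its base is.
 */
static void
init_vbdev_from_base(struct mock_bdev *vbdev, const struct mock_bdev *base,
                     const char *name)
{
    memset(vbdev, 0, sizeof(*vbdev));
    snprintf(vbdev->name, sizeof(vbdev->name), "%s", name);
    vbdev->blocklen = base->blocklen;
    vbdev->blockcnt = base->blockcnt;
}
```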
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If2ee67dc78daf96050343c473671aa3402991bb1
Reviewed-on: https://review.gerrithub.io/368628
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change-Id: I1899a47fa9d9821c16ea648bbe3290f6306d0e3d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/368626
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This allows vbdev modules to be ready for examine
calls for (physical) bdev modules when the latter
initialize.
This requires the following modifications to existing
bdev modules:
1) error and split now search their config sections
at examine time (instead of init) to see if the
bdev should be consumed by the vbdev module
2) gpt is simplified considerably - it no longer
needs to save bdevs to examine when
gpt initializes later
3) nvme must register its io device before registering
the bdev, since vbdevs may immediately start trying
to send I/O to the new nvme bdev
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8fe5686092ffb15fc8bdbc068b09add229d9da6c
Reviewed-on: https://review.gerrithub.io/368598
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add additional fields to the g_bdev_mgr to enable
deferring spdk_bdev_init_complete when modules have
finished initialization, but vbdevs are still processing
examination callbacks due to asynchronous I/O.
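A sketch of the deferral logic, using hypothetical field and function
names rather than the real g_bdev_mgr members: init completes only once
module initialization is done and no examinations are outstanding.

```c
#include <stdbool.h>

/* Hypothetical subset of the bdev manager state. */
struct mock_bdev_mgr {
    bool     module_init_complete;
    unsigned examine_in_progress;
    bool     init_complete_called;
};

/* Signal init completion once, and only when nothing is pending. */
static void
mock_try_init_complete(struct mock_bdev_mgr *mgr)
{
    if (mgr->init_complete_called) {
        return;
    }
    if (!mgr->module_init_complete || mgr->examine_in_progress > 0) {
        return;
    }
    mgr->init_complete_called = true;
}

static void
mock_examine_start(struct mock_bdev_mgr *mgr)
{
    mgr->examine_in_progress++;
}

static void
mock_examine_done(struct mock_bdev_mgr *mgr)
{
    mgr->examine_in_progress--;
    mock_try_init_complete(mgr);
}

static void
mock_modules_initialized(struct mock_bdev_mgr *mgr)
{
    mgr->module_init_complete = true;
    mock_try_init_complete(mgr);
}
```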
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I12dc7cb08e44e022b5bad75ba9a946571eb65957
Reviewed-on: https://review.gerrithub.io/368832
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add a new spdk_vbdev_module_examine_done() API which
vbdev modules can use to notify the generic bdev layer
once a base bdev examination is complete.
This is especially required for asynchronous vbdevs
like GPT which must issue I/O to the base bdev.
As part of this patch, add examine callbacks
for both split and error, which for now only call
this new function. Later patches will move code from
the init callback to the examine callback for these
modules. examine callbacks are now required for all
vbdev modules.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I49f2d012d1675b878bcd23afff427c740c6502c7
Reviewed-on: https://review.gerrithub.io/368831
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This eliminates overloading of the term "register"
in the context of bdevs, as well as opens up the
idea of examining bdevs not just when they are
registered, but also after they are closed with
write access.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9e15f9dc0fa4e02214f188f987f9da2dbc8db1f7
Reviewed-on: https://review.gerrithub.io/369042
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic3051b63942770e45be22af0ae03a78a7c543f81
Reviewed-on: https://review.gerrithub.io/368597
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I96cf4e40e8218c6ba150fe9ffead58340c7bdf6a
Reviewed-on: https://review.gerrithub.io/368627
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Add a name parameter to the MODULE_REGISTER macros, and then
modify each bdev module to pass a string for its name.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If878617ce3c3eacfcf5df44ed6f194f11c66f78f
Reviewed-on: https://review.gerrithub.io/368596
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ibb28b8432c94e70f522bdf34c5f4a10e4c25a8bb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/368610
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These are UEFI-style GUIDs in little endian byte order, not standard
UUIDs.
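As an illustration of the byte order, the sketch below formats a 16-byte
on-disk GUID as text: the first three fields are stored little-endian and
the last eight bytes are stored as written. guid_to_string() is a
hypothetical helper for this example only.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an on-disk UEFI GUID. "out" must hold at least 37 bytes
 * (36 characters plus the terminating NUL).
 */
static void
guid_to_string(const uint8_t g[16], char out[37])
{
    snprintf(out, 37,
             "%02X%02X%02X%02X-%02X%02X-%02X%02X-"
             "%02X%02X-%02X%02X%02X%02X%02X%02X",
             g[3], g[2], g[1], g[0],     /* 32-bit field, little-endian */
             g[5], g[4],                 /* 16-bit field, little-endian */
             g[7], g[6],                 /* 16-bit field, little-endian */
             g[8], g[9],                 /* stored in display order */
             g[10], g[11], g[12], g[13], g[14], g[15]);
}
```

For example, the EFI System Partition type GUID
C12A7328-F81F-11D2-BA4B-00A0C93EC93B is stored on disk starting with the
bytes 28 73 2A C1.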
Change-Id: I4d61afa2901830c784c24a5e039bba1d98f32e62
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/368609
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Fix the partition type GUID comparison to actually look at the partition
type GUID, not the unique per-partition GUID, and fix the test to match.
Change-Id: Ie64f1effcc75883f17ccf6240f6469161d2a5aa5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/368606
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It should not print an error if there is no GPT
signature.
Change-Id: I6e5f8d31aed62dfb420f0ff0560bfbf8a7e8afb3
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/368476
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The purpose is to record which partition of the
original bdev is in use, so that it can later be used
to read further partition entry info.
Change-Id: Idbc4452846e88b486ff281c288704af80938788a
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/368486
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will enable asynchronous request handling in a future patch, and it
also removes the need for the RPC handlers to know about request id and
the JSON-RPC rules about notification-only requests.
Change-Id: I25aaa8e48bff8d5594ffcccecb61842b1e31ec3c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/368225
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I6a138e1c1d775e8c8bfeede9600f8b0f799ecdad
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/362445
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Retire the old claim/unclaim semantics in favor of
open/close. Clients must now open a bdev to get
an spdk_bdev_desc, then pass this desc to get an
I/O channel.
This allows multiple clients to open a bdev,
although only one may open a bdev with write
access.
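The access rule can be modelled as follows. This is a mock with
hypothetical names and an assumed -EPERM error code, not the real
spdk_bdev_open() signature: any number of read-only opens succeed, but a
second writer is rejected until the first writer closes.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_bdev {
    int  open_descs;       /* number of currently open descriptors */
    bool writer_present;   /* true while a write descriptor is open */
};

struct mock_desc {
    struct mock_bdev *bdev;
    bool write;
};

static int
mock_bdev_open(struct mock_bdev *bdev, bool write, struct mock_desc *desc)
{
    if (write && bdev->writer_present) {
        return -EPERM;   /* assumed error code: only one writer allowed */
    }
    desc->bdev = bdev;
    desc->write = write;
    bdev->open_descs++;
    if (write) {
        bdev->writer_present = true;
    }
    return 0;
}

static void
mock_bdev_close(struct mock_desc *desc)
{
    struct mock_bdev *bdev = desc->bdev;

    if (desc->write) {
        bdev->writer_present = false;
    }
    bdev->open_descs--;
    desc->bdev = NULL;
}
```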
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4d319f1278170124169a8a75fd791e926b3f7171
Reviewed-on: https://review.gerrithub.io/367611
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Each virtual bdev now has a pointer to its base
bdev, and a base bdev has pointers to any virtual
bdevs built on top of it.
Also add a new set of leaf iterators, to get only
bdevs that have no virtual bdevs built on top of
them. These iterators are now used by the bdevio and
bdevperf utilities, in advance of the claim/unclaim
semantics getting removed in a future patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I669783764407cdd4920b5ee121959e2a58c8d436
Reviewed-on: https://review.gerrithub.io/367610
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Icdb017d7e6b0a38f8ff3aa78ea60117936dfe178
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/366702
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Also add adrfam to the NVMe bdev JSON config output.
Change-Id: I9472bda04947cffc0df9b02eba0035bac01b7d7b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/367292
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Abstract these through the bdev API to break this
dependency on the event framework.
Change-Id: I108505bf27e94b2985f53d0a4dc0b847ae264d25
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/366340
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This breaks one dependency of the bdev library on the
event framework.
Change-Id: I47ac81a3e4e951f94a66b5de2639eb485dc62e41
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/366339
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This breaks the dependency on the event subsystem logic.
Change-Id: Ic47a219bc1e272c3421b265f74bba959e1aa5f62
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/365730
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Anything that is not a PCIe TransportID is a remote NVMe over Fabrics
address and must have a subsystem NQN.
Change-Id: I1d34ce09a2c4ad7d3ec14fd90b5afccc33eb2bbf
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/365917
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously, NVMe-oF targets could only be parsed
via RPC. Also, local NVMe devices and remote NVMe
devices were not handled consistently. With this
patch, both are handled in a consistent way.
Change-Id: Id5c25f8b38a38ac997887b78c2321f148b8844ec
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/364348
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Much like bdev modules inside the bdev directory,
add a subsystems directory inside of event. The subsystem
specific code for the bdev library is placed into
a separate library in that directory, breaking the
strict dependency of the bdev library on the event subsystem
code.
Change-Id: I255941b823a9ec3e2d62f22a586414949d8ff5ad
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/365055
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This also requires vbdev modules to call spdk_bdev_reset
explicitly on the base bdev, rather than just resubmitting
the original reset bdev_io.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie33d506f68506096306c9f0a9ff5e11141578b15
Reviewed-on: https://review.gerrithub.io/365712
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
We can do all of the completion work first, and then
make the decision at the very end on whether to
defer the callback or not.
This also removes the bdev_io defer_callback member -
we no longer need it since we now only defer the
callback itself, instead of deferring the full
execution of the spdk_bdev_io_complete() routine.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3285e34d2fbce34d4254dca2119561ff825ee9e8
Reviewed-on: https://review.gerrithub.io/365711
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia45876fb6f0eefd987cdb36521ecb591ef1f9499
Reviewed-on: https://review.gerrithub.io/365669
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This ensures a reset is completed after any I/O completions
that may have been deferred.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9efe5c07435371ff8c8e0c826349e9349ade02f3
Reviewed-on: https://review.gerrithub.io/365663
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
If the user specifies the name of a bdev that is not an error injection
bdev, bdev_inject_error would cause a NULL pointer dereference.
Change-Id: Ibcc7daee5a75ac37c3567ebc048662e0165c2860
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/365526
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This ensures all I/O will be aborted and that no I/O
will be submitted while the reset is occurring.
Change-Id: I0f5c993b91d9be6073c6ddf66ae12010f56f864c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/364682
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id0b79002744006fd5bce14b3a7fc1371bf91dcba
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/365107
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Switch from the SPDK event library event to the spdk/io_channel.h thread
message abstraction. This partially decouples the bdev layer from the
SPDK event library.
Change-Id: I2300b8d84c357d6d8cab0d370d7681e948fe97d0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/365077
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
These commands should be treated as aborted by spec,
so correctly deliver abort notifications when a
qpair is deleted.
Change-Id: I8af47a3f42f5695ef8e1a70813662e69102720b2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/364681
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
There are several possible usages of this information -
the first will be related to starting a poller for a bdev
channel when there are I/O that need to be retried.
Usually we can retry an I/O when a previous I/O completes,
but if there are no outstanding I/O, we need to start a
poller. This count will help us make that decision.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifff7d99970510785a1cf30d20b86b3974ce8a106
Reviewed-on: https://review.gerrithub.io/362235
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It is not actually useful to be immediately returned
a handle to the bdev_io. There isn't anything valid
that the user can do with it at that point. Instead,
return an integer error code.
Change-Id: Iffa9a8dc5b2eefab57e3cc1f68919985431d17d1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/364137
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The VTune ittnotify_static.c code, as currently shipped, triggers
sign-comparison warnings when built with SPDK's CFLAGS.
Wrap the file in a pragma to disable the warnings for just this file and
allow a build with --enable-werror to pass.
Change-Id: I701fb6b88f09564fdadfd17e1d16d80df455940b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/364511
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ica7f76a5354b27e6e5fd7114de1fdba46434b073
Signed-off-by: Roman Sudarikov <roman.sudarikov@intel.com>
Reviewed-on: https://review.gerrithub.io/362994
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I4d281be99553629563026cc4f9ab890d0a97986c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/364115
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Dynamically allocate bdev names to remove the arbitrary 16-character
name length limit.
All of the existing product_names are constant strings, so those can
just use string literals instead of a copy per bdev.
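
As a rough illustration of the ownership split described above, the sketch
below copies the name onto the heap and points product_name at a literal.
The struct and function names (example_bdev, example_bdev_init,
example_bdev_fini) are hypothetical and are not the SPDK definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bdev-like structure; not the real struct spdk_bdev. */
struct example_bdev {
	char *name;               /* heap-allocated, so any length is allowed */
	const char *product_name; /* points at a string literal, never freed */
};

/* Copy the caller's name onto the heap instead of into a fixed-size array. */
static int
example_bdev_init(struct example_bdev *bdev, const char *name)
{
	size_t len;

	assert(bdev != NULL && name != NULL);
	len = strlen(name) + 1;

	bdev->name = malloc(len);
	if (bdev->name == NULL) {
		return -1;
	}
	memcpy(bdev->name, name, len);
	bdev->product_name = "Example Disk";
	return 0;
}

/* Only the name is owned by the structure; the product name is a literal. */
static void
example_bdev_fini(struct example_bdev *bdev)
{
	free(bdev->name);
	bdev->name = NULL;
}
```

The cost of this layout is that every teardown path has to free the name,
which a fixed-size array never required.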
Change-Id: I3280da67a4fcf2e4ec8ee8193362ca1b96a9c0cb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/363601
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Need to free this variable.
Change-Id: I78a7d5b312db6487ed65b9d314590a28408da761
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/363477
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Support passthru for NVMe admin commands.
Change-Id: If926f2ecabb078a553158f544c10a92452dbdb39
Signed-off-by: Edward Yang <eyang@us.fujitsu.com>
Reviewed-on: https://review.gerrithub.io/363294
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I05bb8294471bac3f9ecc4744e28d7d40e57e5364
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362622
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The stricter warning levels (-Wimplicit-fallthrough=4) require all-caps
FALLTHROUGH, so update the existing comments to match.
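
For illustration, a switch annotated in the all-caps style might look like
the sketch below. The function example_count_enabled and its levels are
made up for this example and do not come from the SPDK tree.

```c
#include <assert.h>

/*
 * Hypothetical helper: each level enables everything the lower levels do,
 * so the cases intentionally fall through. The all-caps comment is the
 * spelling described in the commit message.
 */
static int
example_count_enabled(int level)
{
	int enabled = 0;

	assert(level >= 0);

	switch (level) {
	case 2:
		enabled++;
	/* FALLTHROUGH */
	case 1:
		enabled++;
	/* FALLTHROUGH */
	case 0:
		enabled++;
		break;
	default:
		break;
	}

	return enabled;
}
```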
Change-Id: I5f8608101cad31d8ea8e84d48604397f98400e87
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/363491
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Instead of an array of queues per core, allocate
the queues per management channel.
Change-Id: I4ace5bd13362a549a45aba62e978dabbbe1a333d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362617
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: John Meneghini <johnm@netapp.com>
Reset requests will now pass messages to all channels,
correctly abort all queued I/O, wait for that to complete,
and then pass the reset request down to the next lower
layer in the stack.
Change-Id: I167cc5f424d3d0fc52b041bda63ee176f6acea29
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362616
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
There is now just one type of reset, which is equivalent
to a HARD reset previously.
Change-Id: I955b219cbc5c25793d97de1cc003b30ae99313ac
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362615
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change-Id: I93684a004e2ae276734edbb4767b5ba1bac3dd48
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/362111
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The next patch will make bdev modules initialize
asynchronously.
Change-Id: I4909c80510d786daf54003b99a5925428cf37373
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/362110
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
- rename spdk_malloc_socket to spdk_dma_malloc_socket
- rename spdk_malloc to spdk_dma_malloc
- rename spdk_zmalloc to spdk_dma_zmalloc
- rename spdk_realloc to spdk_dma_realloc
- rename spdk_free to spdk_dma_free
Change-Id: I52a11b7a4243281f9c56f503e826fd7c4a1fd883
Signed-off-by: John Meneghini <johnm@netapp.com>
Reviewed-on: https://review.gerrithub.io/362604
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This allows virtual blockdevs to inspect newly-added bdevs and
potentially insert themselves automatically.
Change-Id: If567a950d753e5f08861a5de22a2e1350376e50f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/362077
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I03d042cbb6de08f7e07b8c0ccc8af44d7b16ae3b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362614
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently this is just a placeholder, but eventually
it will be used to replace the per-core queues.
Change-Id: Iefeb90711bcf001a383e36cd4eaadf813af66506
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362613
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This wasn't used anywhere and we currently believe there
are superior software-only techniques for controlling
quality of service.
Change-Id: Icdadd5870ed0629b338c307d2619bbc242c3e7a3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362065
Reviewed-by: Jim Harris <james.r.harris@intel.com>
If error injection is enabled and we choose an I/O to
fail, do not submit the I/O and fail it when completed -
just fail it immediately.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib58513f66e0df22c36137c0adb273fc31066c983
Reviewed-on: https://review.gerrithub.io/362386
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Only DPDK primary processes can initiate device probe.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia9f966a65fc98ad92b48814dbd6f36f78905162f
Reviewed-on: https://review.gerrithub.io/362452
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This ensures that all spdk_bdev_io structures now have
an attached channel, simplifying some future work around
things like counting the number of outstanding IOs for
a given channel (which otherwise would have had to
account specially for resets).
Reset semantics are still that they affect the entire bdev
and not just the channel it was submitted on.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8aad21a88faacecfd94bdba350059528eb62c390
Reviewed-on: https://review.gerrithub.io/362251
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This prepares for translating -ENOMEM from the NVMe
driver into an associated bdev status code.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0250d5d48e2131da31b71cf9695d12fee67b0fbb
Reviewed-on: https://review.gerrithub.io/362234
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The user should not see the bdev_io status directly; the NVMe and SCSI
error code wrappers provide the ability to translate to the desired
format regardless of what kind of error is stored inside the bdev_io.
Replace the spdk_bdev_io_completion_cb status parameter with a bool
simply indicating whether the I/O completed successfully.
Change-Id: Iad18c2dac4374112c41b7a656154ed3ae1a68569
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/362047
Tested-by: <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is no longer used anywhere. For the places where we previously
used it, we've since found alternate solutions that do not
require it.
Change-Id: I738a80b95ef50348ce1c14969a3812b0a625b3fd
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362064
Tested-by: <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Also change the discovery/nvmf.sh test to use it.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I56bce9a84bd46f13b6d4f34da81abf23413f2598
This was implemented as two functions, but it
is much simpler as one. Also, the public function
was way at the bottom of the file instead of near
spdk_bdev_put_io_buf.
Change-Id: I3a90688910b0542cc77b6333bab15132cf514eeb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This was implemented as 3 separate functions but
it is simpler as 1.
Also, this wasn't previously freeing the buffer pools.
Change-Id: Ic1b2b3a0596e745a223099cb2a79bea6ef5c69cc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This was broken into three functions, but it is
a lot simpler as one.
Change-Id: If58ad50fe7d4f65c598b62f24e9e1ce7a64fdd8e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is better organizationally, but also will serve as
an io_device in the future.
Change-Id: I6d65cf39df59e874d13f5fccc5a489720e86c48f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Return types should be on a separate line for definitions.
Change-Id: Iaa38dd00042359fc6640fc67053bd69ebbb7af03
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Make the buffer allocation work for all types of
commands, not just read.
Change-Id: I72d8f67a724566630e7c4a74759fcb08449f7de4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Blockdevs already indicate support for unmap via
spdk_bdev_io_type_supported(bdev, SPDK_BDEV_IO_TYPE_UNMAP).
Change-Id: I634f27a281fd900bb3a6da2e4ff8a74e43579578
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
We plan to use these buffers for more than just reads.
Change-Id: I8fa6cb432a6cfe4406fbf240cd3aa2ae4ab5f3d5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The user can get there via the bdev, so this didn't
have a purpose.
Change-Id: I7f85bb71d5ee238d37ba3624d0ac68a161c95e49
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The SCSI task bdev I/O should never be pending when spdk_scsi_task_put()
is called, and just setting the status to failed is not correct (when
the bdev eventually completes the I/O, it will write into the now-freed
bdev_io, which may be reused by someone else).
Change-Id: Iaad6ce9ab41539652abc40147fed47c5012109dc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
In the pattern set by spdk_bdev_io_complete_nvme_status(), allow
blockdev modules to complete a bdev_io with a SCSI status code.
Also move it to the internal bdev header file, since only bdev modules
should be setting bdev_io status codes.
Change-Id: I8b6afad2c02d7c010c5e60f06a7c7e0785eb87ca
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move the scsi_nvme translation code from the SCSI library into bdev, and
provide a generic way to translate any bdev_io status into a SCSI
status.
Change-Id: Ib61a6209387c24543e31574e2b5ca249e2ac8b74
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These channels can handle generic bdev context.
Change-Id: I61f41884ddf4cf86fa156e9051421b354bbb349d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Avoid allocating a large amount of stack space when increasing
NVME_MAX_CONTROLLERS.
Change-Id: I7017e5ed9f4d4f5c860dac608c3e5ae3c35864e7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Remove the "Nvme" from several field names. The parser
will still accept the old name for backward compatibility.
Change-Id: I6fa86ec359b23fb63960d0aa479a845b36a0977a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The user can now not only specify an optional timeout for
commands, but also the action to take when a timeout is
detected.
Change-Id: I7d7cdd846d580e0b3a5f733d398ee9b19d6fe034
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Instead, they register some internal structure of
their choosing.
Change-Id: Id1f8c563d0a2c6f1066d741f86b8aa6fe09b6319
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Some calls were passing bdev->ctxt, some calls just
bdev. In most of our implementations those are the
same pointer, but they aren't necessarily.
Change-Id: If2d19f9eef059aded10a917ffb270c1dc4a8dc41
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Also add a message when a controller is attached and assigned a name.
Change-Id: I54f2d711d55ba7ae99913fdfea652770b1f8931d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make sure the name will not exceed the length of SPDK_BDEV_MAX_NAME_LENGTH.
Change-Id: I33a3f10c836e650fdcb578c7d9e58169d9bb766a
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
This patch fixes a potential core dump when an
NVMe device is hot-inserted.
Change-Id: Idac255f25f42b4746c2d3ae6dfc57a19b7001160
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
It was causing segfaults and infinite looping.
Change-Id: I4c19b5d3af1ba1360250cd5f6aa573a27003409f
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
The user now must choose the name for each AIO bdev. This
provides consistency for names across restarts.
Change-Id: I13ced1d02bb28c51d314512d60f739499b0c7d8d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Names for the NVMe bdevs are now assigned by the user.
This means the same name will always be assigned to the
same device, even across restarts.
Change-Id: If9825ec9abcb5236b4671bc44a825e4f0d704fe3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
All devices must be specified by BDF. Add support for scripts
to use lspci to grab the available NVMe device BDFs for the
current machine.
Change-Id: I4a53b335e3d516629f050ae1b2ab7aff8dd7f568
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
SPDK_COUNTOF works like sizeof, except it returns the number of elements
in an array instead of the number of bytes.
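
A macro of this kind is commonly written as shown in the sketch below.
EXAMPLE_COUNTOF is a hypothetical stand-in, and the actual SPDK_COUNTOF
definition may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SPDK_COUNTOF; the real definition may differ. */
#define EXAMPLE_COUNTOF(arr) (sizeof(arr) / sizeof((arr)[0]))

static const int example_values[] = { 10, 20, 30, 40, 50 };

/* Sum every element without hard-coding the array length. */
static int
example_sum(void)
{
	int total = 0;
	size_t i;

	for (i = 0; i < EXAMPLE_COUNTOF(example_values); i++) {
		total += example_values[i];
	}

	return total;
}
```

Like sizeof, this only works on a true array. Applied to a pointer it
divides the pointer size by the element size and silently gives the wrong
count.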
Change-Id: I38ff4dd3485ed9b630cc5660ff84851d0031911f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This can be used for issuing an abort for the timed-out command.
Change-Id: I3c5727fdddc156cd7c8f99afbc3e6da8e73bba56
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
No need to build a whitelist and scan anymore - the NVMe
driver can directly attach to a specified device.
Change-Id: Ie60c09b6ab37a7f068c496f0cad53bfdc8617349
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is consistent with the rest of the RPC calls that report a number
of blocks, and it matches the field in the split_disk structure.
Change-Id: Ie25534617112d65979c317fe13e05a6c32520a15
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The driver_specific object should contain a single object with the
blockdev driver's name so that the user can determine how to interpret
it. This matches the NVMe blockdev driver.
Change-Id: I434b910a95dd527363af78469dc900e9d19ec12e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Now that namespace splitting support has been removed from the NVMe bdev
in commit efccac8 ("bdev/nvme: remove NvmeLunsPerNs and LunSizeInMB"),
the block_size and total_size fields in the NVMe bdev's driver_specific
config data are redundant. The generic get_bdevs num_blocks and
block_size fields provide the same information.
Change-Id: I080d2017d608716a593bb553ee667e9c4017ffb7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move cb_arg to the first argument to match the other NVMe callback
function signatures.
Change-Id: I4e699c8071dcb7ba4ce3cdb82ee985600208204c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Since the io_channel will be passed to the underlying bdev's
read/write/... functions later, we need to also acquire an io_channel
for the underlying bdev, not for the virtual bdev.
Change-Id: Ica13076973fef875ea636770fce8eb27017aa1c3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This virtual block device takes an underlying block device and splits it
into several smaller equal-sized block devices.
Change-Id: I6f6e686c1177b2e4885f7e88809ad329caae55bd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These were only intended for testing and should be replaced by a virtual
blockdev that can be layered on top of any kind of bdev.
Change-Id: I3ba2cc94630a6c6748d96e3401fee05aaabe20e0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is necessary to process asynchronous events, as well as keep-alive
support for NVMe over Fabrics connections.
Based on a patch by Edward Yang <eyang@us.fujitsu.com>
Change-Id: I3e81f3d5061f75b12b625fa1a06629c6dc3dc61b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This prevents the need for bdev users and modules to manipulate the
internal bdev_io error.nvme fields.
For now, all non-NVMe error types are treated as a generic device error,
but translation from SCSI to NVMe could be added in the future.
Change-Id: I4e831b26a2f41bf2f405c7576d5019bb898d4d1b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If a blockdev module calls spdk_bdev_io_complete() within its
submit_request function, and the user's completion callback issues a new
I/O, it is possible to cause infinite recursion, consuming all available
stack space.
To avoid this, track whether a bdev_io is being processed by
submit_request, and if io_complete() is called in this case, defer the
completion via an event.
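
The deferral idea can be sketched in a few lines. Everything here is
hypothetical: the example_ names are invented, and a small array drained by
example_run_deferred() stands in for the event that the real code sends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct example_io;
typedef void (*example_io_cb)(struct example_io *io);

/* Hypothetical I/O descriptor; not the real struct spdk_bdev_io. */
struct example_io {
	example_io_cb cb;
	bool in_submit_request; /* true while the module's submit path runs */
};

#define EXAMPLE_MAX_DEFERRED 8

/* Stand-in for the event queue used to defer completions. */
static struct example_io *example_deferred[EXAMPLE_MAX_DEFERRED];
static size_t example_num_deferred;
static int example_completed;

/* Complete an I/O, deferring the callback if we are still inside submit. */
static void
example_io_complete(struct example_io *io)
{
	if (io->in_submit_request) {
		assert(example_num_deferred < EXAMPLE_MAX_DEFERRED);
		example_deferred[example_num_deferred++] = io;
		return;
	}
	io->cb(io);
}

/* A module whose submit path completes the I/O synchronously. */
static void
example_submit(struct example_io *io)
{
	io->in_submit_request = true;
	example_io_complete(io);
	io->in_submit_request = false;
}

/* User callback that issues a new I/O from within the completion. */
static void
example_resubmit_cb(struct example_io *io)
{
	example_completed++;
	if (example_completed < 4) {
		example_submit(io);
	}
}

/* Stand-in for the event loop: run deferred callbacks from a flat stack. */
static size_t
example_run_deferred(void)
{
	size_t ran = 0;

	while (example_num_deferred > 0) {
		struct example_io *io = example_deferred[--example_num_deferred];

		io->cb(io);
		ran++;
	}

	return ran;
}
```

Because each callback runs from the drain loop rather than from inside
example_submit(), the stack depth stays constant no matter how many times
the callback resubmits.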
Change-Id: I6ccdb8ed4ee0d5738e6c9840d35431de52bd5fa2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Because of the addition of io_channel support to the bdev layer, there
is no longer a need to re-run a completed I/O through the submission
event pipeline; it can be freed directly.
Change-Id: I2b9163c87293345acf0e85f6d0c1032f30209659
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Claim the block devices used by iSCSI LUNs and NVMe-oF subsystems so
they can't accidentally be reused.
This will also be used by virtual block devices to allow layering of
bdevs.
Change-Id: I5384923fbf24f13f4ce720a797c5a628053d49f4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use a plain function pointer + callback context for the bdev I/O
completion callback. This is possible now because each I/O channel will
be polled on the core that submitted the I/O.
Change-Id: I29ee8e4a3430df11c74845adab840395b9bc5010
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
An SGE could be for a payload that is greater than the NVMe
device's MDTS (i.e. 128KB), but that SGE may not be aligned
on a sector-size boundary. We can safely assume that each
iov is individually physically contiguous - the DPDK
mempools for example guarantee this.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8143ed01814c3154d0a06b8bbc548484437c1e88
The 'next' event pointer was never used in the entire code base (always
NULL).
Change-Id: I75f999d3a2e10512d86edec1a5a46ef263e2635b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use 'struct spdk_event *' directly for consistency with the rest of the
API.
Change-Id: Ib41a9bf47f5b18f4aebf5f4dee055455cb12ef7d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The spdk_poller_register() function provides a way to pass an event to
call once the poller is registered, but it is always NULL in the current
code base.
Change-Id: I459bf40ae4d050589577d113b7984f1563aaa9cc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is only used within bdev.c and can be static.
Change-Id: Id6e2cd9e5dd61a3ef1e1a27993d7a5ea7728bff2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is consistent with the other internal-only API headers.
Change-Id: I2c4748977d38a6c173311d26197d6273c168da7d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
They were very close to the same already, so finish the job.
Change-Id: Ifba9e3b2d11a3e70cbfbe46f57a67552db2757ed
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Comply with the definition format used by other bdev
modules.
Change-Id: Iac108bac540687b32fea4bb70374c22534c60aa0
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The probe_info was reduced to just containing a
transport_id, so remove probe_info entirely.
Change-Id: Ica9a22d126cd14e282decd3eea1a0afe0460f099
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Drop the complicated buffer size/strlen math and just split the version
string formatting into two cases depending on whether the tertiary
version is set.
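
A two-case formatter along these lines might look like the sketch below.
The function name and parameters are hypothetical, and printing
"major.minor" when the tertiary version is zero is an assumption of this
sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical version formatter: emit "major.minor.tertiary" only when
 * the tertiary version is non-zero, otherwise just "major.minor".
 */
static void
example_format_version(char *buf, size_t len, unsigned int major,
		       unsigned int minor, unsigned int tertiary)
{
	assert(buf != NULL && len > 0);

	if (tertiary != 0) {
		snprintf(buf, len, "%u.%u.%u", major, minor, tertiary);
	} else {
		snprintf(buf, len, "%u.%u", major, minor);
	}
}
```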
Change-Id: I4b4983cb8805e8734c408f473dd8c592ec8e8138
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The printf # specifier adds 0x for %x values, but the field width then
includes the 0x part, so for example printf("%#04x", 0x1) prints "0x01"
rather than the intended "0x0001".
Rather than increasing the field width, just manually insert the 0x in
the format string and drop # for less confusion.
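
The difference between the two format strings can be checked with a short
sketch. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* With the '#' flag, the "0x" prefix counts toward the field width of 4. */
static void
example_format_with_flag(char *buf, size_t len, unsigned int id)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%#04x", id);
}

/* With a literal "0x", the width of 4 applies to the hex digits only. */
static void
example_format_explicit(char *buf, size_t len, unsigned int id)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "0x%04x", id);
}
```

A further reason to prefer the literal prefix is that the '#' flag only
adds "0x" to non-zero values, so a zero ID would be printed without any
prefix at all.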
Change-Id: Ie6044619a22b51b39562bfa5c0c0239933bf38c8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add the following information:
- PCI Address
- Vendor ID
- Model Number
- Serial Number
- Firmware Revision
- NVMe spec version
- Namespace sector size
- Namespace total size
Make the public API clearer - if the user wants to allocate a
spdk_copy_task directly, they need to allocate spdk_copy_task_size()
bytes.
Also change the return type to size_t for consistency.
Change-Id: I0f3757056757c510421d680c5b4532edd9bc2561
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The SPDK_TRACELOG macro depends on a CONFIG setting (DEBUG), so it
should not be part of the public API.
Create a new include/spdk_internal directory for headers that should
only be used within SPDK, not exported for public use.
Change-Id: I39b90ce57da3270e735ba32210c4b3a3468c460b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the len field from the generic spdk_bdev_io instead of duplicating
it in blockdev_rbd_io.
Change-Id: I3ebfab8dd1303add83bc2206fc87319ba7d605b3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This function needs to check for SGEs that straddle a
2MB page boundary, and ensure it does not return
a length that will cross that boundary.
This cannot happen in practice currently with SPDK
since all buffers are allocated using rte_malloc(),
but an upcoming vhost-scsi target may produce
SGEs from a guest VM's physical memory that span
a 2MB boundary.
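
The boundary check amounts to clamping the returned length to the distance
to the next 2MB boundary. The sketch below shows that arithmetic only. The
function name is hypothetical, and it is not the NVMe bdev code itself.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_2MB (2ULL * 1024 * 1024)

/*
 * Return the largest length, no greater than len, that starts at addr
 * and does not cross the next 2MB boundary.
 */
static uint64_t
example_clamp_to_2mb(uint64_t addr, uint64_t len)
{
	uint64_t to_boundary = EXAMPLE_2MB - (addr & (EXAMPLE_2MB - 1));

	assert(to_boundary > 0 && to_boundary <= EXAMPLE_2MB);

	return len < to_boundary ? len : to_boundary;
}
```

A caller would consume the clamped length, advance its offset within the
same iov, and call again, which is why the function can no longer be
assumed to move on to the next iov every time.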
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8b83c7c39c4cf33815abb22ff2ebc90941b21e28
No functional change, but removes a few assumptions
that will be invalid in a future patch that fixes a
bug in this function. Primarily we no longer assume
that this function will always increment the
iovpos and reset iov_offset to 0.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I770f2f24c37626063e113af850a2af792aed332a
The bdev function table should not be part of the public API.
Change-Id: I5d6f40d1b37c4471041c1c9d6253a3f92e9e9701
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It was written but never read (and the I/O channel is already stored in
the generic spdk_bdev_io).
Change-Id: Id33392e9d3940b2c1439e9fed2553aa091ecedf8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
No need to duplicate the bdev-defined I/O type.
Change-Id: I15cb68c3c68b3f25b286b04500b53081ed5e7881
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The status field in blockdev_rbd_io was only used within
blockdev_rbd_io_poll(), so replace it with a local variable.
Change-Id: I3629225f28b752a3acc7521699c33bc98f1e4b7b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Instead of the next_sge callback returning the physical address
directly, make it return the virtual address and convert to physical
address inside the NVMe library.
This is necessary for NVMe over Fabrics host support, since the RDMA
userspace API requires virtual addresses rather than physical addresses.
It is also more consistent with the normal non-SGL NVMe functions that
already take virtual addresses.
Change-Id: I79a7af64ead987535f6bf3057b2b22aef3171c5b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Remove the complex list management for pool_name and just strdup() it
directly. It is not worth the trouble to save a few bytes.
Change-Id: I8a4f7eeea619bd824ea593854423e317041c540e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Remove a DPDK dependency from generic code.
Change-Id: I8e3e2c0a36d980b426a1967ed1f88fb8b855c382
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
With this patch, custom bdev modules can return any SCSI status and SCSI
sense information to a host. This is useful when a custom bdev module
detects an error internally and needs to return meaningful information
to the host.
Add a wrapper that lets SPDK call a function without thread affinity,
and use this wrapper to open the rbd image.
Change-Id: Iadc87a948f43632abf497f88165483a0e269ba54
spdk_nvme_probe() will now provide a struct spdk_nvme_probe_info to the
probe and attach callbacks in place of the PCI device pointer.
This struct contains the useful information that could be retrieved from
the PCI device during probe.
The goal of this change is to allow expansion of the probe information
in the future when other transports (specifically, NVMe over Fabrics)
are added that do not necessarily use PCI addressing or device IDs.
Change-Id: I59a2a9e874e248ce5fa1d7f4b57c8056962ff3cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add a helper function that converts a PCI address from a string into a
struct spdk_pci_addr and use it in place of the various sscanf()
invocations throughout SPDK.
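A minimal sketch of such a parsing helper is shown below. The struct
layout, the function name and the accepted string forms
("domain:bus:dev.func" and "bus:dev.func") are assumptions made for
illustration; they are not copied from SPDK.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct spdk_pci_addr. */
struct example_pci_addr {
	uint32_t domain;
	uint8_t bus;
	uint8_t dev;
	uint8_t func;
};

/* Parse "DDDD:BB:DD.F" or "BB:DD.F" (hex). Returns 0 on success, -1 on error. */
static int
example_pci_addr_parse(struct example_pci_addr *addr, const char *bdf)
{
	unsigned int domain, bus, dev, func;
	char extra;

	assert(addr != NULL && bdf != NULL);

	/* The trailing %c only matches if unexpected characters follow. */
	if (sscanf(bdf, "%x:%x:%x.%x%c", &domain, &bus, &dev, &func, &extra) == 4) {
		/* Full form with an explicit domain. */
	} else if (sscanf(bdf, "%x:%x.%x%c", &bus, &dev, &func, &extra) == 3) {
		domain = 0;
	} else {
		return -1;
	}

	if (bus > 0xff || dev > 0x1f || func > 7) {
		return -1;
	}

	addr->domain = domain;
	addr->bus = (uint8_t)bus;
	addr->dev = (uint8_t)dev;
	addr->func = (uint8_t)func;
	return 0;
}
```

Centralizing the parsing this way keeps the range checks in one place
instead of repeating them next to every sscanf() call.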
Change-Id: Id2749723f76db741567e01b4bcb0fffb0e425fcd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add an RPC interface to list all blockdevs and their properties.
Change-Id: I50db730d5eff8cffcbe8fe5df6b3461457e8581e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The PCI device claim function does not need the whole spdk_pci_device
structure, just the address.
Change-Id: If59df512043ee062cf9f759bdc104fc522625ba8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
- Split the part that gets a PCI device's address into its own function,
spdk_pci_device_get_addr(). This is useful outside of the comparison
function and is orthogonal to comparing addresses.
- Make the comparison function take two addresses instead of a device
and an address. The more general form will be useful with addresses
that are not directly associated with a device. Because of this, also
rename the function from spdk_pci_device_compare_addr() to
spdk_pci_addr_compare().
- Return a signed value similar to strcmp() so that addresses can be
ordered, not just compared for equality.
Change-Id: Idf304454af09ea57f1e1d5dc3a39b077378cecad
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rename the construct_rbd_bdev "size" parameter to block_size so that it
is consistent with other bdev construct RPCs.
Change-Id: I88f8ed35444495ffce9550dc224fbcbd58231787
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
When creating a bdev via the RPC interface, there was no way to know
what name it was assigned (other than predicting it based on the
numbering scheme). Change all of the relevant RPC interfaces to return
an array of bdev names so they can be used to construct LUNs/subsystems
dynamically in scripts.
Change-Id: I8e03349bdc81afd3d69247396a20df5fcf050f40
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The status member of the spdk_bdev_io structure is set after the if
block, so the status parameter should be checked instead of the status
member.
Change-Id: I4030a7fcdb36d9c589802ec5b4e424591dc2a3b6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Previously, we assigned the pool_name and rbd_name pointers directly,
which is not safe. The RPC test showed that the string values were
incorrect afterwards, so copy them with strdup() instead.
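The ownership problem can be illustrated with a small sketch. The struct
and function names here are hypothetical, not the rbd module's actual
ones; the point is only that the disk keeps its own copy of the string
rather than a pointer into a buffer the caller may free or reuse.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical disk structure holding an owned copy of the pool name. */
struct example_rbd_disk {
	char *pool_name;
};

/* Store a private copy of pool_name. Returns 0 on success, -1 on failure. */
static int
example_rbd_set_pool_name(struct example_rbd_disk *disk, const char *pool_name)
{
	char *copy;

	assert(disk != NULL && pool_name != NULL);

	copy = strdup(pool_name);
	if (copy == NULL) {
		return -1;
	}

	free(disk->pool_name); /* free(NULL) is a no-op */
	disk->pool_name = copy;
	return 0;
}
```

The caller's buffer can then be released as soon as the call returns,
which is what happens to decoded RPC parameters.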
Change-Id: Ibadc57d3cb5b9869b7db5a22c2459769e92edebd
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Four read/write functions share the same code for checking the
I/O length and offset. Extract this code into a separate function.
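A shared validity check of this kind could look like the sketch below.
The exact conditions checked by the real function are not stated in this
message, so the alignment and bounds rules here are assumptions, and the
function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed rules: offset and length must be multiples of the block size,
 * and the range must end within the device. The device size in bytes is
 * assumed to fit in 64 bits.
 */
static bool
example_io_range_valid(uint64_t offset, uint64_t nbytes,
		       uint32_t blocklen, uint64_t blockcnt)
{
	uint64_t dev_size;

	if (blocklen == 0 || nbytes == 0) {
		return false;
	}
	if ((offset % blocklen) != 0 || (nbytes % blocklen) != 0) {
		return false;
	}

	dev_size = blockcnt * blocklen;
	/* Written as a subtraction so that offset + nbytes cannot overflow. */
	if (offset > dev_size || nbytes > dev_size - offset) {
		return false;
	}
	return true;
}
```

Each read, readv, write and writev entry point would then call this
once before building its request.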
Change-Id: I40f0021e70a60c591b048ad3a70b22eaa07af3b4
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Only call spdk_bdev_io_complete() where an I/O error is seen.
Change-Id: I829e4c589dbcb47017e810035837a4c61c3428f9
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This patch enables vector operation for bdev drivers aio, malloc and
nvme.
The rbd driver still handles only one vector.
Change-Id: Ie2c1f6853bfd54ebd8039df9a0305854ca3297b9
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Make sure the function is reentrant, in preparation for the RPC method.
Change-Id: Ie5230e4ac6c9a750e8e779c5e0b67134729c07e3
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
This prepares for future scatter-gather support.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie21c4d86c1e932dcaf63cf13d7a7198890595d79
Return void in main I/O path, and have functions
explicitly complete the I/O back to the bdev layer
if any failures are encountered.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia729b0af555f87c2fb36b92e79a47d19a325de7a
Add a public function that can be used by the RPC method.
Change-Id: Id9d2938801e0acdf0f9827ef2990a54c75aec22a
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
This patch removes the lock in the RBD module. It requires a librbd
library that supports the rbd_poll_io_events function.
Change-Id: I040a7d8369ab4f69f41d1d0233115f885168f019
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Remove #includes for all DPDK headers that weren't
necessary.
Change-Id: Ib02522e0f04e64a1c98afceb7508cc0e8d931a9d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This converts some, but not all, usage of rte_mempool
to spdk_mempool. The remaining rte_mempools use features
we elected not to expose through spdk_mempool such as
constructors, so that will need to be revisited.
Change-Id: I6528809a864ab466b8d19431789bf0f976b648b6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Use the env library to perform all memory allocations
that previously called DPDK directly.
Change-Id: I6d33e85bde99796e0c85277d6d4880521c34f10d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Enforce exactly one trailing \n, and fix all of the existing cases.
Change-Id: I6218e4700e90aeb647eaee78089530c79993c8c8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch enables vector operation for bdev drivers aio, malloc and
nvme.
The rbd driver still handles only one vector.
Change-Id: I5f401527c2717011ecc21116363bbb722e804112
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Validate the number of unmap descriptors in the generic bdev layer
before calling the blockdev-specific unmap function.
Change-Id: Ib24e7ec63f782f23f2ee3e63393aa8463123fdb4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows users to swap their PCI library from
libpciaccess/dpdk to another mechanism using the standard
method for swapping out the env library.
Change-Id: Ib2248f8b43754a540de2ec01897e571f0302b667
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This allows users to swap out SPDK's third party
libraries for an implementation based on their own
framework.
Change-Id: Ia0b7384ce5e31acba5ad0d7002dec9e95b759c52
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This patch also drops support for automatically unbinding
devices from the kernel - run scripts/setup.sh first.
Our generic pci interface is now hidden behind include/spdk/pci.h
and implemented in lib/util/pci.c. We no longer wrap the calls
in nvme_impl.h or ioat_impl.h. The implementation now only uses
DPDK and the libpciaccess dependency has been removed. If using
a version of DPDK earlier than 16.07, enumerating devices
by class code isn't available and only Intel SSDs will be
discovered. DPDK 16.07 adds enumeration by class code and all
NVMe devices will be correctly discovered.
Change-Id: I0e8bac36b5ca57df604a2b310c47342c67dc9f3c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Provide a convenience wrapper for general purpose dataset
management commands. The previous wrapper for deallocate
was difficult to use correctly and only for deallocate.
Note that the name is "dataset_management" as opposed to
"data_set_management" to match the NVMe specification.
It's questionable whether "dataset" is valid English, but
it is best to match the specification.
Change-Id: Ifc03d66dbabeabe8146968cf8a09f7ac3446ad68
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This patch adds a new bdev module, rbd.
It allows Ceph RBD to be used as the backend of the iSCSI
target.
Change-Id: Id5eb3b159ee607052e3c33a2e59d721739fd9977
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
off_t is problematic for use as a file/block offset: it is signed, and
on 32-bit platforms, it can be 32 bits (depending on the settings of
_FILE_OFFSET_BITS and _LARGEFILE_SOURCE).
The blockdev layer already uses uint64_t to represent offsets; replace
the blockdev module uses of off_t in internal functions with uint64_t
to match.
Change-Id: I77a2e594572c56f1cd8a7a080f985ea5b27c35f3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
We already require the assert header from the C standard library,
so use that instead of RTE_VERIFY to further isolate DPDK
dependencies.
Change-Id: I4a718af858c88aff6080e33e6c3dd533c077b8f4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Some subsystems may wish to create unique I/O channels
which are not shared across all users of the same I/O
device on the same thread.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ade3675d57338cf85b6a301285e6f392bd6cd2e
Fix the existing cases (all missing void in parameter lists) and enable
the warning to prevent new ones from being introduced.
Change-Id: Ieaf00b3dfd5daf1e21fcbefb124514882e8996c9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
bdev and copy modules no longer have check_io functions
now - all polling is done via pollers registered when
I/O channels are created.
Other default resources are also removed - for example,
a qpair is no longer allocated and assigned per bdev
exposed by the nvme driver - the qpairs are only allocated
via I/O channels. Similar principle also applies to the
aio driver.
ioat channels are no longer allocated and assigned to
lcores - they are dynamically allocated and assigned
to I/O channels when needed. If no ioat channel is
available for an I/O channel, the copy engine framework
will revert to using memcpy/memset instead.
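The fallback behaviour can be sketched as below. The channel structure
and function names are hypothetical and do not reflect the real copy
engine API; the sketch only shows the "use hardware if a channel was
assigned, otherwise memcpy" decision.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hardware copy hook. */
typedef void (*example_hw_copy_fn)(void *dst, const void *src, size_t len);

/* Hypothetical per-I/O-channel state. */
struct example_copy_channel {
	/* NULL when no hardware channel could be assigned to this I/O channel. */
	example_hw_copy_fn hw_copy;
};

static void
example_copy(const struct example_copy_channel *ch, void *dst,
	     const void *src, size_t len)
{
	assert(ch != NULL);

	if (ch->hw_copy != NULL) {
		ch->hw_copy(dst, src, len);
		return;
	}

	/* Software fallback. */
	memcpy(dst, src, len);
}
```

Callers do not need to know which path was taken, which is what lets
the hardware channels be assigned dynamically.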
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I99435a75fe792a2b91ab08f25962dfd407d6402f
Also implement these functions for all of the bdev drivers in
the tree.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idea97743d601150044b1fe2d9d76e922d46d3ee1
This enables some future changes which will use per-thread
nvme_qpairs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1efcacfa6aedc970656633c9ce1393dc9b4fdbcc
This breaks out the resources needed to perform
aio-based I/O into a separate data structure, as a step
towards some future patches that will enable per-thread
resources to enable parallel I/O without synchronization.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I84b95713133f9c411863ff0aeef8f886a08e0857
This matches the general order (LBA start then LBA count) for
the NVMe API.
While here, fix a copy/paste error in a debug message (write
instead of writev).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ice326af5d6025867dffed4d1f6c7b81fb9eba5eb
The table of bdev function pointers should not need to be modified at
runtime.
Change-Id: I3e8876fc83df9296ce528231269b1a905c96072c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If a bdev doesn't need to be polled, allow it to specify NULL for the
check_io function pointer to indicate that no poller needs to be
registered.
This will be useful for virtual blockdevs that don't have any associated
hardware to poll.
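The idea can be sketched as a registration step that skips devices with
a NULL check_io pointer. The structures, the fixed-size poller list and
the function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*example_check_io_fn)(void *ctx);

/* Hypothetical bdev with an optional polling callback. */
struct example_bdev {
	example_check_io_fn check_io; /* NULL: nothing to poll */
	void *ctx;
};

#define EXAMPLE_MAX_POLLERS 8

struct example_poller_list {
	const struct example_bdev *bdevs[EXAMPLE_MAX_POLLERS];
	size_t count;
};

/* Stand-in for a hardware-backed module's polling function. */
static void
example_hw_check_io(void *ctx)
{
	(void)ctx;
}

/* Returns 1 if a poller was registered, 0 if none was needed, -1 if full. */
static int
example_register_poller(struct example_poller_list *list,
			const struct example_bdev *bdev)
{
	assert(list != NULL && bdev != NULL);

	if (bdev->check_io == NULL) {
		return 0;
	}
	if (list->count >= EXAMPLE_MAX_POLLERS) {
		return -1;
	}
	list->bdevs[list->count++] = bdev;
	return 1;
}
```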
Change-Id: I0ef8f848587b0c200296805ccc710340dde683b5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
When an I/O with children is being freed, also free its child I/O
requests that were allocated via spdk_bdev_get_child_io().
Change-Id: I2d44aed845c1035ae8f8cb07c5992da855f1dc99
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This callback was only used for freeing buffers, but the buffers are now
managed by the bdev core, so none of the free_request callbacks actually
do anything.
Change-Id: Icfe2e6169e829159dda5e3d75a27d8f040de07c6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add unmap support to the ramdisk block device for testing purposes.
Change-Id: Ibeb5530b2b5a31603d09d2d1de07760f32dea0f8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The bdev layer can be used independently of iSCSI, so fix the
misleading names.
Change-Id: I3fd5b113403acdd7578ce93234dde0fd4f148e96
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Bdev modules need a different interface than public
consumers of the blockdevs.
Change-Id: I581ee493570c114f7e96b31a425bc077a791c71e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Some block devices do not support the unmap operation, and we may add
other optional I/O types in the future. Add a method to check which I/O
types a specific block device supports.
Change-Id: I6e6414bf6b6482ea0224022d8326b252bd363c7f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Switch from the non-portable <sys/endian.h> functions (htobeXX/beXXtoh)
to the SPDK endian conversion functions.
Change-Id: Id49b87f2e536c68f0d5d567e78e1990c0a37ef14
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This leaves more flexibility for future changes to the poller
representation without requiring API changes (after this one).
It also prevents the user from accidentally using poller fields in a
non-thread-safe way, since they can't be accessed directly anymore.
Change-Id: I7677d5b93668665d29ae39c5e0ba74333ad3f878
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Replace other critical rte_zmalloc() sites that actually depend on the
memory being zeroed.
Change-Id: If6856ad44a4c50869811d3ce9411c993ce88018d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Allow pollers to be scheduled to be run periodically every N
microseconds instead of every iteration of the reactor loop.
Change-Id: Iaea3e98965d81044e6dc5ce5f406bcb7a455289e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Just getting a reference to a bdev should not claim it.
Change-Id: I21e07160662490ec95b52fa31ea1d2ae93a21f09
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Combine the necessary functionality with the main bdev file.
Change-Id: I96d796bc87ac2a8688cdf1fd3c16d2a7c8aef730
Signed-off-by: Ben Walker <benjamin.walker@intel.com>