Windows Virtio drivers use indirect descriptors without
negotiating their feature flag, which is explicitly
forbidden by the Virtio 1.0 spec. "(2.4.5.3.1 Driver
Requirements: Indirect Descriptors) The driver MUST NOT
set the VIRTQ_DESC_F_INDIRECT flag unless the
VIRTIO_F_INDIRECT_DESC feature was negotiated.".
Violating this rule doesn't cause any issues for SPDK
vhost, but triggers an assert, so we can only run Windows
VMs with non-debug SPDK builds.
This patch removes the assert and allows Windows VMs
to be run with debug versions of SPDK vhost.
Fixes#650
Change-Id: I95f534c33c384a4e1126a8c343c21eb63ec7bcef
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447803
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Introduce EMPTY, PRESENT and REMOVED states for Vhost SCSI targets. This
does not introduce any functional changes but opens a way to stop using
both removed flag and spdk_scsi_dev pointer as indicator if device is
removed or not.
Change-Id: Iecd76ffe9e8121cc1359b1e268eb21679d13598e
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447070
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
rte_vhost has rejected a patch with this feature, so
we implement it using the external rte_vhost msg handling
hooks directly in SPDK.
Change-Id: Ib072fc19b921fe0fa01c7f4892e60430232e3a1c
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447025
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Make the vdev initialization happen before calling
any vdev related functions. This is mostly needed
for an upcomming patch where additional step is
required after initializing the vdev and before
starting rte vhost.
On the other hand, this patch also fixes a technically
possible scenario where rte vhost starts processing
vhost-user messages and calling our ops before the
related vdev was initialized.
Change-Id: I8fbc7e7bc0b364327cfcec60faa74d4f64d6fad8
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
rte_vhost requires all queues to be fully initialized
in order to start I/O processing. This behavior is not
compliant with the vhost-user specification and doesn't
work with QEMU 2.12+, which will only initialize 1 I/O
queue for the SeaBIOS boot. Theoretically, we should
start polling each virtqueue individually after
receiving its SET_VRING_KICK message, but rte_vhost is
not designed to poll individual queues. So we use
a workaround to detect when a vhost session could be
potentially at that SeaBIOS stage and we mark it to
start polling as soon as its first virtqueue gets
initialized. This doesn't hurt any non-QEMU vhost slaves
and allows QEMU 2.12+ to boot correctly. SET_FEATURES
could be sent at any time, but QEMU will send it at
least once on SeaBIOS initialization - whenever
powered-up or rebooted.
Vhost sessions are still mostly started/stopped from
within rte_vhost callbacks, but now there's additional
concept of "forced" polling, in which SPDK starts
sessions manually, while rte_vhost still thinks the
sessions are stopped. This can potentially lead to cases
where a session is "started" twice, or gets destroyed
while it's still being polled (by force). Those cases
also need to be handled within this patch.
Change-Id: I70636d63e27914906ddece59cec34f1dd37ec5cd
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446086
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
VHOST_USER_SET_VRING_CALL invalidates the previous
file descriptor that SPDK may be using on a different
thread. The new descriptor is stored inside rte_vhost
internals and is queryable with rte_vhost APIs, but
those APIs are too expensive to be called every tick
or every time we need to use that fd. Hence, we will
now stop the entire vhost session before processing
SET_VRING_CALL msg and restart it right after. SPDK
will query the most recent call descriptor on session
start.
We do not necessarily have to stop the device - just
letting the session know that its callfd has changed
would be enough. That's an area for future optimization.
Change-Id: Idccf56fccd21ad0d3c2307eefee7bf35e350fec6
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447639
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
DPDK 19.05+ gives us an ability to pre or post-process
any single vhost-user message. The user can either perform
additional actions upon some generic events, or can
implement handling for brand new message types that
rte_vhost doesn't even know about.
In order to smoothly switch to the upstream rte_vhost
and drop our internal copy, we introduce an SPDK wrapper
function to register SPDK-specific message handlers. For
DPDK 19.05+ this will use the new rte_vhost API to
register those message handlers, and for older DPDKs
this function simply won't do anything - as w assume the
internal rte_copy already contains all the necessary
changes and does not need any "external" hooks.
For now we use the message handlers to stop the vhost
device and wait for any pending DMA ops before letting
rte_vhost to process the SET_MEM_TABLE message and unmap
the current shared memory.
Change-Id: Ic0fefa9174254627cb3fc0ed30ab1e54be4dd654
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446085
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It's disabled by default, so no functionality is changed yet.
The intention is to use the upstream rte_vhost from DPDK,
which - starting from DPDK 19.05 - is finally capable of
running with storage device backends.
SPDK still requires a lot of changes in order to support
that upstream version, but the most fundamental change is
dropping vhost-nvme support. It'll remain usable only with
the internal rte_vhost copy and with the upstream rte_vhost
it simply won't be compiled. This allows us at least to
compile with that upstream rte_vhost, where we can pursue
adding the full integration.
Change-Id: Ic8bc5497c4d77bfef77c57f3d5a1f8681ffb6d1f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446082
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This is already done for JSON info dump. In addition, the
spdk_vhost_scsi_dev_get_tgt function might implement additional logic to
no return SCSI targets under removal process.
Change-Id: I21d6f660926091dfd34da553705116926f27b30d
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446910
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Adapted our custom rte_vhost APIs to the upstream DPDK
version which has independently added similar APIs.
This will potentially allow us to remove our internal
rte_vhost copy.
rte_vhost_set_vhost_vring_last_idx() was renamed to
rte_vhost_set_vring_base() and the last vring indices
have to be acquired with a newly introduced rte_vhost_get_vring_base()
rather than rte_vhost_get_vhost_vring().
This is only a refactor, no functionality is changed.
Change-Id: I1ca2c1216635c117832c9d9c784d5661145c04cd
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/446081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Older versions of QEMU (<= 2.11) expose the VGA BIOS
hole (0xA0000-0xBFFFF) by specifying two separate memory
regions - one before and one after the hole. This results
in the "size" not being a 2MB multiple. But the underlying
memory is still mmaped at a 2MB multiple - so that's what
we should be checking to ensure the memory is hugepage backed.
Fixes#673.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1644bb6d8a8fb1fd51a548ae7a17da061c18c669
Reviewed-on: https://review.gerrithub.io/c/445764
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We assumed io_channel allocation always succeeds, but
that's not true. Doing I/O to any vhost session that
failed to allocate an io_channel would most likely
cause a crash.
We'll now detect io_channel allocation failure and
print a proper error message. The SCSI target for
which the channel allocation failed simply won't be
visible to the vhost master. All I/O to that target
will be rejected.
We should probably report the error to the upper
layer and either prevent the device from starting
or fail the SCSI target hotplug request. But for now
let's just prevent the crash.
Change-Id: I735dfb930d8905f70636a236b4fa94288d0aaf3a
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/444874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We explicitly checked for one of the strings in the
parsed RPC request even though it's required for the
entire request to parse successfully. The extra check
is now removed.
Change-Id: I19c446786e4ac88b88f14e18dc5258f31b1a87f1
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443317
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Since we no longer use external events and we access
all vhost devices synchronously, we no longer need
to dynamically allocate our RPC request contexts. They
can be put just on the stack.
Change-Id: Ie887607b67451aba4f3404c4b9551e6424335beb
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440380
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Removed their various usages inside the core vhost code
together with the external events themselves. External
events were completely replaced by spdk_vhost_lock()
and spdk_vhost_dev_find().
Change-Id: I1f9d0268c27a06e2eecab9e7d179b1fd54d4223d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440379
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Replaced them with inline code that performs exactly
the same but is shorter and easier to follow. External
events were replaced by spdk_vhost_lock() and
spdk_vhost_dev_find().
Change-Id: Id46a619c592c20a573664b54efc097489e9bb893
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440378
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Vhost external events no longer do any asynchronous
calls, they only lock the vhost mutex and directly
call the provided function. The mutex encapsulation
isn't worth the additional complexity of splitting
each vdev-handling code into multiple functions, so
we expose low-level APIs that should eventually
replace external events entirely.
Instead of:
```
static int do_something_cb(struct spdk_vhost_dev *vdev, void *arg)
{
struct my_data *ctx = arg;
/* access the vdev and ctx */
free(ctx);
}
struct my_data *ctx = calloc(...);
rc = spdk_vhost_call_external_event("my_vdev", do_something_cb, ctx);
if (rc != 0) { /* err handling */ }
```
We can now do just:
```
spdk_vhost_lock();
vdev = spdk_vhost_dev_find("my_vdev");
if (vdev == NULL) { /* err handling */ }
/* access the vdev any context data */
spdk_vhost_unlock();
```
Change-Id: I06e1e149d6dd006720b021d3bef8d9b7bfaeceaa
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440377
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Existing specific vhost socket messages for vhost-nvme target
will get some information from backend target before start_session
call, so we should iterate the associated nvme controller by vid
but not session.
Fix issue #628.
Change-Id: Ia400bf33895a0feee0058a870f26b0ff72b7556f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442498
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We assumed the second descriptor in an I/O descriptor
chain will always point to a payload buffer, but in case
there is no payload, the second descriptor will point to
a response buffer. The vhost code doesn't provide proper
checks to handle such case, so to avoid various errors
down the stack, we just fail all requests with no
payload.
Change-Id: I6785c2843d6db4fc17e68e03562c2a1530bb469b
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437187
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dstepanov.src@gmail.com>
This ensures that SPDK will detect descriptor chains
that are too long.
The additional check in vhost block stands as an
optimization and makes us fail the corrupted I/O early.
Change-Id: Icceaa0dd938dca96a1872e5ee96bf6a151fdd9e7
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Signed-off-by: Dima Stepanov <dstepanov.src@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/433641
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
SPDK doesn't provide sufficient runtime checks to properly
handle clients with memory sizes that aren't 2MB multiples
and could potentially segfault during I/O processing.
That's why we'll reject such clients now.
Change-Id: I34e85be5b5c6df863371d0ad688f228ed44107ff
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/433640
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Those cases should never occur. Klocwork pointed out
possible dereference based on the returns later in
the functions.
Change-Id: I282a56f3f415f85c38e9c451cbb10bc80fc6176b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441546
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Although Vhost SCSI code is technically capable
of polling different sessions on different lcores,
the underlying SCSI API won't allow allocating
io_channels on more than one lcore.
That's why we will now let device backends assign
lcores by themselves.
The first Vhost SCSI session will now choose one
core from the available ones, and any subsequent
sessions will stick to the same one.
Change-Id: I616cd195a919960dff68508473cea236abf8d6a3
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441581
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
With all the patches in place, we can finally
enable having more than one simultaneous sessions
to a single vhost device.
This patch adds a unique id to the session structure,
similar to the one in a vhost device and also fills in
the implementation holes in foreach_session().
Vhost-NVMe can support only one session per device
and now has an additional check that prevents it from
starting more than one at a time.
Vhost-SCSI also has the same check now since it needs
additional work on the lcore assignment policy. The
check will be removed once the required work is done.
Change-Id: I13a32c7a0eae808e9bec63a7b8c15ec0bc2e36ed
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Particular backends will now be responsible for sending
events to vsession->lcore. This was previously done by
the generic vhost layer, but since some backends will
need different lcore assignment policies soon, we need
to give them more power now.
Change-Id: I72cbbccb9d5a5b2358acca6d4b6bb882131937af
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441580
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
It's sessions that are tied with the lcores now.
This makes the vhost devices accessible by any
thread that only locks the global vhost mutex.
The mechanism used for external device events was
refactored to serve for foreach_session() API.
Additionally, since we don't want to handle cases
where the entire vhost device gets removed while
an asynchronous foreach_session chain is pending,
a new per-vdev counter of pending async operations
was added. We'll fail the device removal request
if there are any pending operations. Eventually
we would like the device removal to be asynchronous,
but that's a todo for later.
The external events are still there, although
they only lock the mutex and call the provided
function now.
Change-Id: I20618f9420a9bc04270373469deaad8fb2049c7c
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439323
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Each Vhost SCSI session will now keep its local
SCSI targets state that can be accessed without the
global vhost mutex.
Hotplug will still add the SCSI target reference to
the device struct, but will also asynchronously tell
each active session to add it's session-local copy.
Hotremove, on the other hand, will now try to
asynchronously remove a SCSI target from all sessions
first and only afterwards it will remove it from the
device struct.
This allows us to safely hotplug and hotremove SCSI
targets into Vhost SCSI devices that have multiple
active sessions.
Each session will still use its management poller to
try to locally remove those targets that were scheduled
for hotremoval and the additional
spdk_vhost_dev_foreach_session() will now also try to
remove each one of them from the entire Vhost SCSI
device after making sure they were already removed from
all sessions.
Change-Id: Idd080b618768c71cd1cd564efeaf930bf79fb578
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439321
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Before we implement the support for multiple sessions
per device, we still need to make a few intermediate
changes that will require a counter of currently polled
sessions. So here it is.
Change-Id: I0a1d928eafa75efa1b5c2e6670a5ceb282c87fa4
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441734
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The backend struct will get some new dependencies
soon, so move its definition further in a header
file.
Change-Id: I39c25027312777c7e570b12511dc9c5e9b9023d4
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439322
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Returning negative value from a `foreach` callback
will now break the entire chain. This is required
for refactoring spdk_vhost_dev_foreach_session() to
use the same mechanism as external events. Before
we actually do the refactor, we add the only feature
that external events were missing.
Change-Id: I70bda3df99748de51429e329a056c37a3bc7e348
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Partially decoded request need to be free even if
spdk_json_decode_object() fails.
Change-Id: Icd00f835537dbaf197cc4f05930be8c543a534a6
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439716
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
For the backend block device which can support flush command,
vhost-blk should report such feature to Guest, and leave such
decision to Guest.
Change-Id: I6cd6fd94ed80256ffe268bc1bf2c1dd57a164825
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439605
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Both the scsi device and its state will need an
additional per-session copy, so we put those two
in a single struct now.
This also serves as a cleanup.
Change-Id: I01bc11db4070e9f258de15e6c44b6f4fcd1d51f2
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439320
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Prepared APIs to operate on a session parameter rather
than a device parameter. Some of those functions are
now ready to support more than one session per device.
Change-Id: Ie4a49cd1d1f8ee1b826952bc87e66aa0cabdd925
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439319
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Vhost SCSI session specific fields were moved from the
device struct into a new session struct. This is the
same change as Vhost Block had.
Most of the functions inside vhost scsi still accept
vhost device as a parameter and then get the session
object internally. Those functions will be refactored
to accept session object directly in a separate patch
since the amount of changes required is too big to be
done here.
Change-Id: I8b87ba1187413d471b463aa7067821928ac0303e
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439318
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Following the same change with SCSI target hotremove,
we'll now allow hotremoving an underlying bdev without
VIRTIO_SCSI_F_HOTPLUG negotiated. The hotremoval will
be still reported through SCSI sense codes.
Change-Id: I5b3dd09bd72456eda745f4225f76603fad999da6
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439317
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Failing the hotremove request will become more tricky once
we allow creating multiple sessions per device, so we try
to get rid of any unnecessary error checks.
VIRTIO_SCSI_F_HOTPLUG tells us just if the host is capable
of receiving hotplug events, but the scsi target can be
hotremoved even without them. The hotremoval will be still
reported through SCSI sense codes. All I/O to a hotremoved
target will be failed with sense key ILLEGAL REQUEST,
asc 0x25, ascq 0x00 (LOGICAL UNIT NOT SUPPORTED).
Change-Id: I2be4e0167eb06804112758a5825cd91745128408
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439316
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This lets us remove the assumption of having only
a single session per device and brings us closer
towards supporting more.
Change-Id: Ibbb7b1ed789ff0690e62c00fb5ed39ce64245028
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438680
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Different Vhost Block sessions could be technically
polled on different threads, so we move the io_channel
from the device struct into the session struct.
Change-Id: I004cad8b6dc6d198844fca3bb11724e3f176dc9d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439315
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Prepared APIs to operate on a session parameter rather
than a device parameter. Some of those functions are
now ready to support more than one session per device.
Change-Id: Id55e70ae521039f5acc47e80ab8b0aa043679d95
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438679
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
With all the core vhost changes in place, we can refactor
the upper layers now. We start with the vhost block since
it's the easiest one.
Vhost Block session specific fields were moved from the
device struct into a new session struct. What's tricky,
is that the blk-specific struct directly contains the
generic session struct. This gives us handy access to
the generic session data from the blk-specific session,
and also allows us to directly upcast generic session
objects.
Most of the functions inside vhost block still accept
vhost device as a parameter and then get the session
object internally. Those functions will be refactored
to accept session object directly in a separate patch
since the amount of changes required is too big to be
done here.
Because of the above, some parts of this patch might
seem overcomplicated. Especially the to_blk_session()
funtion, which checks the device backend inside. The
ultimate goal is to receive session object through
start/stop callbacks - and for that the backend check
does make sense.
Change-Id: If222c31aec16a8cbe2d0cfb98c828e1ac75b91fc
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438678
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
kill a spdk process with aio bdev storage many times will get error messages:
bdev_aio.c: 244:bdev_aio_group_poll: *ERROR*: epoll_wait error(9): Bad file descriptor on ch=0x152a7f0
vhost.c:1010:_spdk_vhost_event_send: *ERROR*: Timeout waiting for event: stop device.
When spdk process exits, the connfd is closed by rte_vhost_driver_unregister, then
other function such as epoll_create1 in aio bdev may allocate the same fd,
but this fd is closed by vhost_user_read_cb again, so epoll_wait return -1
Signed-off-by: Honghui Wang <wanghonghui@ucloud.cn>
Change-Id: Ic3fd938892f004c18fb38d4594c006c40a01efaa
Reviewed-on: https://review.gerrithub.io/c/439851
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: Changpeng Liu <changpeng.liu@intel.com>
Sessions are allocated internally by the core vhost
library whenever DPDK accepts a new connection, so
the only reasonable way to store additional per-sesion
data is to tell the core vhost library how much extra
memory it needs to allocate. Hence, we add a new field
to the vhost device backend struct.
Change-Id: Id6c8285505b2e610e28e5d985aceb271ed232555
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437778
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Instead of calculating those settings once and storing
them in the device struct, we'll now recalculate them
whenever a device session is created. This lets us
remove 2 fields from the device struct.
Change-Id: I2cb2bdbc570a41ae78c0666490fb1462a00d0b6f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
dev->mem_table_fds init in vhost_user_set_mem_table
but dev->mem may init later in vhost_user_set_vring_addr,
so if qemu crash or lost conntion after vhost_user_set_mem_table
and before vhost_user_set_vring_addr,
it's hugepage memory is not being freed
Signed-off-by: Honghui Wang <wanghonghui@ucloud.cn>
Change-Id: I782c106078829ff6691ed3265a5d1718493de90c
Reviewed-on: https://review.gerrithub.io/c/440254
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Makes the code slightly more readable.
Change-Id: Iebf8fb07bceacf433d4bdad0a30419a3faab7eee
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439370
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We use those values in various places in SPDK,
so let's define them in a single place now.
Change-Id: Iad9a5745d69166a6e6032370d4e5a0e604914e45
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439369
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Keep all coalescing variables inside the session struct.
Interrupt coalescing is still configured with the device-
pecific APIs, but those will now transparently propagate
the change to all active connections.
This is the last piece that held struct spdk_vhost_dev
tied with the session's lcore. Now that device
settings aren't actively polled by any sessions, they
only need to be synchronized with the global vhost lock.
This will potentially let us get rid of the vhost external
events API, allowing user to lock the mutex directly,
set coalescing params directly, and transparently let
the internal spdk_vhost_dev_foreach_session() do the
tricky synchronization.
Change-Id: Ifba96d241c736d33376861fa894c738e7d9b5b40
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437777
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
When device is changed, e.g. the underlying bdev is
hotremoved, all sessions need to be notified. For
instance, Vhost-SCSI would send an additional hotremove
eventq message. That's why we introduce a helper
function to iterate through all active sessions.
Eventually, we may want to poll different sessions
from different lcores, so there will be some kind of
internal cross-lcore message management required
- just like there is one for spdk_vhost_call_external_event_foreach().
For now, though, we can get away with a dumbest
implementation.
We still want to keep this API internal for the time
being. The end-user (RPC) should only modify the
device, and the whole concept of sessions should be
completely encapsulated.
Change-Id: I2e142632c07a23daeac15cabea4cffecf984e455
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/418736
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Session struct will be now allocated inside the
`new_connection` rte_vhost callback. There can be
still only one connection per device, but this
change brings us one step towards supporting more.
Besides the obvious pointer changes, we'll now also
use the session pointer to check if the connection
actually exists. We used to set device vid to -1
when there was no connection but we no longer have
to do that.
Change-Id: I4d062c0b5f093fef132a6a2c9cc29458cbaad414
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Discard and Write Zeroes commands was supported with
commit 1f23816b in Linux virtio-blk driver. While
here, also add the support in SPDK vhost target.
Change-Id: I425be2a4961eac04e27ff71151d40c8d799cd37d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/431723
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This follows the same trend as the mem_map APIs.
Currently, most of the spdk_vtophys() callers manually
detect physically noncontiguous buffers to split them
into multiple physically contiguous chunks. This patch
is a first step towards encapsulating most of that logic
in a single place - in spdk_vtophys() itself.
This patch doesn't change any functionality on its own,
it only extends the API.
Change-Id: I16faa9dea270c370f2a814cd399f59055b5ccc3d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438449
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Each connection is created with the `new_connection`
rte_vhost callback with a unique vid parameter. Storing
the vid inside the device struct was sufficient until
we wanted to have multiple connections per device.
Change-Id: Ic730d3377e1410499bdc163ce961863c530b880d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/437775
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Grouped a few spdk_vhost_dev struct fields into a new
struct spdk_vhost_session. A session will represent the
connection between SPDK vhost device (vhost-user slave)
and QEMU (vhost-user master).
This essentially serves two purposes. The first is to
allow multiple simultaneous connections to a single
vhost device. Each connection (session) will have access
to the same storage, but will use separate virtqueues,
separate features and possibly different memory. For
Vhost-SCSI, this could be used together with the upcoming
SCSI reservations feature.
The other purpose is to untie devices from lcores and tie
sessions instead. This will potentially allow us to modify
the device struct from any thread, meaning we'll be able
to get rid of the external events API and simplify a lot
of the code that manages vhost - vhost RPC for instance.
Device backends themselves would be responsible for
propagating all device events to each session, but it could
be completely transparent to the upper layers.
Change-Id: I39984cc0a3ae2e76e0817d48fdaa5f43d3339607
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/437774
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
By subsequent patches for iSCSI, spdk_scsi_dev_queue_mgmt_task()
will not be called directly from the function that knows TMF code,
and currently setting TMF code to SCSI task is done in
spdk_scsi_dev_queue_mgmt_task().
Hence after subsequent patches for iSCSI, to hand off TMF code to
SCSI task, any dynamic context will be required.
To avoid the dynamic context, extract setting TMF code from
spdk_scsi_dev_queue_mgmt_task() and put appropriate place for
each call of spdk_scsi_dev_queue_mgmt_task().
Additionally, in spdk_abort_transfer_task_in_task_mgmt_resp(),
ref_task_tag is got from PDU but getting it from SCSI task is
much easier. Hence get ref_task_tag from SCSI task in the callback.
Change-Id: I7add9290598d2df7cfcf1506ec75d74c70c0f236
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/436643
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
VHOST_USER_NVME_IO_CMD is designed to deliver NVMe IO command
header to slave target via socket, this can be used in BIOS
which will not enable Shadow Doorbell Buffer feature, since
we enabled the shadow BAR feature to support some old Guest
kernel without Shadow Doorbell Buffer feature, so the message
isn't required, just remove it.
Change-Id: I72e55f11176af2405c8cc09da404a9f4e5e71526
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/420821
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
For some old Linux Guest kernels, the new NVMe 1.3 feature: shadow
doorbell buffer is not enabled, while here, make a dummy BAR region
inside slave target, when Guest submits a new request, the doorbell
value will be write to the shared memory between Guest and vhost
target, so that the existing vhost target can support both new
Linux Guest kernel(newer than 4.12) and old Guest kernel.
Also, the shared BAR space can be used in future which we can move
ADMIN queue processing into SPDK vhost target, with this feature,
the QEMU driver will become very small and easy for upstreaming.
Change-Id: I9463e9f13421368f43bfe4076facddd119f4552e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/419157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
We always need to ensure bvdev->bdev is not NULL before
accessing it. GET_CONFIG handler was standing out in this
matter, causing segfaults when connecting to a vhost block
device that went through a bdev hotremove.
If the bdev has been hotremoved we'll now report
blocksize == 0 and blockcount == 0. From what I checked,
QEMU will still expose a virtio-blk device to the VM, but
linux (because that's what I checked) won't create a block
device for it.
Note: The behavior above can be tricky to investigate as
neither QEMU nor VM doesn't log any messages about it.
We should eventually reject any new connections to a vhost-blk
socket with hotremoved bdev, but that's a task for later.
Fixes#460
Change-Id: I266257f4b35d07f35ba03adf46cb18425f3cf39a
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/417934
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Gcc reports error "vhost_user.c:665: error: ‘MADV_DONTDUMP’ undeclared
(first use in this function)" when build in CentOS-6, include
<asm/mman.h> fixes it.
Change-Id: I1b19c0cb6424a8c5ad1e1dd7d1c724edeb06e171
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/428912
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
User can use this optional parameter to get the information of specific
vhost controller.
Change-Id: I3911c6c7d4e7b75e82277d1e4690d5e40019aa06
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/425451
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
according to commit:
bdev: add spdk_bdev_queue_io_wait
This patch will make io_wait to support vhost_nvme
Change-Id: I20f7c9f2005a0d4140db9ff2d522fb5e15356cd5
Signed-off-by: Ni Xun <nixun@baidu.com>
Signed-off-by: Li Lin <lilin24@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Reviewed-on: https://review.gerrithub.io/425884
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
So we don't need to do extra memory allocation when stop vhost device.
This makes code more clean.
Change-Id: I27a1b446621ce4f452fee62acd634737b4ffe174
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/427336
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
according to commit:
bdev: add spdk_bdev_queue_io_wait
This patch will make io_wait to support vhost_blk
Change-Id: I529001fc74427adda63d0d41901a98229364175d
Signed-off-by: Ni Xun <nixun@baidu.com>
Signed-off-by: Li Lin <lilin24@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Reviewed-on: https://review.gerrithub.io/425479
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Sun Zhenyuan <sunzhenyuan@baidu.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_rpc_get_vhost_controllers() forgot to free memory of w when ctx
allocation failed. Defering the invoke of spdk_jsonrpc_begin_result(),
so we can handle the failure conveniently.
Change-Id: I223f5b9a918b068922d5a68c8d688331ebf90e42
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/425358
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is in response to a Scan-build issue reported on Clang 6.
Change-Id: I2edc853145762998db818cbbe0e9ca0d9b8c123d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/424139
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
On hot remove, before closing bdev, put its IO channel.
If we don't do this bdev won't be destroyed until controller
is removed.
This fixes GitHub issue #401.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I92600c2f7136f56afd6593f663f2d2cf468a37dc
Reviewed-on: https://review.gerrithub.io/422258
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I7716f1748803872cac85e296d60747752ca046f4
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/422273
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
QEMU will send SET_VRING_ADDR when perform live migration,
it's not correct to update the memory table while the device
is running.
Change-Id: I899d3a996355ab6aa69835d90da14a86f93240fa
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/420944
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We need these errno to make outputs more clear and make
spdk_vhost_blk_construct() conforms to it's declaration.
Change-Id: I1936e7393494a0344d97c8a920f3d719dee27002
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/421859
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
irq_delay must be not less than zero.
Change-Id: I22d8a7df453f07a44a32582d8e880949824bf868
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/421685
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This fixes queue handling for QEMU 2.12 which is sending new memory
table but resetting only some queues
Fixes#339
Change-Id: Ic971725261720d7459e49a4f14bc15c2f2a77b1a
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/420372
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Chen Wang <chenx.wang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This also save CWD on init so socket path is no longer relative.
Change-Id: I303401fe0340f0bc2ea5e4ba468361f68ea84c3d
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/419067
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Pawel Kaminski <pawelx.kaminski@intel.com>
Reviewed-by: Paweł Niedźwiecki <pawelx.niedzwiecki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Print an error message instead. The driver can still do
a manual rescan if it wants to see the new SCSI target.
Change-Id: Ieb76ada8625bf00ad068a791b860e4b08ad5cb83
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/417268
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6babd4cf990bf19b510db88bdfb0ca81e29d9252
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/414700
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Madhu Pai <mpai@netapp.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
As of NVMe 1.3b, there is only one command set. But pipe
this through the driver per-spec anyway.
Change-Id: I4faf8596f5ce638e5e2a500b424e00ceb6e89edc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/412102
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
"Dev" was deprecated since v18.01, now we have released v18.04,
so remove the "Dev" support from existing code.
Change-Id: I54c4cf83f78d3b0fdb13e625936c889d7bfaeba9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/409989
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Fix the memory leaks when construct the vhost nvme dev
using the error number of io queues.
Change-Id: Ie63e048b355d8a3d602e0da415dca279c0515a8b
Signed-off-by: Chen Wang <chenx.wang@intel.com>
Reviewed-on: https://review.gerrithub.io/410547
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This clarifies the bdev_io completion callbacks, since they can now
assume that they always have a valid bdev_io. All call sites that
originally passed NULL for bdev_io now set an appropriate status code in
the task and complete it with the new spdk_vhost_nvme_task_complete()
function.
Change-Id: Id74aafb28e83e135bbb0a410ff9766dc1b9ece50
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/410080
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Currently Get/Set features vhost messages use 4096 data buffer, but
it does need this buffer for real usage scenario.
Change-Id: If84f795209d771670449283cef3143f3019baee0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/409613
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously after IO is finished, we will put completion entry and
irq event notice at the same IO completion callback, for performance
consideration, move the irq event routine into IO poller context, this
can be used to implement interrupt coalescing feature in future.
Change-Id: Ic20b50af47b73ffcb91938802e18b316c07a4d11
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/408943
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For namespace we need to use array of objects. Remove useless check for
bdev != NULL as this is already guarded by ns_active.
As we are here also switch to new JSON named API.
Change-Id: If8ab15c63ea015731829276e06e5cb5a801620d2
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/409025
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Add state_mask to each RPC method and state to RPC server, respectively.
State mask of RPC method is set at registration. State of RPC server
is changed according to the state of the SPDK.
When any RPC method is recieved, if the bit of the RPC server is on in
the state mask of the RPC method, it is allowed. Otherwise, it is
rejected.
When any RPC is rejected by state_mask control, the new error code
is returned to describe the error clearly.
Change-Id: I84e52b8725a286e9329d61c56f498aa2c8664ec1
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/407397
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
__rte_always_inline was added in DPDK 17.05; replace the single use with
a regular 'inline' to restore compatibility with older DPDK versions.
Change-Id: Ia8a0f729cc4c39a9aaab0700f3c827a9766d1dd0
Fixes: e30595fbe3 ("rte_vhost: introduce safe API for GPA translation")
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/409077
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The existing code was using sizeof() on a calculation that already
returned the right size, so this would originally just memset() 4 bytes
instead of the whole region.
The spec requires the doorbell buffer to be exactly one physical memory
page, and our controller emulation chooses MPS so that only 4096-byte
pages are supported, so just zero out the page and drop the calculation
entirely.
Change-Id: I71db1bebf0a4d5dbe55fd411786e19a8d6802c20
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/408730
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If the user requests an invalid QID or CQID in Create I/O Completion
Queue or Create I/O Submission Queue, we need to return an error of
Invalid Queue Identifier as indicated by the spec.
Change-Id: I7467aa04da9e374bb596731f4e4174967d44cffb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/408767
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Replace the hard-coded 64 and 16 constants, which are the size of the
submission queue entry and completion queue entry, with equivalent
sizeof expressions.
Change-Id: I5a9d8fc1ab98276312445f0699aae3d86beee705
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/408762
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>