GCC reports the error "vhost_user.c:665: error: ‘MADV_DONTDUMP’ undeclared
(first use in this function)" when building on CentOS 6. Including
<asm/mman.h> fixes it.
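A minimal sketch of the kind of call that needs the constant, assuming a
Linux build host. The helper name exclude_from_core_dump is hypothetical,
and the #ifndef guard is an illustrative variation, not what the patch
does (the patch simply includes <asm/mman.h>).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Older C libraries may not expose MADV_DONTDUMP through <sys/mman.h>;
 * in that case fall back to the kernel header named in the message. */
#ifndef MADV_DONTDUMP
#include <asm/mman.h>
#endif

/* Hypothetical helper: ask the kernel to leave a mapping out of core
 * dumps. Returns 0 on success, -1 on failure (errno set by madvise). */
static inline int
exclude_from_core_dump(void *addr, size_t len)
{
	assert(addr != NULL && len > 0);
	return madvise(addr, len, MADV_DONTDUMP);
}
```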
Change-Id: I1b19c0cb6424a8c5ad1e1dd7d1c724edeb06e171
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/428912
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
QEMU sends SET_VRING_ADDR when performing live migration;
it's not correct to update the memory table while the device
is running.
Change-Id: I899d3a996355ab6aa69835d90da14a86f93240fa
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/420944
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This fixes queue handling for QEMU 2.12, which sends a new memory
table but resets only some queues.
Fixes #339
Change-Id: Ic971725261720d7459e49a4f14bc15c2f2a77b1a
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/420372
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Chen Wang <chenx.wang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently Get/Set features vhost messages use a 4096-byte data buffer,
but they do not need this buffer in real usage scenarios.
Change-Id: If84f795209d771670449283cef3143f3019baee0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/409613
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
__rte_always_inline was added in DPDK 17.05; replace the single use with
a regular 'inline' to restore compatibility with older DPDK versions.
Change-Id: Ia8a0f729cc4c39a9aaab0700f3c827a9766d1dd0
Fixes: e30595fbe3 ("rte_vhost: introduce safe API for GPA translation")
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/409077
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Replace the hard-coded 64 and 16 constants, which are the size of the
submission queue entry and completion queue entry, with equivalent
sizeof expressions.
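A minimal sketch of the idea, with stand-in entry types: sq_entry and
cq_entry are hypothetical names, and only their sizes (64 and 16 bytes,
as stated above) are taken from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in entry layouts; only their sizes matter for this sketch. */
struct sq_entry { uint8_t raw[64]; };
struct cq_entry { uint8_t raw[16]; };

/* Compile-time checks keep the sizes tied to the values in the message. */
_Static_assert(sizeof(struct sq_entry) == 64, "SQ entry must be 64 bytes");
_Static_assert(sizeof(struct cq_entry) == 16, "CQ entry must be 16 bytes");

/* Queue sizes derived from the types instead of magic numbers. */
static inline size_t
sq_bytes(uint32_t num_entries)
{
	return (size_t)num_entries * sizeof(struct sq_entry);
}

static inline size_t
cq_bytes(uint32_t num_entries)
{
	return (size_t)num_entries * sizeof(struct cq_entry);
}
```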
Change-Id: I5a9d8fc1ab98276312445f0699aae3d86beee705
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/408762
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The QEMU vhost-nvme driver has not yet been pushed to the QEMU
community, so QEMU can claim any vhost-user socket message opcode at
any time; QEMU 2.12 already took 27-30 for another driver. To reduce
future rebase work, reserve a larger value here so that it will not
conflict with QEMU for a long time.
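A sketch of what reserving a private opcode range can look like. Every
name and value here is an illustrative assumption; the message does not
say which value was actually chosen.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative assumptions only, not the values used by the patch. */
enum {
	UPSTREAM_OPCODE_LAST = 30, /* QEMU 2.12 already uses 27-30 */
	NVME_OPCODE_BASE = 80,     /* assumed base, well above upstream */
	NVME_OPCODE_COUNT = 4      /* assumed number of private messages */
};

/* Fail the build if the private range ever collides with upstream. */
_Static_assert(NVME_OPCODE_BASE > UPSTREAM_OPCODE_LAST,
	       "private opcodes must not overlap upstream ones");

/* True if the opcode falls inside the privately reserved range. */
static inline bool
is_private_nvme_opcode(int opcode)
{
	return opcode >= NVME_OPCODE_BASE &&
	       opcode < NVME_OPCODE_BASE + NVME_OPCODE_COUNT;
}
```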
Change-Id: Ic404bb14330c4acc484aa9c86983030803a31e77
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/408771
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
QEMU added a protocol feature bit to indicate that the slave
target supports GET/SET config messages. Add it to the
SPDK vhost target so that it can work with
QEMU 2.12.
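A minimal sketch of advertising and checking such a bit. The macro name
and the bit number 9 are assumptions that should be verified against the
vhost-user protocol specification; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit number of the config-space protocol feature. */
#define PROTOCOL_F_CONFIG 9

/* Add the config feature to the set of protocol features we offer. */
static inline uint64_t
advertise_config_support(uint64_t protocol_features)
{
	return protocol_features | (1ULL << PROTOCOL_F_CONFIG);
}

/* Check whether the negotiated set allows GET/SET config messages. */
static inline bool
config_msgs_allowed(uint64_t negotiated)
{
	return (negotiated & (1ULL << PROTOCOL_F_CONFIG)) != 0;
}
```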
Change-Id: I41a813ef23fba4d3fdf7bb3e3617a9feb4209509
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/408416
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch ensures that the entire address range is mapped
when translating addresses from the master's addresses
(e.g. QEMU host addresses) to process VAs.
Change-Id: If141670951064a8d2b4b7343bf4cc9ca93fe2e6d
Reported-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/408721
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This new rte_vhost_va_from_guest_pa API takes an extra len parameter,
used to specify the size of the range to be mapped.
The effective mapped range is returned via the len parameter.
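A simplified, standalone sketch of the in/out len behaviour described
above. The struct layout and the function name va_from_guest_pa are
stand-ins for illustration, not the real rte_vhost definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for one guest memory region. */
struct mem_region {
	uint64_t guest_phys_addr;
	uint64_t host_user_addr;
	uint64_t size;
};

/* Translate gpa to a process VA. On entry *len is the requested size;
 * on return it is the contiguous size actually mapped (0 if none). */
static inline uint64_t
va_from_guest_pa(const struct mem_region *regions, size_t nregions,
		 uint64_t gpa, uint64_t *len)
{
	assert(regions != NULL && len != NULL);
	for (size_t i = 0; i < nregions; i++) {
		const struct mem_region *r = &regions[i];

		if (gpa >= r->guest_phys_addr &&
		    gpa - r->guest_phys_addr < r->size) {
			uint64_t off = gpa - r->guest_phys_addr;
			uint64_t avail = r->size - off;

			if (*len > avail)
				*len = avail; /* clip to the region end */
			return r->host_user_addr + off;
		}
	}
	*len = 0;
	return 0;
}
```

A caller that needs the whole range can compare the returned len with the
requested one and treat a shorter value as a failed translation.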
Change-Id: Ib3830e1da9e0cb477d99860a03684c665bb3f6ec
Reported-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/408720
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib77a219e02a5cde69293eb1f7002507cf4930ae3
Fixes: 90c0e24410 ("vhost_user_nvme: add vhost user nvme target to SPDK")
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/407193
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Similar to the existing vhost scsi/blk targets, this commit introduces
a new target: the vhost nvme I/O slave target. QEMU presents an
emulated NVMe controller to the VM, and the SPDK I/O slave target
processes the I/Os sent from the guest VM.
Users can follow the example configuration file to evaluate this
feature; refer to etc/spdk/vhost.conf.in [VhostNvme].
Change-Id: Ia2a8a3f719573f3268177234812bd28ed0082d5c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/384213
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch adds support for live migration for vhost-scsi and vhost-blk
backends.
Change-Id: Ibfc8a713dbba14ba8cb38377a71e28fd340b1487
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/394203
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
QEMU always sets the offset to 0, but for sanity we should take the
offset into account.
Change-Id: I36213cd8fbeb85862b6de59c60bd6bcee7f9d1b2
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/395740
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Some messages, like SET_FEATURES and SET_VRING_ADDR, change the internal
state of a VQ or device. To prevent races with the thread polling those
queues, stop the device.
Change-Id: I15caf9da0decbaa660e9773c93d45ff148e5e9a8
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/395739
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The newly added vhost-user messages GET_CONFIG/SET_CONFIG are
used to get/set a virtio device's configuration space. This
commit enables these messages.
Change-Id: I5c3e3f8fb6ed55e99299323c39658765b1724bb8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/386545
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
DPDK 17.11 removed all rte_config.h includes
from rte_*.h headers, meaning we should either
use gcc param -include rte_config.h (just
like DPDK does), or include this file before
each other rte_*.h include. Since we're using
the latter approach in many places already,
I decided to follow it.
While here, also removed rte_vdev.h dependency
from rte_virtio/virtio_user.c. It's not used
anyway.
Change-Id: I865ee9f828211c03a60fd0446f7a418d5dddd140
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/387653
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There are two separate abstraction layers:
* vsocket - which represents a unix domain socket
* virtio_net - which represents a vsocket connection
There can be many connections
on the same socket. vsocket
provides an API to enable/disable
particular virtio features on
the fly, but it's the virtio_net
that uses these features.
virtio_net used to rely on
vsocket->features during
feature negotiation, breaking
the layer encapsulation (and
also causing a deadlock - two
locks were being taken in
different orders). Now each
virtio_net device has its own
copy of vsocket features, created
at the time of virtio_net creation.
vsocket->features have to be
still present, as features can be
enabled/disabled while no
virtio_net device has been
created yet.
Fixes #214
Change-Id: Ic4b2dd8cae6050813fc9a420b2ed30bc5ae60393
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/386294
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The vhost connection can be closed
concurrently from 2 places:
* the connection thread itself
* rte_vhost_driver_unregister
The connection thread will terminate
the connection if any recv error
occurs. The unregister function
will terminate the connection
together with the thread.
However, there is no synchronization
between those two. The connection
thread runs in the background
without any mutex.
The rte_vhost_driver_unregister
now signals the connection thread
to terminate itself and waits
until it's killed.
Change-Id: I012e97ebb8a79edcb2c17c28b2fc7e8041bf92b3
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/383085
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The rte_malloc_socket() call just above that allocates vq is only
allocating sizeof(*vq), but the memcpy() would have tried to copy
sizeof(*vq) * 2.
This code is under #ifdef RTE_LIBRTE_VHOST_NUMA, so it was not normally
enabled with DPDK 17.05, but it breaks when DPDK 17.08 turns on libnuma
support by default.
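A minimal sketch of the corrected copy. The struct is a stand-in and
malloc() replaces rte_malloc_socket() so the example stays
self-contained; the function name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real virtqueue structure. */
struct vq_sketch {
	int size;
	int last_idx;
};

/* Move a queue to freshly allocated memory. The copy length must match
 * the allocation: sizeof(*vq), not sizeof(*vq) * 2. */
static inline struct vq_sketch *
vq_realloc_sketch(const struct vq_sketch *old_vq)
{
	struct vq_sketch *vq;

	assert(old_vq != NULL);
	vq = malloc(sizeof(*vq)); /* stand-in for rte_malloc_socket() */
	if (vq == NULL)
		return NULL;
	memcpy(vq, old_vq, sizeof(*vq));
	return vq;
}
```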
Change-Id: I75c0c8666a9147346038d313fb419350988d8187
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/377596
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Some rte_vhost files use #ifdef RTE_LIBRTE_VHOST_NUMA, but they don't
explicitly include rte_config.h, which defines this macro. Instruct
the compiler to pre-include rte_config.h in the same way DPDK's build
system does.
Change-Id: Iddde76b8c3d0956ccd5f481956cede650d858586
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/377595
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Added new callbacks to notify about socket connection status.
As destroy_device is used for virtqueue processing *pause* as well as
connection close, the user cannot distinguish between the two.
Consider the following scenario:
rte_vhost: received SET_VRING_BASE message,
calling destroy_device() as usual
user: end-user asks to remove the device (together with socket file),
OK, device is not *in use* - that's NOT the behavior we want
calling rte_vhost_driver_unregister() etc.
Instead of changing new_device/destroy_device callbacks and breaking
the ABI, a set of new functions new_connection/destroy_connection
has been added.
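A sketch of how a user might consume the two callback pairs. The
callback table is a hypothetical simplification using the four names
above, and the "in use while connected" rule is an interpretation of the
scenario, not something the patch itself enforces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callback table built from the four names above. */
struct device_ops_sketch {
	int (*new_device)(int vid);
	void (*destroy_device)(int vid);
	int (*new_connection)(int vid);
	void (*destroy_connection)(int vid);
};

/* User-side view of one device (a single device keeps this short). */
static struct {
	bool connected;  /* socket connection is open */
	bool processing; /* virtqueues are being processed */
} g_dev;

static int on_new_device(int vid) { (void)vid; g_dev.processing = true; return 0; }
static void on_destroy_device(int vid) { (void)vid; g_dev.processing = false; }
static int on_new_connection(int vid) { (void)vid; g_dev.connected = true; return 0; }
static void on_destroy_connection(int vid) { (void)vid; g_dev.connected = false; }

static const struct device_ops_sketch g_ops = {
	.new_device = on_new_device,
	.destroy_device = on_destroy_device,
	.new_connection = on_new_connection,
	.destroy_connection = on_destroy_connection,
};

/* A paused device still counts as in use while the connection is open. */
static inline bool
device_in_use(void)
{
	return g_dev.connected;
}
```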
Change-Id: I50a8ca4035045892d6c658da7df58c0c97025ec3
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/372074
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Two locks are taken in two places in opposite orders.
Consider the following scenario, threads A and B:
(A)
* fdset_event_dispatch() start
* pfdentry->busy = 1; (lock #1)
* vhost_user_read_cb() start
* vhost_destroy_device() start
(B)
* rte_vhost_driver_unregister() start
* pthread_mutex_lock(&vsocket->conn_mutex); (lock #2)
* fdset_del()
* endless loop, waiting for pfdentry->busy == 0 (lock #1)
(A)
* vhost_destroy_device() end
* pthread_mutex_lock(&vsocket->conn_mutex); (lock #2)
(mutex already locked - deadlock at this point)
Thread B has locked vsocket->conn_mutex and is in a while(1)
loop waiting for the given fd to change its busy flag to 0.
Thread A would have to finish vhost_user_read_cb() in order
to set the busy flag back to 0, but that can't happen due to
the vsocket->conn_mutex lock.
This patch defers the fdset_del(), so that it's called outside of
vsocket->conn_mutex.
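A minimal sketch of the deferral pattern, assuming a fixed-size list of
connection fds. The struct, the stub standing in for fdset_del(), and the
function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_CONNS 8

/* Hypothetical connection list guarded by a mutex. */
struct conn_list {
	pthread_mutex_t mutex;
	int fds[MAX_CONNS];
	int count;
};

static int g_deleted; /* number of fds handed to the stub */

/* Stand-in for fdset_del(), which may wait for the dispatcher. */
static void
fdset_del_stub(int fd)
{
	(void)fd;
	g_deleted++;
}

/* Collect the fds under the lock, then delete them after unlocking,
 * so a waiting fdset_del() never holds the connection mutex. */
static void
remove_all_conns(struct conn_list *list)
{
	int pending[MAX_CONNS];
	int n, i;

	assert(list != NULL);
	pthread_mutex_lock(&list->mutex);
	assert(list->count >= 0 && list->count <= MAX_CONNS);
	n = list->count;
	for (i = 0; i < n; i++)
		pending[i] = list->fds[i];
	list->count = 0;
	pthread_mutex_unlock(&list->mutex);

	for (i = 0; i < n; i++)
		fdset_del_stub(pending[i]);
}
```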
Change-Id: Ifb5d4699bdafe96a573444c11ad4eae3adc359f5
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/375910
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This reverts commit 6973898164.
This solution was incomplete, see the next patch which properly
fixes the deadlock issue.
Change-Id: Ib3cc609814276f1c48b05280379b8c2849ad831f
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/375909
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Since vhost_user_set_features failure is not handled in any way, a
single error log has been added to at least let the user know that
something has gone wrong.
Change-Id: Ifcf27320af75ba74347b742643b23e43b7c01149
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/365807
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For each virtqueue's kickfd and callfd, there are two invalid
states: VIRTIO_UNINITIALIZED_EVENTFD and VIRTIO_INVALID_EVENTFD.
Don't set the virtqueue to ready status until a valid
descriptor has been received.
This is safe for poll-mode drivers in the guest OS: the backend
vhost process will not post notifications to the interrupt vector
for PMD mode in the guest, but the interrupt vector is still valid.
Change-Id: Icdf1e67f3c4e8da221843eb1383469ca1fba485c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/365327
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
After the changes in commit f325e71c, closing the old FDs is delayed until
VHOST_USER_SET_VRING_ADDR. If the VM is closed before this call, the original
FDs remain during vhost_backend_cleanup.
This patch closes the second set of FDs during vhost backend cleanup.
This resolves issue #162.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ieb9d123c987009ac451b6214bb74d2720d852781
Reviewed-on: https://review.gerrithub.io/361787
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Piotr Pelpliński <piotr.pelplinski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This fixes spontaneous vhost hangs on SIGINT shutdown.
Apparently during vhost_destroy_device(conn->vid) from
line #284 another QEMU message might arrive, causing a
vsocket->conn_mutex deadlock. (line #286)
Change-Id: I4f1c31a52facffd1eb1e1192591095f00da55031
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
LOG_DEBUG is a symbol defined by POSIX, so if sys/log.h
is included the symbols conflict.
We'll need to push this patch to upstream DPDK too.
Change-Id: Ib263731864aca4791226ea6e3abb5ddfe42e97d8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Since we keep a copy of the DPDK vhost library, the header file does
not depend on the DPDK vhost library.
Change-Id: I14d48e10227633547231e4f429e7375ffa76128d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Fix up all existing spacing errors in comments and add an automated
check for patterns like /*comment*/.
Change-Id: I28f61c93612dc0f8aed66bd509da78e91ea9737e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The first step is to not destroy an existing device in
vhost_user_set_mem_table(). This is because we may
still be processing I/O via INT13 while QEMU is setting
up the mem tables for OS boot.
The primary part of this patch though is to defer
using the new mem table until after we receive the
first SET_VRING_ADDR message. SET_VRING_ADDR will be
sent by QEMU when the guest OS virtio-scsi driver starts
initialization. At this point it is safe to invalidate
the old mem tables because there will be no more
INT13 I/O.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I45fb5910f45e7fd2cf4a325341ad105a57d8ea40
This patch adds a library, application and test scripts for extending
SPDK to present virtio-scsi controllers to QEMU-based VMs and
process I/O submitted to devices attached to those controllers.
This functionality is dependent on QEMU patches to enable
vhost-scsi in userspace - those patches are currently working their
way through the QEMU mailing list, but temporary patches to enable
this functionality in QEMU will be made available shortly through the
SPDK github repository.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Krzysztof Jakimiak <krzysztof.jakimiak@intel.com>
Signed-off-by: Michal Kosciowski <michal.kosciowski@intel.com>
Signed-off-by: Karol Latecki <karolx.latecki@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Signed-off-by: Krzysztof Jakimiak <krzysztof.jakimiak@intel.com>
Change-Id: I138e4021f0ac4b1cd9a6e4041783cdf06e6f0efb