VFIO in QEMU uses region 9 as the PCI passthrough devices' migration channel.
The format of the region 9 migration region is as follows:
------------------------------------------------------------------
|vfio_device_migration_info| data section |
------------------------------------------------------------------
QEMU will access vfio_device_migration_info to controll the migration
process.
For SPDK vfio-user target, we also implement the BAR9 via libvfio-user,
and we also define the NVMe device specific migration data stored in
data section of BAR9. QEMU doesn't care about the format in data section,
it will help us to gather the NVMe specific migration data in source VM and
then restore the migration date to data section of BAR9 in destination VM.
The core idea to implement live migration will following the device state
change which is controlled by QEMU. First QEMU will try to STOP the device
in the source VM, and set the destination VM to RESUME state, SPDK will save
NVMe devic state data structure to BAR9 in the source VM once the subsystem
is paused, then QEMU will read BAR9 in source VM and restore the content of
BAR9 in destination VM, finally in the destination VM, we will restore the
NVMe device state include BARs/PCI CFG/queue pairs in the destination VM.
Change-Id: I42e38f28c3ff59831be63290038b50d199d06658
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7617
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
libvfio-user will call quiesce callback when there are
memory region add/remove and device state change requests
from client, and in the quiesce callback, we will pause
the subsystem so that it's safe to do everything after
it, then after quiesce callback, we will resume the
subsystem. The quiesce callback is also used in
live migration, each device state change will quiesce
the device first.
Change-Id: I3a6a0320ad76c6b2d1d65c754b9f79cce5c9c683
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10620
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
According to the specification, we should also post an AER
error event for this error case.
Fix#2171.
Change-Id: Ifb2343453ea5e36ce244938a939537ee6ed1c4e1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9584
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There is no need to map the PRP/SGL list RW since this memory is never written
to. In fact, SeaBIOS might submit a request where the PRP list resides on
read-only memory, so attempting to map it RW can break things.
Change-Id: I7e4e90b1fa7e33e81b8d5cd8dcb9568c038938ec
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7288
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot