virtio: add initial rev of virtio-scsi initiator bdev module

Supports both PCI mode (for use in guest VMs) and
vhost-user mode (for use in host processes).  The rte_virtio
subdirectory contains a lot of code lifted from the DPDK
virtio-net driver.  Most of the PCI and vhost-user code is
reused almost exactly as-is, but the virtio code has been heavily
rewritten, since the DPDK code was very network-specific.

Has been lightly tested with both the bdevio and bdevperf
applications in both PCI and vhost-user modes.

Quite a bit of work is still needed - a list of todo
items is included in a README in the module's directory.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I85989d3de9ea89a87b719ececdb6d2ac16b77f53
Reviewed-on: https://review.gerrithub.io/374519
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Author:    Jim Harris <james.r.harris@intel.com>, 2017-05-30 14:13:50 -07:00
Committed: Ben Walker
Parent:    00380d62d9
Commit:    c2175d2c51
21 changed files with 3994 additions and 5 deletions

lib/bdev/Makefile

@@ -47,7 +47,7 @@ LIBNAME = bdev
 DIRS-y += error gpt malloc null nvme rpc split
 ifeq ($(OS),Linux)
-DIRS-y += aio
+DIRS-y += aio virtio
 endif
 DIRS-$(CONFIG_RBD) += rbd

lib/bdev/virtio/Makefile (new file, 47 lines)

@@ -0,0 +1,47 @@
#
# BSD LICENSE
#
# Copyright (c) Intel Corporation.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in
# the documentation and/or other materials provided with the
# distribution.
# * Neither the name of Intel Corporation nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
CFLAGS += $(ENV_CFLAGS) -I$(SPDK_ROOT_DIR)/lib/bdev/ -Irte_virtio
CFLAGS += -I$(SPDK_ROOT_DIR)/lib/vhost/linux
C_SRCS = bdev_virtio.c
C_SRCS += rte_virtio/virtio_ethdev.c rte_virtio/virtio_pci.c rte_virtio/virtio_rxtx.c
C_SRCS += rte_virtio/virtio_user_ethdev.c
C_SRCS += rte_virtio/virtio_user/vhost_user.c rte_virtio/virtio_user/virtio_user_dev.c
LIBNAME = bdev_virtio
include $(SPDK_ROOT_DIR)/mk/spdk.lib.mk

lib/bdev/virtio/README.md (new file, 84 lines)

@@ -0,0 +1,84 @@
# SPDK virtio bdev module

This directory contains an experimental SPDK virtio bdev module.
It currently supports very basic enumeration capabilities for
virtio-scsi devices as well as read/write operations to any
SCSI LUNs discovered during enumeration.

It supports two different usage models:

* PCI - This is the standard mode of operation when used in a guest virtual
machine, where QEMU has presented the virtio-scsi controller as a virtual
PCI device. The virtio-scsi controller might be implemented in the host OS
by SPDK vhost-scsi, kernel vhost-scsi, or a QEMU virtio-scsi backend.
* User vhost - Can be used to connect to an SPDK vhost-scsi target running on
the same host.

Note that 1GB hugepages are effectively required to use this driver in
user-vhost mode. The vhost protocol requires passing a file descriptor for
each region of memory being shared with the vhost target. Since DPDK opens
every hugepage explicitly, the number of file descriptors it can pass is
fairly limited by the VHOST_MEMORY_MAX_NREGIONS limit of 8.
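
To illustrate that limit, below is a sketch of the vhost-user memory table
message (a simplified rendering; field names follow the vhost-user protocol
as used by rte_virtio/virtio_user/vhost_user.c). Each region carries exactly
one file descriptor, so at most 8 regions of memory can be shared with the
target - a few 1GB hugepages fit, while hundreds of 2MB hugepages would not:

~~~{.c}
#include <stdint.h>

#define VHOST_MEMORY_MAX_NREGIONS 8

/* One shared-memory region; its backing file descriptor is passed as
 * SCM_RIGHTS ancillary data alongside this message. */
struct vhost_memory_region {
	uint64_t guest_phys_addr; /* start of the region in guest physical memory */
	uint64_t memory_size;     /* length of the region in bytes */
	uint64_t userspace_addr;  /* where the driver process mapped the region */
	uint64_t mmap_offset;     /* offset into the fd where the region begins */
};

/* Payload of the VHOST_USER_SET_MEM_TABLE message: at most 8 regions. */
struct vhost_memory {
	uint32_t nregions;
	uint32_t padding;
	struct vhost_memory_region regions[VHOST_MEMORY_MAX_NREGIONS];
};
~~~
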
Use the following configuration file snippet to enumerate a virtio-scsi PCI
device and present its LUNs as bdevs. Currently it will only work with
a single PCI device.

~~~{.sh}
[Virtio]
  Dev Pci
~~~

Use the following configuration file snippet to enumerate an SPDK vhost-scsi
controller and present its LUNs as bdevs. In this case, the SPDK vhost-scsi
target has created an SPDK vhost-scsi controller which is accessible through
the /tmp/vhost.0 domain socket.

~~~{.sh}
[Virtio]
  Dev User /tmp/vhost.0
~~~
Todo:

* Support multiple PCI devices, including specifying the PCI device by PCI
bus/domain/function.
* Define an SPDK virtio bdev request structure and report it as the context
size during module initialization. This will allow the module to build
its request and response in per-bdev_io memory (a hypothetical sketch of
such a context follows this list).
* Asynchronous I/O - currently the driver polls inline for all completions.
Asynchronous I/O should be used for both enumeration (INQUIRY, READ CAPACITY,
etc.) and read/write I/O.
* Add unmap support.
* Add I/O channel support. This includes requesting the correct number of
queues (based on core count) and failing device initialization if not enough
queues can be allocated.
* Add RPCs.
* Break out the "rte_virtio" code into a separate library that is not
linked directly to the bdev module. This would allow that part of the
code to potentially be used and tested outside of the SPDK bdev framework.
* Check for allocation failures in the bdev_virtio.c code.
* Change printfs to SPDK_TRACELOGs (or just remove them altogether).
* Add virtio-blk support. This will require some rework in the core
virtio code (in the rte_virtio subdirectory) to allow for multiple
device types.
* Decide whether we should have one virtio driver to cover both scsi and
blk. If these should be separate, then this driver should be renamed to
something SCSI-specific.
* Add reset support.
* Finish cleaning up "eth" references. This includes filenames like
virtio_ethdev.c and "eth" in various API calls.
* Improve the virtio_xmit_pkts and virtio_recv_pkts interfaces. They should
not reference the virtio_hw tx_queues directly and should present a more
opaque API.
* Understand and handle queue full conditions.
* Clear the interrupt flag for completions - since we are polling, we do not
need the virtio-scsi backend to signal completion.
* Check the interrupt flag for submission. If the backend requires an
interrupt, we need to signal it.
* Change read/write to use READ_16/WRITE_16 to handle LBAs above 4G. We can
add a basic check and bail during enumeration if INQUIRY indicates the LUN
does not support >= SBC-3.
* Add automated test scripts for both PCI and vhost-user scenarios.
* Document the Virtio config file section in examples. This should wait until
enough of the above items are implemented to consider this module ready for
more general use.
* Specify the name of the bdev in the config file (and RPC) - currently we
just hardcode a single bdev name "Virtio0".
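
Below is a hypothetical sketch of the per-bdev_io context mentioned in the
second todo item. The names are illustrative only and not part of this
commit; struct virtio_req comes from rte_virtio/virtio_ethdev.h and the
virtio_scsi types from <linux/virtio_scsi.h>:

~~~{.c}
/* Hypothetical per-I/O context; not part of this commit. */
struct bdev_virtio_io_ctx {
	struct virtio_req		vreq;		/* descriptor list handed to virtio_xmit_pkts() */
	struct virtio_scsi_cmd_req	req;		/* virtio-scsi command header */
	struct virtio_scsi_cmd_resp	resp;		/* virtio-scsi response, filled in by the target */
	struct iovec			iov[130];	/* req + resp + up to 128 data segments */
};

/* Reporting this size would let the bdev layer allocate one context
 * alongside every spdk_bdev_io, replacing the per-I/O spdk_dma_malloc()
 * calls in bdev_virtio_rw(). */
static int
bdev_virtio_get_ctx_size(void)
{
	return sizeof(struct bdev_virtio_io_ctx);
}
~~~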

lib/bdev/virtio/bdev_virtio.c (new file, 351 lines)

@@ -0,0 +1,351 @@
/*-
* BSD LICENSE
*
* Copyright (c) Intel Corporation.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include "spdk/stdinc.h"
#include "spdk/bdev.h"
#include "spdk/conf.h"
#include "spdk/env.h"
#include "spdk/io_channel.h"
#include "spdk/string.h"
#include "spdk/endian.h"
#include "spdk/stdinc.h"
#include "spdk_internal/bdev.h"
#include "spdk_internal/log.h"
#include <getopt.h>
#include <sys/param.h>
#include <linux/virtio_scsi.h>
#include <virtio_ethdev.h>
#include <virtio_user/virtio_user_dev.h>
#include "spdk/scsi_spec.h"
static int bdev_virtio_initialize(void);
static void bdev_virtio_finish(void);
struct virtio_scsi_disk {
struct spdk_bdev bdev;
struct virtio_hw *hw;
uint64_t num_blocks;
uint32_t block_size;
};
static int
bdev_virtio_get_ctx_size(void)
{
return 0;
}
SPDK_BDEV_MODULE_REGISTER(virtio_scsi, bdev_virtio_initialize, bdev_virtio_finish,
NULL, bdev_virtio_get_ctx_size, NULL)
static void
bdev_virtio_rw(struct spdk_io_channel *ch, struct spdk_bdev_io *bdev_io)
{
struct iovec iov[128];
struct virtio_req vreq;
struct virtio_scsi_cmd_req *req;
struct virtio_scsi_cmd_resp *resp;
uint16_t cnt;
struct virtio_req *complete;
struct virtio_scsi_disk *disk = (struct virtio_scsi_disk *)bdev_io->bdev;
bool is_read = (bdev_io->type == SPDK_BDEV_IO_TYPE_READ);
vreq.iov = iov;
req = spdk_dma_malloc(4096, 64, NULL);
resp = spdk_dma_malloc(4096, 64, NULL);
iov[0].iov_base = (void *)req;
iov[0].iov_len = sizeof(*req);
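/* vreq.start_write is the index of the first device-writable iovec: on a
 * read, the response header and the data buffers are device-writable; on a
 * write, only the trailing response header is. */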
if (is_read) {
iov[1].iov_base = (void *)resp;
iov[1].iov_len = sizeof(struct virtio_scsi_cmd_resp);
memcpy(&iov[2], bdev_io->u.read.iovs, sizeof(struct iovec) * bdev_io->u.read.iovcnt);
vreq.iovcnt = 2 + bdev_io->u.read.iovcnt;
vreq.start_write = 1;
} else {
memcpy(&iov[1], bdev_io->u.write.iovs, sizeof(struct iovec) * bdev_io->u.write.iovcnt);
iov[1 + bdev_io->u.write.iovcnt].iov_base = (void *)resp;
iov[1 + bdev_io->u.write.iovcnt].iov_len = sizeof(struct virtio_scsi_cmd_resp);
vreq.iovcnt = 2 + bdev_io->u.write.iovcnt;
vreq.start_write = vreq.iovcnt - 1;
}
memset(req, 0, sizeof(*req));
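/* virtio-scsi single-level LUN addressing: lun[0] must be 1 and lun[1] is
 * the target number. Target 0 is hardcoded here for now, matching the
 * single hardcoded bdev noted in the README. */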
req->lun[0] = 1;
req->lun[1] = 0;
if (is_read) {
req->cdb[0] = SPDK_SBC_READ_10;
to_be32(&req->cdb[2], bdev_io->u.read.offset / disk->block_size);
to_be16(&req->cdb[7], bdev_io->u.read.len / disk->block_size);
} else {
req->cdb[0] = SPDK_SBC_WRITE_10;
to_be32(&req->cdb[2], bdev_io->u.write.offset / disk->block_size);
to_be16(&req->cdb[7], bdev_io->u.write.len / disk->block_size);
}
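/* virtio-scsi queue 0 is the control queue and queue 1 is the event queue;
 * queue 2 is the first request queue. Completion is polled inline for now;
 * asynchronous I/O is a README todo item. */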
virtio_xmit_pkts(disk->hw->tx_queues[2], &vreq);
do {
cnt = virtio_recv_pkts(disk->hw->tx_queues[2], &complete, 32);
} while (cnt == 0);
spdk_bdev_io_complete(bdev_io, SPDK_BDEV_IO_STATUS_SUCCESS);
spdk_dma_free(req);
spdk_dma_free(resp);
}
static int _bdev_virtio_submit_request(struct spdk_io_channel *ch, struct spdk_bdev_io *bdev_io)
{
switch (bdev_io->type) {
case SPDK_BDEV_IO_TYPE_READ:
spdk_bdev_io_get_buf(bdev_io, bdev_virtio_rw);
return 0;
case SPDK_BDEV_IO_TYPE_WRITE:
bdev_virtio_rw(ch, bdev_io);
return 0;
case SPDK_BDEV_IO_TYPE_RESET:
spdk_bdev_io_complete(bdev_io, SPDK_BDEV_IO_STATUS_SUCCESS);
return 0;
case SPDK_BDEV_IO_TYPE_FLUSH:
case SPDK_BDEV_IO_TYPE_UNMAP:
default:
return -1;
}
return 0;
}
static void bdev_virtio_submit_request(struct spdk_io_channel *ch, struct spdk_bdev_io *bdev_io)
{
if (_bdev_virtio_submit_request(ch, bdev_io) < 0) {
spdk_bdev_io_complete(bdev_io, SPDK_BDEV_IO_STATUS_FAILED);
}
}
static bool
bdev_virtio_io_type_supported(void *ctx, enum spdk_bdev_io_type io_type)
{
switch (io_type) {
case SPDK_BDEV_IO_TYPE_READ:
case SPDK_BDEV_IO_TYPE_WRITE:
case SPDK_BDEV_IO_TYPE_FLUSH:
case SPDK_BDEV_IO_TYPE_RESET:
case SPDK_BDEV_IO_TYPE_UNMAP:
return true;
default:
return false;
}
}
static struct spdk_io_channel *
bdev_virtio_get_io_channel(void *ctx)
{
struct virtio_scsi_disk *disk = ctx;
return spdk_get_io_channel(&disk->hw);
}
static int
bdev_virtio_destruct(void *ctx)
{
return 0;
}
static const struct spdk_bdev_fn_table virtio_fn_table = {
.destruct = bdev_virtio_destruct,
.submit_request = bdev_virtio_submit_request,
.io_type_supported = bdev_virtio_io_type_supported,
.get_io_channel = bdev_virtio_get_io_channel,
};
static int
bdev_virtio_create_cb(void *io_device, void *ctx_buf)
{
return 0;
}
static void
bdev_virtio_destroy_cb(void *io_device, void *ctx_buf)
{
}
static void
scan_target(struct virtio_hw *hw, uint8_t target)
{
struct iovec iov[3];
struct virtio_req vreq;
struct virtio_scsi_cmd_req *req;
struct virtio_scsi_cmd_resp *resp;
struct spdk_scsi_cdb_inquiry *cdb;
uint16_t cnt;
struct virtio_req *complete;
struct virtio_scsi_disk *disk;
struct spdk_bdev *bdev;
vreq.iov = iov;
vreq.iovcnt = 3;
vreq.start_write = 1;
iov[0].iov_base = spdk_dma_malloc(4096, 64, NULL);
iov[1].iov_base = spdk_dma_malloc(4096, 64, NULL);
iov[2].iov_base = spdk_dma_malloc(4096, 64, NULL);
req = iov[0].iov_base;
resp = iov[1].iov_base;
memset(req, 0, sizeof(*req));
req->lun[0] = 1;
req->lun[1] = target;
iov[0].iov_len = sizeof(*req);
cdb = (struct spdk_scsi_cdb_inquiry *)req->cdb;
cdb->opcode = SPDK_SPC_INQUIRY;
cdb->alloc_len[1] = 255;
iov[1].iov_len = sizeof(struct virtio_scsi_cmd_resp);
iov[2].iov_len = 255;
virtio_xmit_pkts(hw->tx_queues[2], &vreq);
do {
cnt = virtio_recv_pkts(hw->tx_queues[2], &complete, 32);
} while (cnt == 0);
if (resp->response != VIRTIO_SCSI_S_OK || resp->status != SPDK_SCSI_STATUS_GOOD) {
return;
}
memset(req, 0, sizeof(*req));
req->lun[0] = 1;
req->lun[1] = target;
iov[0].iov_len = sizeof(*req);
req->cdb[0] = SPDK_SPC_SERVICE_ACTION_IN_16;
req->cdb[1] = SPDK_SBC_SAI_READ_CAPACITY_16;
iov[1].iov_len = sizeof(struct virtio_scsi_cmd_resp);
iov[2].iov_len = 32;
to_be32(&req->cdb[10], iov[2].iov_len);
virtio_xmit_pkts(hw->tx_queues[2], &vreq);
do {
cnt = virtio_recv_pkts(hw->tx_queues[2], &complete, 32);
} while (cnt == 0);
disk = calloc(1, sizeof(*disk));
if (disk == NULL) {
SPDK_ERRLOG("could not allocate disk\n");
return;
}
disk->num_blocks = from_be64((uint64_t *)(iov[2].iov_base)) + 1;
disk->block_size = from_be32((uint32_t *)(iov[2].iov_base + 8));
disk->hw = hw;
bdev = &disk->bdev;
bdev->name = spdk_sprintf_alloc("Virtio0");
bdev->product_name = "Virtio SCSI Disk";
bdev->write_cache = 0;
bdev->blocklen = disk->block_size;
bdev->blockcnt = disk->num_blocks;
bdev->ctxt = disk;
bdev->fn_table = &virtio_fn_table;
bdev->module = SPDK_GET_BDEV_MODULE(virtio_scsi);
spdk_io_device_register(&disk->hw, bdev_virtio_create_cb, bdev_virtio_destroy_cb, 0);
spdk_bdev_register(bdev);
}
static int
bdev_virtio_initialize(void)
{
struct spdk_conf_section *sp = spdk_conf_find_section(NULL, "Virtio");
struct virtio_hw *hw = NULL;
char *type, *path;
uint32_t i;
if (sp == NULL) {
return 0;
}
for (i = 0; spdk_conf_section_get_nval(sp, "Dev", i) != NULL; i++) {
type = spdk_conf_section_get_nmval(sp, "Dev", i, 0);
if (type == NULL) {
SPDK_ERRLOG("No type specified for index %d\n", i);
continue;
}
if (!strcmp("User", type)) {
path = spdk_conf_section_get_nmval(sp, "Dev", i, 1);
if (path == NULL) {
SPDK_ERRLOG("No path specified for index %d\n", i);
continue;
}
hw = virtio_user_dev_init(path, 1, 512);
} else if (!strcmp("Pci", type)) {
hw = get_pci_virtio_hw();
} else {
SPDK_ERRLOG("Invalid type %s specified for index %d\n", type, i);
continue;
}
}
if (hw == NULL) {
return 0;
}
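/* Initialize three virtqueues: the virtio-scsi control queue (0), the
 * event queue (1), and a single request queue (2). */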
eth_virtio_dev_init(hw, 3);
virtio_dev_start(hw);
for (i = 0; i < 64; i++) {
scan_target(hw, i);
}
return 0;
}
static void bdev_virtio_finish(void)
{
}
SPDK_LOG_REGISTER_TRACE_FLAG("virtio", SPDK_TRACE_VIRTIO)

lib/bdev/virtio/rte_virtio/virtio_ethdev.c (new file, 462 lines)

@@ -0,0 +1,462 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <linux/virtio_scsi.h>
#include <rte_memcpy.h>
#include <rte_string_fns.h>
#include <rte_memzone.h>
#include <rte_malloc.h>
#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_pci.h>
#include <rte_common.h>
#include <rte_errno.h>
#include <rte_memory.h>
#include <rte_eal.h>
#include <rte_dev.h>
#include "virtio_ethdev.h"
#include "virtio_pci.h"
#include "virtio_logs.h"
#include "virtqueue.h"
#include "virtio_rxtx.h"
/*
* The set of PCI devices this driver supports
*/
static const struct rte_pci_id pci_id_virtio_map[] = {
{ RTE_PCI_DEVICE(VIRTIO_PCI_VENDORID, VIRTIO_PCI_DEVICEID_SCSI_MODERN) },
{ .vendor_id = 0, /* sentinel */ },
};
static uint16_t
virtio_get_nr_vq(struct virtio_hw *hw)
{
return hw->max_queues;
}
static void
virtio_init_vring(struct virtqueue *vq)
{
int size = vq->vq_nentries;
struct vring *vr = &vq->vq_ring;
uint8_t *ring_mem = vq->vq_ring_virt_mem;
PMD_INIT_FUNC_TRACE();
/*
* Reinitialise since virtio port might have been stopped and restarted
*/
memset(ring_mem, 0, vq->vq_ring_size);
vring_init(vr, size, ring_mem, VIRTIO_PCI_VRING_ALIGN);
vq->vq_used_cons_idx = 0;
vq->vq_desc_head_idx = 0;
vq->vq_avail_idx = 0;
vq->vq_desc_tail_idx = (uint16_t)(vq->vq_nentries - 1);
vq->vq_free_cnt = vq->vq_nentries;
memset(vq->vq_descx, 0, sizeof(struct vq_desc_extra) * vq->vq_nentries);
vring_desc_init(vr->desc, size);
/*
* Disable the device (host) from interrupting the guest
*/
virtqueue_disable_intr(vq);
}
static int
virtio_init_queue(struct virtio_hw *hw, uint16_t vtpci_queue_idx)
{
char vq_name[VIRTQUEUE_MAX_NAME_SZ];
const struct rte_memzone *mz = NULL;
unsigned int vq_size, size;
struct virtnet_tx *txvq = NULL;
struct virtqueue *vq;
int ret;
PMD_INIT_LOG(DEBUG, "setting up queue: %u", vtpci_queue_idx);
/*
* Read the virtqueue size from the Queue Size field.
* It is always a power of 2, and if it is 0 the virtqueue does not exist.
*/
vq_size = VTPCI_OPS(hw)->get_queue_num(hw, vtpci_queue_idx);
PMD_INIT_LOG(DEBUG, "vq_size: %u", vq_size);
if (vq_size == 0) {
PMD_INIT_LOG(ERR, "virtqueue does not exist");
return -EINVAL;
}
if (!rte_is_power_of_2(vq_size)) {
PMD_INIT_LOG(ERR, "virtqueue size is not powerof 2");
return -EINVAL;
}
snprintf(vq_name, sizeof(vq_name), "port%d_vq%d",
hw->port_id, vtpci_queue_idx);
size = RTE_ALIGN_CEIL(sizeof(*vq) +
vq_size * sizeof(struct vq_desc_extra),
RTE_CACHE_LINE_SIZE);
vq = rte_zmalloc_socket(vq_name, size, RTE_CACHE_LINE_SIZE,
SOCKET_ID_ANY);
if (vq == NULL) {
PMD_INIT_LOG(ERR, "can not allocate vq");
return -ENOMEM;
}
hw->vqs[vtpci_queue_idx] = vq;
vq->hw = hw;
vq->vq_queue_index = vtpci_queue_idx;
vq->vq_nentries = vq_size;
/*
* Reserve a memzone for vring elements
*/
size = vring_size(vq_size, VIRTIO_PCI_VRING_ALIGN);
vq->vq_ring_size = RTE_ALIGN_CEIL(size, VIRTIO_PCI_VRING_ALIGN);
PMD_INIT_LOG(DEBUG, "vring_size: %d, rounded_vring_size: %d",
size, vq->vq_ring_size);
mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size,
SOCKET_ID_ANY,
0, VIRTIO_PCI_VRING_ALIGN);
if (mz == NULL) {
if (rte_errno == EEXIST)
mz = rte_memzone_lookup(vq_name);
if (mz == NULL) {
ret = -ENOMEM;
goto fail_q_alloc;
}
}
memset(mz->addr, 0, mz->len);
vq->vq_ring_mem = mz->phys_addr;
vq->vq_ring_virt_mem = mz->addr;
PMD_INIT_LOG(DEBUG, "vq->vq_ring_mem: 0x%" PRIx64,
(uint64_t)mz->phys_addr);
PMD_INIT_LOG(DEBUG, "vq->vq_ring_virt_mem: 0x%" PRIx64,
(uint64_t)(uintptr_t)mz->addr);
virtio_init_vring(vq);
txvq = &vq->txq;
txvq->vq = vq;
txvq->port_id = hw->port_id;
txvq->mz = mz;
/* For the virtio_user case (i.e. when hw->virtio_user_dev is set), we use
 * the virtual address. We also need to properly set _offset_; see
 * VIRTIO_MBUF_DATA_DMA_ADDR in virtqueue.h for more information.
 */
if (hw->virtio_user_dev) {
vq->vq_ring_mem = (uintptr_t)mz->addr;
}
if (VTPCI_OPS(hw)->setup_queue(hw, vq) < 0) {
PMD_INIT_LOG(ERR, "setup_queue failed");
return -EINVAL;
}
return 0;
fail_q_alloc:
rte_memzone_free(mz);
rte_free(vq);
return ret;
}
static void
virtio_free_queues(struct virtio_hw *hw)
{
uint16_t nr_vq = virtio_get_nr_vq(hw);
struct virtqueue *vq;
uint16_t i;
if (hw->vqs == NULL)
return;
for (i = 0; i < nr_vq; i++) {
vq = hw->vqs[i];
if (!vq)
continue;
rte_memzone_free(vq->txq.mz);
rte_free(vq);
hw->vqs[i] = NULL;
}
rte_free(hw->vqs);
hw->vqs = NULL;
}
static int
virtio_alloc_queues(struct virtio_hw *hw)
{
uint16_t nr_vq = virtio_get_nr_vq(hw);
uint16_t i;
int ret;
hw->vqs = rte_zmalloc(NULL, sizeof(struct virtqueue *) * nr_vq, 0);
if (!hw->vqs) {
PMD_INIT_LOG(ERR, "failed to allocate vqs");
return -ENOMEM;
}
for (i = 0; i < nr_vq; i++) {
ret = virtio_init_queue(hw, i);
if (ret < 0) {
virtio_free_queues(hw);
return ret;
}
}
return 0;
}
static int
virtio_negotiate_features(struct virtio_hw *hw, uint64_t req_features)
{
uint64_t host_features;
/* Prepare guest_features: the features that the driver wants to support */
PMD_INIT_LOG(DEBUG, "guest_features before negotiate = %" PRIx64,
req_features);
/* Read device(host) feature bits */
host_features = VTPCI_OPS(hw)->get_features(hw);
PMD_INIT_LOG(DEBUG, "host_features before negotiate = %" PRIx64,
host_features);
/*
* Negotiate features: a subset of the device's feature bits is written
* back as the guest feature bits.
*/
hw->guest_features = req_features;
hw->guest_features = vtpci_negotiate_features(hw, host_features);
PMD_INIT_LOG(DEBUG, "features after negotiate = %" PRIx64,
hw->guest_features);
if (hw->modern) {
if (!vtpci_with_feature(hw, VIRTIO_F_VERSION_1)) {
PMD_INIT_LOG(ERR,
"VIRTIO_F_VERSION_1 features is not enabled.");
return -1;
}
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_FEATURES_OK);
if (!(vtpci_get_status(hw) & VIRTIO_CONFIG_STATUS_FEATURES_OK)) {
PMD_INIT_LOG(ERR,
"failed to set FEATURES_OK status!");
return -1;
}
}
hw->req_guest_features = req_features;
return 0;
}
/* reset device and renegotiate features if needed */
static int
virtio_init_device(struct virtio_hw *hw, uint64_t req_features)
{
struct rte_pci_device *pci_dev = NULL;
int ret;
/* Reset the device although not necessary at startup */
vtpci_reset(hw);
/* Tell the host we've noticed this device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
/* Tell the host we know how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
if (virtio_negotiate_features(hw, req_features) < 0)
return -1;
vtpci_read_dev_config(hw, offsetof(struct virtio_scsi_config, num_queues),
&hw->max_queues, sizeof(hw->max_queues));
if (!hw->virtio_user_dev) {
hw->max_queues = 3;
}
ret = virtio_alloc_queues(hw);
if (ret < 0)
return ret;
vtpci_reinit_complete(hw);
if (pci_dev)
PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
hw->port_id, pci_dev->id.vendor_id,
pci_dev->id.device_id);
return 0;
}
static void
virtio_set_vtpci_ops(struct virtio_hw *hw)
{
VTPCI_OPS(hw) = &virtio_user_ops;
}
/*
* This function is based on the probe() function in virtio_pci.c.
* It returns 0 on success.
*/
int
eth_virtio_dev_init(struct virtio_hw *hw, int num_queues)
{
int ret, i;
if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
virtio_set_vtpci_ops(hw);
return 0;
}
if (!hw->virtio_user_dev) {
ret = vtpci_init(hw->pci_dev, hw);
if (ret)
return ret;
}
/* reset device and negotiate default features */
ret = virtio_init_device(hw, VIRTIO_PMD_DEFAULT_GUEST_FEATURES);
if (ret < 0)
return ret;
hw->tx_queues = rte_zmalloc("tx_queues", sizeof(hw->tx_queues[0]) * num_queues, RTE_CACHE_LINE_SIZE);
hw->nb_tx_queues = num_queues;
for (i = 0; i < num_queues; i++) {
virtio_dev_tx_queue_setup(hw, i, 512, -1);
}
return 0;
}
int
virtio_dev_start(struct virtio_hw *hw)
{
struct virtnet_tx *txvq __rte_unused;
/* Enable uio/vfio intr/eventfd mapping: although we already did that
 * during device configuration, it could have been unmapped when the
 * device was stopped.
 */
/** TODO: interrupt handling for virtio_scsi */
#if 0
if (dev->data->dev_conf.intr_conf.lsc ||
dev->data->dev_conf.intr_conf.rxq) {
rte_intr_disable(dev->intr_handle);
if (rte_intr_enable(dev->intr_handle) < 0) {
PMD_DRV_LOG(ERR, "interrupt enable failed");
return -EIO;
}
}
#endif
PMD_INIT_LOG(DEBUG, "Notified backend at initialization");
hw->started = 1;
return 0;
}
static struct virtio_hw *g_pci_hw = NULL;
struct virtio_hw *
get_pci_virtio_hw(void)
{
printf("%s[%d] %p\n", __func__, __LINE__, g_pci_hw);
return g_pci_hw;
}
static int virtio_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
struct rte_pci_device *pci_dev)
{
struct virtio_hw *hw;
hw = calloc(1, sizeof(*hw));
hw->pci_dev = pci_dev;
g_pci_hw = hw;
printf("%s[%d]\n", __func__, __LINE__);
return 0;
}
static int virtio_pci_remove(struct rte_pci_device *pci_dev)
{
printf("%s[%d]\n", __func__, __LINE__);
return 0;
}
static struct rte_pci_driver rte_virtio_pmd = {
.driver = {
.name = "net_virtio",
},
.id_table = pci_id_virtio_map,
.drv_flags = 0,
.probe = virtio_pci_probe,
.remove = virtio_pci_remove,
};
RTE_INIT(rte_virtio_pmd_init);
static void
rte_virtio_pmd_init(void)
{
if (rte_eal_iopl_init() != 0) {
PMD_INIT_LOG(ERR, "IOPL call failed - cannot use virtio PMD");
return;
}
rte_pci_register(&rte_virtio_pmd);
}
RTE_PMD_EXPORT_NAME(net_virtio, __COUNTER__);
RTE_PMD_REGISTER_PCI_TABLE(net_virtio, pci_id_virtio_map);
RTE_PMD_REGISTER_KMOD_DEP(net_virtio, "* igb_uio | uio_pci_generic | vfio");

lib/bdev/virtio/rte_virtio/virtio_ethdev.h (new file, 80 lines)

@@ -0,0 +1,80 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _VIRTIO_ETHDEV_H_
#define _VIRTIO_ETHDEV_H_
#include <stdint.h>
#include <sys/uio.h>
#include "virtio_pci.h"
#define VIRTIO_MAX_RX_QUEUES 128U
#define VIRTIO_MAX_TX_QUEUES 128U
#define VIRTIO_MIN_RX_BUFSIZE 64
struct virtio_req {
struct iovec *iov;
uint32_t iovcnt;
uint32_t start_write;
uint32_t data_transferred;
};
/* Features desired/implemented by this driver. */
#define VIRTIO_PMD_DEFAULT_GUEST_FEATURES \
(1ULL << VIRTIO_SCSI_F_INOUT | \
1ULL << VIRTIO_F_VERSION_1 | \
1ULL << VIRTIO_F_IOMMU_PLATFORM)
#define VIRTIO_PMD_SUPPORTED_GUEST_FEATURES \
(VIRTIO_PMD_DEFAULT_GUEST_FEATURES)
/*
* RX/TX function prototypes
*/
int virtio_dev_tx_queue_setup(struct virtio_hw *hw, uint16_t tx_queue_id,
uint16_t nb_tx_desc, unsigned int socket_id);
uint16_t virtio_recv_pkts(void *rx_queue, struct virtio_req **reqs,
uint16_t nb_pkts);
uint16_t virtio_xmit_pkts(void *tx_queue, struct virtio_req *req);
int eth_virtio_dev_init(struct virtio_hw *hw, int num_queues);
int virtio_dev_start(struct virtio_hw *hw);
struct virtio_hw *get_pci_virtio_hw(void);
void virtio_interrupt_handler(void *param);
#endif /* _VIRTIO_ETHDEV_H_ */

lib/bdev/virtio/rte_virtio/virtio_logs.h (new file, 70 lines)

@@ -0,0 +1,70 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _VIRTIO_LOGS_H_
#define _VIRTIO_LOGS_H_
#include <rte_log.h>
#ifdef RTE_LIBRTE_VIRTIO_DEBUG_INIT
#define PMD_INIT_LOG(level, fmt, args...) \
RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
#else
#define PMD_INIT_LOG(level, fmt, args...) do { } while(0)
#define PMD_INIT_FUNC_TRACE() do { } while(0)
#endif
#ifdef RTE_LIBRTE_VIRTIO_DEBUG_RX
#define PMD_RX_LOG(level, fmt, args...) \
RTE_LOG(level, PMD, "%s() rx: " fmt "\n", __func__, ## args)
#else
#define PMD_RX_LOG(level, fmt, args...) do { } while(0)
#endif
#ifdef RTE_LIBRTE_VIRTIO_DEBUG_TX
#define PMD_TX_LOG(level, fmt, args...) \
RTE_LOG(level, PMD, "%s() tx: " fmt "\n", __func__, ## args)
#else
#define PMD_TX_LOG(level, fmt, args...) do { } while(0)
#endif
#ifdef RTE_LIBRTE_VIRTIO_DEBUG_DRIVER
#define PMD_DRV_LOG(level, fmt, args...) \
RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
#else
#define PMD_DRV_LOG(level, fmt, args...) do { } while(0)
#endif
#endif /* _VIRTIO_LOGS_H_ */

lib/bdev/virtio/rte_virtio/virtio_pci.c (new file, 704 lines)

@@ -0,0 +1,704 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
#ifdef RTE_EXEC_ENV_LINUXAPP
#include <dirent.h>
#include <fcntl.h>
#endif
#include <rte_io.h>
#include "virtio_pci.h"
#include "virtio_logs.h"
#include "virtqueue.h"
struct virtio_hw_internal virtio_hw_internal[128];
/*
* The following macros are derived from linux/pci_regs.h; however,
* we can't simply include that header here, as there is no such
* file on non-Linux platforms.
*/
#define PCI_CAPABILITY_LIST 0x34
#define PCI_CAP_ID_VNDR 0x09
#define PCI_CAP_ID_MSIX 0x11
/*
* The remaining space is defined by each driver as the per-driver
* configuration space. The legacy virtio header occupies bytes 0-19
* (see the register offsets above), plus 4 more bytes for the two
* MSI-X vector registers when MSI-X is enabled, hence the 24 : 20 below.
*/
#define VIRTIO_PCI_CONFIG(hw) (((hw)->use_msix) ? 24 : 20)
static inline int
check_vq_phys_addr_ok(struct virtqueue *vq)
{
/* The virtio PCI device VIRTIO_PCI_QUEUE_PFN register is 32 bit and
 * only accepts a 32-bit page frame number. Check whether the allocated
 * physical memory exceeds 16TB (2^32 page frames << 12-bit page shift).
 */
if ((vq->vq_ring_mem + vq->vq_ring_size - 1) >>
(VIRTIO_PCI_QUEUE_ADDR_SHIFT + 32)) {
PMD_INIT_LOG(ERR, "vring address shouldn't be above 16TB!");
return 0;
}
return 1;
}
/*
* Since we are in legacy mode:
* http://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf
*
* "Note that this is possible because while the virtio header is PCI (i.e.
* little) endian, the device-specific region is encoded in the native endian of
* the guest (where such distinction is applicable)."
*
* For powerpc which supports both, qemu supposes that cpu is big endian and
* enforces this for the virtio-net stuff.
*/
static void
legacy_read_dev_config(struct virtio_hw *hw, size_t offset,
void *dst, int length)
{
#ifdef RTE_ARCH_PPC_64
int size;
while (length > 0) {
if (length >= 4) {
size = 4;
rte_pci_ioport_read(VTPCI_IO(hw), dst, size,
VIRTIO_PCI_CONFIG(hw) + offset);
*(uint32_t *)dst = rte_be_to_cpu_32(*(uint32_t *)dst);
} else if (length >= 2) {
size = 2;
rte_pci_ioport_read(VTPCI_IO(hw), dst, size,
VIRTIO_PCI_CONFIG(hw) + offset);
*(uint16_t *)dst = rte_be_to_cpu_16(*(uint16_t *)dst);
} else {
size = 1;
rte_pci_ioport_read(VTPCI_IO(hw), dst, size,
VIRTIO_PCI_CONFIG(hw) + offset);
}
dst = (char *)dst + size;
offset += size;
length -= size;
}
#else
rte_pci_ioport_read(VTPCI_IO(hw), dst, length,
VIRTIO_PCI_CONFIG(hw) + offset);
#endif
}
static void
legacy_write_dev_config(struct virtio_hw *hw, size_t offset,
const void *src, int length)
{
#ifdef RTE_ARCH_PPC_64
union {
uint32_t u32;
uint16_t u16;
} tmp;
int size;
while (length > 0) {
if (length >= 4) {
size = 4;
tmp.u32 = rte_cpu_to_be_32(*(const uint32_t *)src);
rte_pci_ioport_write(VTPCI_IO(hw), &tmp.u32, size,
VIRTIO_PCI_CONFIG(hw) + offset);
} else if (length >= 2) {
size = 2;
tmp.u16 = rte_cpu_to_be_16(*(const uint16_t *)src);
rte_pci_ioport_write(VTPCI_IO(hw), &tmp.u16, size,
VIRTIO_PCI_CONFIG(hw) + offset);
} else {
size = 1;
rte_pci_ioport_write(VTPCI_IO(hw), src, size,
VIRTIO_PCI_CONFIG(hw) + offset);
}
src = (const char *)src + size;
offset += size;
length -= size;
}
#else
rte_pci_ioport_write(VTPCI_IO(hw), src, length,
VIRTIO_PCI_CONFIG(hw) + offset);
#endif
}
static uint64_t
legacy_get_features(struct virtio_hw *hw)
{
uint32_t dst;
rte_pci_ioport_read(VTPCI_IO(hw), &dst, 4, VIRTIO_PCI_HOST_FEATURES);
return dst;
}
static void
legacy_set_features(struct virtio_hw *hw, uint64_t features)
{
if ((features >> 32) != 0) {
PMD_DRV_LOG(ERR,
"only 32 bit features are allowed for legacy virtio!");
return;
}
rte_pci_ioport_write(VTPCI_IO(hw), &features, 4,
VIRTIO_PCI_GUEST_FEATURES);
}
static uint8_t
legacy_get_status(struct virtio_hw *hw)
{
uint8_t dst;
rte_pci_ioport_read(VTPCI_IO(hw), &dst, 1, VIRTIO_PCI_STATUS);
return dst;
}
static void
legacy_set_status(struct virtio_hw *hw, uint8_t status)
{
rte_pci_ioport_write(VTPCI_IO(hw), &status, 1, VIRTIO_PCI_STATUS);
}
static void
legacy_reset(struct virtio_hw *hw)
{
legacy_set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
}
static uint8_t
legacy_get_isr(struct virtio_hw *hw)
{
uint8_t dst;
rte_pci_ioport_read(VTPCI_IO(hw), &dst, 1, VIRTIO_PCI_ISR);
return dst;
}
/* Enable one vector (0) for Link State Interrupt */
static uint16_t
legacy_set_config_irq(struct virtio_hw *hw, uint16_t vec)
{
uint16_t dst;
rte_pci_ioport_write(VTPCI_IO(hw), &vec, 2, VIRTIO_MSI_CONFIG_VECTOR);
rte_pci_ioport_read(VTPCI_IO(hw), &dst, 2, VIRTIO_MSI_CONFIG_VECTOR);
return dst;
}
static uint16_t
legacy_set_queue_irq(struct virtio_hw *hw, struct virtqueue *vq, uint16_t vec)
{
uint16_t dst;
rte_pci_ioport_write(VTPCI_IO(hw), &vq->vq_queue_index, 2,
VIRTIO_PCI_QUEUE_SEL);
rte_pci_ioport_write(VTPCI_IO(hw), &vec, 2, VIRTIO_MSI_QUEUE_VECTOR);
rte_pci_ioport_read(VTPCI_IO(hw), &dst, 2, VIRTIO_MSI_QUEUE_VECTOR);
return dst;
}
static uint16_t
legacy_get_queue_num(struct virtio_hw *hw, uint16_t queue_id)
{
uint16_t dst;
rte_pci_ioport_write(VTPCI_IO(hw), &queue_id, 2, VIRTIO_PCI_QUEUE_SEL);
rte_pci_ioport_read(VTPCI_IO(hw), &dst, 2, VIRTIO_PCI_QUEUE_NUM);
return dst;
}
static int
legacy_setup_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
uint32_t src;
if (!check_vq_phys_addr_ok(vq))
return -1;
rte_pci_ioport_write(VTPCI_IO(hw), &vq->vq_queue_index, 2,
VIRTIO_PCI_QUEUE_SEL);
src = vq->vq_ring_mem >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
rte_pci_ioport_write(VTPCI_IO(hw), &src, 4, VIRTIO_PCI_QUEUE_PFN);
return 0;
}
static void
legacy_del_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
uint32_t src = 0;
rte_pci_ioport_write(VTPCI_IO(hw), &vq->vq_queue_index, 2,
VIRTIO_PCI_QUEUE_SEL);
rte_pci_ioport_write(VTPCI_IO(hw), &src, 4, VIRTIO_PCI_QUEUE_PFN);
}
static void
legacy_notify_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
rte_pci_ioport_write(VTPCI_IO(hw), &vq->vq_queue_index, 2,
VIRTIO_PCI_QUEUE_NOTIFY);
}
const struct virtio_pci_ops legacy_ops = {
.read_dev_cfg = legacy_read_dev_config,
.write_dev_cfg = legacy_write_dev_config,
.reset = legacy_reset,
.get_status = legacy_get_status,
.set_status = legacy_set_status,
.get_features = legacy_get_features,
.set_features = legacy_set_features,
.get_isr = legacy_get_isr,
.set_config_irq = legacy_set_config_irq,
.set_queue_irq = legacy_set_queue_irq,
.get_queue_num = legacy_get_queue_num,
.setup_queue = legacy_setup_queue,
.del_queue = legacy_del_queue,
.notify_queue = legacy_notify_queue,
};
static inline void
io_write64_twopart(uint64_t val, uint32_t *lo, uint32_t *hi)
{
rte_write32(val & ((1ULL << 32) - 1), lo);
rte_write32(val >> 32, hi);
}
static void
modern_read_dev_config(struct virtio_hw *hw, size_t offset,
void *dst, int length)
{
int i;
uint8_t *p;
uint8_t old_gen, new_gen;
do {
old_gen = rte_read8(&hw->common_cfg->config_generation);
p = dst;
for (i = 0; i < length; i++)
*p++ = rte_read8((uint8_t *)hw->dev_cfg + offset + i);
new_gen = rte_read8(&hw->common_cfg->config_generation);
} while (old_gen != new_gen);
}
static void
modern_write_dev_config(struct virtio_hw *hw, size_t offset,
const void *src, int length)
{
int i;
const uint8_t *p = src;
for (i = 0; i < length; i++)
rte_write8((*p++), (((uint8_t *)hw->dev_cfg) + offset + i));
}
static uint64_t
modern_get_features(struct virtio_hw *hw)
{
uint32_t features_lo, features_hi;
rte_write32(0, &hw->common_cfg->device_feature_select);
features_lo = rte_read32(&hw->common_cfg->device_feature);
rte_write32(1, &hw->common_cfg->device_feature_select);
features_hi = rte_read32(&hw->common_cfg->device_feature);
return ((uint64_t)features_hi << 32) | features_lo;
}
static void
modern_set_features(struct virtio_hw *hw, uint64_t features)
{
rte_write32(0, &hw->common_cfg->guest_feature_select);
rte_write32(features & ((1ULL << 32) - 1),
&hw->common_cfg->guest_feature);
rte_write32(1, &hw->common_cfg->guest_feature_select);
rte_write32(features >> 32,
&hw->common_cfg->guest_feature);
}
static uint8_t
modern_get_status(struct virtio_hw *hw)
{
return rte_read8(&hw->common_cfg->device_status);
}
static void
modern_set_status(struct virtio_hw *hw, uint8_t status)
{
rte_write8(status, &hw->common_cfg->device_status);
}
static void
modern_reset(struct virtio_hw *hw)
{
modern_set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
modern_get_status(hw);
}
static uint8_t
modern_get_isr(struct virtio_hw *hw)
{
return rte_read8(hw->isr);
}
static uint16_t
modern_set_config_irq(struct virtio_hw *hw, uint16_t vec)
{
rte_write16(vec, &hw->common_cfg->msix_config);
return rte_read16(&hw->common_cfg->msix_config);
}
static uint16_t
modern_set_queue_irq(struct virtio_hw *hw, struct virtqueue *vq, uint16_t vec)
{
rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
rte_write16(vec, &hw->common_cfg->queue_msix_vector);
return rte_read16(&hw->common_cfg->queue_msix_vector);
}
static uint16_t
modern_get_queue_num(struct virtio_hw *hw, uint16_t queue_id)
{
rte_write16(queue_id, &hw->common_cfg->queue_select);
return rte_read16(&hw->common_cfg->queue_size);
}
static int
modern_setup_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
uint64_t desc_addr, avail_addr, used_addr;
uint16_t notify_off;
if (!check_vq_phys_addr_ok(vq))
return -1;
desc_addr = vq->vq_ring_mem;
avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
ring[vq->vq_nentries]),
VIRTIO_PCI_VRING_ALIGN);
rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
io_write64_twopart(desc_addr, &hw->common_cfg->queue_desc_lo,
&hw->common_cfg->queue_desc_hi);
io_write64_twopart(avail_addr, &hw->common_cfg->queue_avail_lo,
&hw->common_cfg->queue_avail_hi);
io_write64_twopart(used_addr, &hw->common_cfg->queue_used_lo,
&hw->common_cfg->queue_used_hi);
notify_off = rte_read16(&hw->common_cfg->queue_notify_off);
vq->notify_addr = (void *)((uint8_t *)hw->notify_base +
notify_off * hw->notify_off_multiplier);
rte_write16(1, &hw->common_cfg->queue_enable);
PMD_INIT_LOG(DEBUG, "queue %u addresses:", vq->vq_queue_index);
PMD_INIT_LOG(DEBUG, "\t desc_addr: %" PRIx64, desc_addr);
PMD_INIT_LOG(DEBUG, "\t aval_addr: %" PRIx64, avail_addr);
PMD_INIT_LOG(DEBUG, "\t used_addr: %" PRIx64, used_addr);
PMD_INIT_LOG(DEBUG, "\t notify addr: %p (notify offset: %u)",
vq->notify_addr, notify_off);
return 0;
}
static void
modern_del_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
io_write64_twopart(0, &hw->common_cfg->queue_desc_lo,
&hw->common_cfg->queue_desc_hi);
io_write64_twopart(0, &hw->common_cfg->queue_avail_lo,
&hw->common_cfg->queue_avail_hi);
io_write64_twopart(0, &hw->common_cfg->queue_used_lo,
&hw->common_cfg->queue_used_hi);
rte_write16(0, &hw->common_cfg->queue_enable);
}
static void
modern_notify_queue(struct virtio_hw *hw __rte_unused, struct virtqueue *vq)
{
rte_write16(vq->vq_queue_index, vq->notify_addr);
}
const struct virtio_pci_ops modern_ops = {
.read_dev_cfg = modern_read_dev_config,
.write_dev_cfg = modern_write_dev_config,
.reset = modern_reset,
.get_status = modern_get_status,
.set_status = modern_set_status,
.get_features = modern_get_features,
.set_features = modern_set_features,
.get_isr = modern_get_isr,
.set_config_irq = modern_set_config_irq,
.set_queue_irq = modern_set_queue_irq,
.get_queue_num = modern_get_queue_num,
.setup_queue = modern_setup_queue,
.del_queue = modern_del_queue,
.notify_queue = modern_notify_queue,
};
void
vtpci_read_dev_config(struct virtio_hw *hw, size_t offset,
void *dst, int length)
{
VTPCI_OPS(hw)->read_dev_cfg(hw, offset, dst, length);
}
void
vtpci_write_dev_config(struct virtio_hw *hw, size_t offset,
const void *src, int length)
{
VTPCI_OPS(hw)->write_dev_cfg(hw, offset, src, length);
}
uint64_t
vtpci_negotiate_features(struct virtio_hw *hw, uint64_t host_features)
{
uint64_t features;
/*
* Limit negotiated features to what the driver, virtqueue, and
* host all support.
*/
features = host_features & hw->guest_features;
VTPCI_OPS(hw)->set_features(hw, features);
return features;
}
void
vtpci_reset(struct virtio_hw *hw)
{
VTPCI_OPS(hw)->set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
/* flush status write */
VTPCI_OPS(hw)->get_status(hw);
}
void
vtpci_reinit_complete(struct virtio_hw *hw)
{
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER_OK);
}
void
vtpci_set_status(struct virtio_hw *hw, uint8_t status)
{
if (status != VIRTIO_CONFIG_STATUS_RESET)
status |= VTPCI_OPS(hw)->get_status(hw);
VTPCI_OPS(hw)->set_status(hw, status);
}
uint8_t
vtpci_get_status(struct virtio_hw *hw)
{
return VTPCI_OPS(hw)->get_status(hw);
}
uint8_t
vtpci_isr(struct virtio_hw *hw)
{
return VTPCI_OPS(hw)->get_isr(hw);
}
static void *
get_cfg_addr(struct rte_pci_device *dev, struct virtio_pci_cap *cap)
{
uint8_t bar = cap->bar;
uint32_t length = cap->length;
uint32_t offset = cap->offset;
uint8_t *base;
if (bar > 5) {
PMD_INIT_LOG(ERR, "invalid bar: %u", bar);
return NULL;
}
if (offset + length < offset) {
PMD_INIT_LOG(ERR, "offset(%u) + length(%u) overflows",
offset, length);
return NULL;
}
if (offset + length > dev->mem_resource[bar].len) {
PMD_INIT_LOG(ERR,
"invalid cap: overflows bar space: %u > %" PRIu64,
offset + length, dev->mem_resource[bar].len);
return NULL;
}
base = dev->mem_resource[bar].addr;
if (base == NULL) {
PMD_INIT_LOG(ERR, "bar %u base addr is NULL", bar);
return NULL;
}
return base + offset;
}
static int
virtio_read_caps(struct rte_pci_device *dev, struct virtio_hw *hw)
{
uint8_t pos;
struct virtio_pci_cap cap;
int ret;
if (rte_pci_map_device(dev)) {
PMD_INIT_LOG(DEBUG, "failed to map pci device!");
return -1;
}
ret = rte_pci_read_config(dev, &pos, 1, PCI_CAPABILITY_LIST);
if (ret < 0) {
PMD_INIT_LOG(DEBUG, "failed to read pci capability list");
return -1;
}
while (pos) {
ret = rte_pci_read_config(dev, &cap, sizeof(cap), pos);
if (ret < 0) {
PMD_INIT_LOG(ERR,
"failed to read pci cap at pos: %x", pos);
break;
}
if (cap.cap_vndr == PCI_CAP_ID_MSIX)
hw->use_msix = 1;
if (cap.cap_vndr != PCI_CAP_ID_VNDR) {
PMD_INIT_LOG(DEBUG,
"[%2x] skipping non VNDR cap id: %02x",
pos, cap.cap_vndr);
goto next;
}
PMD_INIT_LOG(DEBUG,
"[%2x] cfg type: %u, bar: %u, offset: %04x, len: %u",
pos, cap.cfg_type, cap.bar, cap.offset, cap.length);
switch (cap.cfg_type) {
case VIRTIO_PCI_CAP_COMMON_CFG:
hw->common_cfg = get_cfg_addr(dev, &cap);
break;
case VIRTIO_PCI_CAP_NOTIFY_CFG:
rte_pci_read_config(dev, &hw->notify_off_multiplier,
4, pos + sizeof(cap));
hw->notify_base = get_cfg_addr(dev, &cap);
break;
case VIRTIO_PCI_CAP_DEVICE_CFG:
hw->dev_cfg = get_cfg_addr(dev, &cap);
break;
case VIRTIO_PCI_CAP_ISR_CFG:
hw->isr = get_cfg_addr(dev, &cap);
break;
}
next:
pos = cap.cap_next;
}
if (hw->common_cfg == NULL || hw->notify_base == NULL ||
hw->dev_cfg == NULL || hw->isr == NULL) {
PMD_INIT_LOG(INFO, "no modern virtio pci device found.");
return -1;
}
PMD_INIT_LOG(INFO, "found modern virtio pci device.");
PMD_INIT_LOG(DEBUG, "common cfg mapped at: %p", hw->common_cfg);
PMD_INIT_LOG(DEBUG, "device cfg mapped at: %p", hw->dev_cfg);
PMD_INIT_LOG(DEBUG, "isr cfg mapped at: %p", hw->isr);
PMD_INIT_LOG(DEBUG, "notify base: %p, notify off multiplier: %u",
hw->notify_base, hw->notify_off_multiplier);
return 0;
}
/*
* Return -1:
* if there is an error mapping with VFIO/UIO.
* if the port map fails when the driver type is KDRV_NONE.
* if the device is whitelisted but the driver type is KDRV_UNKNOWN.
* Return 1 if a kernel driver is managing the device.
* Return 0 on success.
*/
int
vtpci_init(struct rte_pci_device *dev, struct virtio_hw *hw)
{
/*
* First try to read the virtio PCI caps, which exist only on modern
* PCI devices. If that fails, fall back to legacy virtio handling.
*/
if (virtio_read_caps(dev, hw) == 0) {
PMD_INIT_LOG(INFO, "modern virtio pci detected.");
virtio_hw_internal[hw->port_id].vtpci_ops = &modern_ops;
hw->modern = 1;
return 0;
}
#if 0
PMD_INIT_LOG(INFO, "trying with legacy virtio pci.");
if (rte_pci_ioport_map(dev, 0, VTPCI_IO(hw)) < 0) {
if (dev->kdrv == RTE_KDRV_UNKNOWN &&
(!dev->device.devargs ||
dev->device.devargs->type !=
RTE_DEVTYPE_WHITELISTED_PCI)) {
PMD_INIT_LOG(INFO,
"skip kernel managed virtio device.");
return 1;
}
return -1;
}
#endif
virtio_hw_internal[hw->port_id].vtpci_ops = &legacy_ops;
hw->modern = 0;
return 0;
}

lib/bdev/virtio/rte_virtio/virtio_pci.h (new file, 289 lines)

@@ -0,0 +1,289 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _VIRTIO_PCI_H_
#define _VIRTIO_PCI_H_
#include <stdint.h>
#include <rte_pci.h>
struct virtqueue;
/* VirtIO PCI vendor/device ID. */
#define VIRTIO_PCI_VENDORID 0x1AF4
#define VIRTIO_PCI_DEVICEID_SCSI_MODERN 0x1004
/* VirtIO ABI version, this must match exactly. */
#define VIRTIO_PCI_ABI_VERSION 0
/*
* VirtIO Header, located in BAR 0.
*/
#define VIRTIO_PCI_HOST_FEATURES 0 /* host's supported features (32bit, RO)*/
#define VIRTIO_PCI_GUEST_FEATURES 4 /* guest's supported features (32, RW) */
#define VIRTIO_PCI_QUEUE_PFN 8 /* physical address of VQ (32, RW) */
#define VIRTIO_PCI_QUEUE_NUM 12 /* number of ring entries (16, RO) */
#define VIRTIO_PCI_QUEUE_SEL 14 /* current VQ selection (16, RW) */
#define VIRTIO_PCI_QUEUE_NOTIFY 16 /* notify host regarding VQ (16, RW) */
#define VIRTIO_PCI_STATUS 18 /* device status register (8, RW) */
#define VIRTIO_PCI_ISR 19 /* interrupt status register, reading
* also clears the register (8, RO) */
/* Only if MSIX is enabled: */
#define VIRTIO_MSI_CONFIG_VECTOR 20 /* configuration change vector (16, RW) */
#define VIRTIO_MSI_QUEUE_VECTOR 22 /* vector for selected VQ notifications
(16, RW) */
/* The bit of the ISR which indicates a device has an interrupt. */
#define VIRTIO_PCI_ISR_INTR 0x1
/* The bit of the ISR which indicates a device configuration change. */
#define VIRTIO_PCI_ISR_CONFIG 0x2
/* Vector value used to disable MSI for queue. */
#define VIRTIO_MSI_NO_VECTOR 0xFFFF
/* VirtIO device IDs. */
#define VIRTIO_ID_NETWORK 0x01
#define VIRTIO_ID_BLOCK 0x02
#define VIRTIO_ID_CONSOLE 0x03
#define VIRTIO_ID_ENTROPY 0x04
#define VIRTIO_ID_BALLOON 0x05
#define VIRTIO_ID_IOMEMORY 0x06
#define VIRTIO_ID_9P 0x09
/* Status byte for guest to report progress. */
#define VIRTIO_CONFIG_STATUS_RESET 0x00
#define VIRTIO_CONFIG_STATUS_ACK 0x01
#define VIRTIO_CONFIG_STATUS_DRIVER 0x02
#define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
#define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
#define VIRTIO_CONFIG_STATUS_FAILED 0x80
/*
* Each virtqueue indirect descriptor list must be physically contiguous.
* To allow us to malloc(9) each list individually, limit the number
* supported to what will fit in one page. With 4KB pages, this is a limit
* of 256 descriptors. If there is ever a need for more, we can switch to
* contigmalloc(9) for the larger allocations, similar to what
* bus_dmamem_alloc(9) does.
*
* Note the sizeof(struct vring_desc) is 16 bytes.
*/
#define VIRTIO_MAX_INDIRECT ((int) (PAGE_SIZE / 16))
#define VIRTIO_SCSI_F_INOUT 0
/* Do we get callbacks when the ring is completely used, even if we've
* suppressed them? */
#define VIRTIO_F_NOTIFY_ON_EMPTY 24
/* Can the device handle any descriptor layout? */
#define VIRTIO_F_ANY_LAYOUT 27
/* We support indirect buffer descriptors */
#define VIRTIO_RING_F_INDIRECT_DESC 28
#define VIRTIO_F_VERSION_1 32
#define VIRTIO_F_IOMMU_PLATFORM 33
/*
* Some VirtIO feature bits (currently bits 28 through 31) are
* reserved for the transport being used (eg. virtio_ring), the
* rest are per-device feature bits.
*/
#define VIRTIO_TRANSPORT_F_START 28
#define VIRTIO_TRANSPORT_F_END 34
/* The Guest publishes the used index for which it expects an interrupt
* at the end of the avail ring. Host should ignore the avail->flags field. */
/* The Host publishes the avail index for which it expects a kick
* at the end of the used ring. Guest should ignore the used->flags field. */
#define VIRTIO_RING_F_EVENT_IDX 29
/* Common configuration */
#define VIRTIO_PCI_CAP_COMMON_CFG 1
/* Notifications */
#define VIRTIO_PCI_CAP_NOTIFY_CFG 2
/* ISR Status */
#define VIRTIO_PCI_CAP_ISR_CFG 3
/* Device specific configuration */
#define VIRTIO_PCI_CAP_DEVICE_CFG 4
/* PCI configuration access */
#define VIRTIO_PCI_CAP_PCI_CFG 5
/* This is the PCI capability header: */
struct virtio_pci_cap {
uint8_t cap_vndr; /* Generic PCI field: PCI_CAP_ID_VNDR */
uint8_t cap_next; /* Generic PCI field: next ptr. */
uint8_t cap_len; /* Generic PCI field: capability length */
uint8_t cfg_type; /* Identifies the structure. */
uint8_t bar; /* Where to find it. */
uint8_t padding[3]; /* Pad to full dword. */
uint32_t offset; /* Offset within bar. */
uint32_t length; /* Length of the structure, in bytes. */
};
struct virtio_pci_notify_cap {
struct virtio_pci_cap cap;
uint32_t notify_off_multiplier; /* Multiplier for queue_notify_off. */
};
/* Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
struct virtio_pci_common_cfg {
/* About the whole device. */
uint32_t device_feature_select; /* read-write */
uint32_t device_feature; /* read-only */
uint32_t guest_feature_select; /* read-write */
uint32_t guest_feature; /* read-write */
uint16_t msix_config; /* read-write */
uint16_t num_queues; /* read-only */
uint8_t device_status; /* read-write */
uint8_t config_generation; /* read-only */
/* About a specific virtqueue. */
uint16_t queue_select; /* read-write */
uint16_t queue_size; /* read-write, power of 2. */
uint16_t queue_msix_vector; /* read-write */
uint16_t queue_enable; /* read-write */
uint16_t queue_notify_off; /* read-only */
uint32_t queue_desc_lo; /* read-write */
uint32_t queue_desc_hi; /* read-write */
uint32_t queue_avail_lo; /* read-write */
uint32_t queue_avail_hi; /* read-write */
uint32_t queue_used_lo; /* read-write */
uint32_t queue_used_hi; /* read-write */
};
struct virtio_hw;
struct virtio_pci_ops {
void (*read_dev_cfg)(struct virtio_hw *hw, size_t offset,
void *dst, int len);
void (*write_dev_cfg)(struct virtio_hw *hw, size_t offset,
const void *src, int len);
void (*reset)(struct virtio_hw *hw);
uint8_t (*get_status)(struct virtio_hw *hw);
void (*set_status)(struct virtio_hw *hw, uint8_t status);
uint64_t (*get_features)(struct virtio_hw *hw);
void (*set_features)(struct virtio_hw *hw, uint64_t features);
uint8_t (*get_isr)(struct virtio_hw *hw);
uint16_t (*set_config_irq)(struct virtio_hw *hw, uint16_t vec);
uint16_t (*set_queue_irq)(struct virtio_hw *hw, struct virtqueue *vq,
uint16_t vec);
uint16_t (*get_queue_num)(struct virtio_hw *hw, uint16_t queue_id);
int (*setup_queue)(struct virtio_hw *hw, struct virtqueue *vq);
void (*del_queue)(struct virtio_hw *hw, struct virtqueue *vq);
void (*notify_queue)(struct virtio_hw *hw, struct virtqueue *vq);
};
struct virtio_net_config;
struct virtio_hw {
uint64_t req_guest_features;
uint64_t guest_features;
uint32_t max_queues;
uint16_t started;
uint8_t use_msix;
uint8_t modern;
uint8_t port_id;
uint32_t notify_off_multiplier;
uint8_t *isr;
uint16_t *notify_base;
struct virtio_pci_common_cfg *common_cfg;
struct rte_pci_device *pci_dev;
struct virtio_scsi_config *dev_cfg;
void *virtio_user_dev;
struct virtqueue **vqs;
void **tx_queues;
uint32_t nb_tx_queues;
};
/*
 * While virtio_hw is stored in shared memory, this structure locally
 * stores fields that may differ between processes in the multi-process
 * model, for example the vtpci_ops pointer.
 */
struct virtio_hw_internal {
const struct virtio_pci_ops *vtpci_ops;
struct rte_pci_ioport io;
};
#define VTPCI_OPS(hw) (virtio_hw_internal[(hw)->port_id].vtpci_ops)
#define VTPCI_IO(hw) (&virtio_hw_internal[(hw)->port_id].io)
extern struct virtio_hw_internal virtio_hw_internal[128];
/*
* How many bits to shift physical queue address written to QUEUE_PFN.
* 12 is historical, and due to x86 page size.
*/
#define VIRTIO_PCI_QUEUE_ADDR_SHIFT 12
/* The alignment to use between consumer and producer parts of vring. */
#define VIRTIO_PCI_VRING_ALIGN 4096
static inline int
vtpci_with_feature(struct virtio_hw *hw, uint64_t bit)
{
return (hw->guest_features & (1ULL << bit)) != 0;
}
/*
* Function declaration from virtio_pci.c
*/
int vtpci_init(struct rte_pci_device *dev, struct virtio_hw *hw);
void vtpci_reset(struct virtio_hw *);
void vtpci_reinit_complete(struct virtio_hw *);
uint8_t vtpci_get_status(struct virtio_hw *);
void vtpci_set_status(struct virtio_hw *, uint8_t);
uint64_t vtpci_negotiate_features(struct virtio_hw *, uint64_t);
void vtpci_write_dev_config(struct virtio_hw *, size_t, const void *, int);
void vtpci_read_dev_config(struct virtio_hw *, size_t, void *, int);
uint8_t vtpci_isr(struct virtio_hw *);
extern const struct virtio_pci_ops virtio_user_ops;
#endif /* _VIRTIO_PCI_H_ */


@ -0,0 +1,163 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _VIRTIO_RING_H_
#define _VIRTIO_RING_H_
#include <stdint.h>
#include <rte_common.h>
/* This marks a buffer as continuing via the next field. */
#define VRING_DESC_F_NEXT 1
/* This marks a buffer as write-only (otherwise read-only). */
#define VRING_DESC_F_WRITE 2
/* This means the buffer contains a list of buffer descriptors. */
#define VRING_DESC_F_INDIRECT 4
/* The Host uses this in used->flags to advise the Guest: don't kick me
* when you add a buffer. It's unreliable, so it's simply an
* optimization. Guest will still kick if it's out of buffers. */
#define VRING_USED_F_NO_NOTIFY 1
/* The Guest uses this in avail->flags to advise the Host: don't
* interrupt me when you consume a buffer. It's unreliable, so it's
* simply an optimization. */
#define VRING_AVAIL_F_NO_INTERRUPT 1
/* VirtIO ring descriptors: 16 bytes.
* These can chain together via "next". */
struct vring_desc {
uint64_t addr; /* Address (guest-physical). */
uint32_t len; /* Length. */
uint16_t flags; /* The flags as indicated above. */
uint16_t next; /* We chain unused descriptors via this. */
};
struct vring_avail {
uint16_t flags;
uint16_t idx;
uint16_t ring[0];
};
/* id is a 16-bit index; uint32_t is used here for padding reasons. */
struct vring_used_elem {
/* Index of start of used descriptor chain. */
uint32_t id;
/* Total length of the descriptor chain which was written to. */
uint32_t len;
};
struct vring_used {
uint16_t flags;
volatile uint16_t idx;
struct vring_used_elem ring[0];
};
struct vring {
unsigned int num;
struct vring_desc *desc;
struct vring_avail *avail;
struct vring_used *used;
};
/* The standard layout for the ring is a contiguous chunk of memory which
* looks like this. We assume num is a power of 2.
*
* struct vring {
* // The actual descriptors (16 bytes each)
* struct vring_desc desc[num];
*
* // A ring of available descriptor heads with free-running index.
* __u16 avail_flags;
* __u16 avail_idx;
* __u16 available[num];
* __u16 used_event_idx;
*
* // Padding to the next align boundary.
* char pad[];
*
* // A ring of used descriptor heads with free-running index.
* __u16 used_flags;
* __u16 used_idx;
* struct vring_used_elem used[num];
* __u16 avail_event_idx;
* };
*
* NOTE: for VirtIO PCI, align is 4096.
*/
/*
* We publish the used event index at the end of the available ring, and vice
* versa. They are at the end for backwards compatibility.
*/
#define vring_used_event(vr) ((vr)->avail->ring[(vr)->num])
#define vring_avail_event(vr) (*(uint16_t *)&(vr)->used->ring[(vr)->num])
static inline size_t
vring_size(unsigned int num, unsigned long align)
{
size_t size;
size = num * sizeof(struct vring_desc);
size += sizeof(struct vring_avail) + (num * sizeof(uint16_t));
size = RTE_ALIGN_CEIL(size, align);
size += sizeof(struct vring_used) +
(num * sizeof(struct vring_used_elem));
return size;
}
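/*
 * Worked example (a sketch, assuming 4 KB pages and 16-byte descriptors):
 * for num = 256 and align = VIRTIO_PCI_VRING_ALIGN (4096):
 *   desc:  256 * 16    = 4096 bytes
 *   avail: 4 + 256 * 2 =  516 bytes -> 4612 total
 *   align up to 4096               -> 8192
 *   used:  4 + 256 * 8 = 2052 bytes -> 10244 total
 * Note this size does not include the trailing used_event/avail_event
 * words used with VIRTIO_RING_F_EVENT_IDX.
 */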
static inline void
vring_init(struct vring *vr, unsigned int num, uint8_t *p,
unsigned long align)
{
vr->num = num;
vr->desc = (struct vring_desc *) p;
vr->avail = (struct vring_avail *) (p +
num * sizeof(struct vring_desc));
vr->used = (void *)
RTE_ALIGN_CEIL((uintptr_t)(&vr->avail->ring[num]), align);
}
/*
 * The following is used with VIRTIO_RING_F_EVENT_IDX.
 * Assuming a given event_idx value from the other side, if we have
 * just incremented the index from old to new_idx, should we trigger an
 * event?
 */
static inline int
vring_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}
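/*
 * Worked example: with old = 10, new_idx = 12 and event_idx = 10, the
 * check is (uint16_t)(12 - 10 - 1) = 1 < (uint16_t)(12 - 10) = 2, so an
 * event is triggered; with event_idx = 12 it is 65535 < 2, so the event
 * is suppressed. The uint16_t arithmetic keeps the comparison correct
 * across index wraparound.
 */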
#endif /* _VIRTIO_RING_H_ */


@ -0,0 +1,264 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <rte_cycles.h>
#include <rte_memory.h>
#include <rte_memzone.h>
#include <rte_branch_prediction.h>
#include <rte_prefetch.h>
#include "virtio_logs.h"
#include "virtio_ethdev.h"
#include "virtio_pci.h"
#include "virtqueue.h"
#include "virtio_rxtx.h"
#include "spdk/env.h"
static void
vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
{
struct vring_desc *dp, *dp_tail;
struct vq_desc_extra *dxp;
uint16_t desc_idx_last = desc_idx;
dp = &vq->vq_ring.desc[desc_idx];
dxp = &vq->vq_descx[desc_idx];
vq->vq_free_cnt = (uint16_t)(vq->vq_free_cnt + dxp->ndescs);
if ((dp->flags & VRING_DESC_F_INDIRECT) == 0) {
while (dp->flags & VRING_DESC_F_NEXT) {
desc_idx_last = dp->next;
dp = &vq->vq_ring.desc[dp->next];
}
}
dxp->ndescs = 0;
/*
 * We must append the existing free chain, if any, to the end of the
 * newly freed chain. If the virtqueue was completely used, then the
 * head would be VQ_RING_DESC_CHAIN_END.
 */
if (vq->vq_desc_tail_idx == VQ_RING_DESC_CHAIN_END) {
vq->vq_desc_head_idx = desc_idx;
} else {
dp_tail = &vq->vq_ring.desc[vq->vq_desc_tail_idx];
dp_tail->next = desc_idx;
}
vq->vq_desc_tail_idx = desc_idx_last;
dp->next = VQ_RING_DESC_CHAIN_END;
}
static uint16_t
virtqueue_dequeue_burst_rx(struct virtqueue *vq, struct virtio_req **rx_pkts,
uint32_t *len, uint16_t num)
{
struct vring_used_elem *uep;
struct virtio_req *cookie;
uint16_t used_idx, desc_idx;
uint16_t i;
/* Caller does the check */
for (i = 0; i < num ; i++) {
used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
uep = &vq->vq_ring.used->ring[used_idx];
desc_idx = (uint16_t) uep->id;
len[i] = uep->len;
cookie = (struct virtio_req *)vq->vq_descx[desc_idx].cookie;
if (unlikely(cookie == NULL)) {
PMD_DRV_LOG(ERR, "vring descriptor with no mbuf cookie at %u",
vq->vq_used_cons_idx);
break;
}
rte_prefetch0(cookie);
rx_pkts[i] = cookie;
vq->vq_used_cons_idx++;
vq_ring_free_chain(vq, desc_idx);
vq->vq_descx[desc_idx].cookie = NULL;
}
return i;
}
#ifndef DEFAULT_TX_FREE_THRESH
#define DEFAULT_TX_FREE_THRESH 32
#endif
/* avoid a write operation when possible, to lessen cache pressure */
#define ASSIGN_UNLESS_EQUAL(var, val) do { \
if ((var) != (val)) \
(var) = (val); \
} while (0)
static inline void
virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct virtio_req *req)
{
struct vq_desc_extra *dxp;
struct virtqueue *vq = txvq->vq;
struct vring_desc *start_dp;
uint32_t i;
uint16_t head_idx, idx;
struct iovec *iov = req->iov;
head_idx = vq->vq_desc_head_idx;
idx = head_idx;
dxp = &vq->vq_descx[idx];
dxp->cookie = (void *)req;
dxp->ndescs = req->iovcnt;
start_dp = vq->vq_ring.desc;
for (i = 0; i < req->iovcnt; i++) {
if (vq->hw->virtio_user_dev) {
start_dp[idx].addr = (uintptr_t)iov[i].iov_base;
} else {
start_dp[idx].addr = spdk_vtophys(iov[i].iov_base);
}
start_dp[idx].len = iov[i].iov_len;
start_dp[idx].flags = (i >= req->start_write ? VRING_DESC_F_WRITE : 0);
if ((i + 1) != req->iovcnt) {
start_dp[idx].flags |= VRING_DESC_F_NEXT;
}
idx = start_dp[idx].next;
}
vq->vq_desc_head_idx = idx;
if (vq->vq_desc_head_idx == VQ_RING_DESC_CHAIN_END)
vq->vq_desc_tail_idx = idx;
vq->vq_free_cnt = (uint16_t)(vq->vq_free_cnt - req->iovcnt);
vq_update_avail_ring(vq, head_idx);
}
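/*
 * Note the address translation above: in vhost-user mode the backend
 * shares this process's address space, so the iovec virtual addresses
 * are used directly; in PCI mode the device needs physical addresses,
 * hence spdk_vtophys().
 */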
/*
 * struct virtio_hw *hw: device owning the queue
 * uint16_t queue_idx: index into the device's virtqueue list
 * uint16_t nb_desc: 0 (or a value beyond the ring size) means use the
 *                   ring size read from config space
 * unsigned int socket_id: unused here, kept for API compatibility
 */
int
virtio_dev_tx_queue_setup(struct virtio_hw *hw,
uint16_t queue_idx,
uint16_t nb_desc,
unsigned int socket_id __rte_unused)
{
struct virtqueue *vq = hw->vqs[queue_idx];
struct virtnet_tx *txvq;
PMD_INIT_FUNC_TRACE();
if (nb_desc == 0 || nb_desc > vq->vq_nentries)
nb_desc = vq->vq_nentries;
vq->vq_free_cnt = RTE_MIN(vq->vq_free_cnt, nb_desc);
txvq = &vq->txq;
txvq->queue_id = queue_idx;
hw->tx_queues[queue_idx] = txvq;
return 0;
}
#define VIRTIO_MBUF_BURST_SZ 64
#define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc))
uint16_t
virtio_recv_pkts(void *rx_queue, struct virtio_req **reqs, uint16_t nb_pkts)
{
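/* This driver defines only one queue struct type; virtnet_tx is reused
 * for the receive path as well.
 */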
struct virtnet_tx *rxvq = rx_queue;
struct virtqueue *vq = rxvq->vq;
struct virtio_hw *hw = vq->hw;
struct virtio_req *rxm;
uint16_t nb_used, num, nb_rx;
uint32_t len[VIRTIO_MBUF_BURST_SZ];
struct virtio_req *rcv_pkts[VIRTIO_MBUF_BURST_SZ];
uint32_t i;
nb_rx = 0;
if (unlikely(hw->started == 0))
return nb_rx;
nb_used = VIRTQUEUE_NUSED(vq);
virtio_rmb();
num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts);
num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? num : VIRTIO_MBUF_BURST_SZ);
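/* Trim the batch so consumption ends on a descriptor cacheline
 * boundary, keeping subsequent dequeues cache-friendly.
 */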
if (likely(num > DESC_PER_CACHELINE))
num = num - ((vq->vq_used_cons_idx + num) % DESC_PER_CACHELINE);
num = virtqueue_dequeue_burst_rx(vq, rcv_pkts, len, num);
PMD_RX_LOG(DEBUG, "used:%d dequeue:%d", nb_used, num);
for (i = 0; i < num ; i++) {
rxm = rcv_pkts[i];
PMD_RX_LOG(DEBUG, "packet len:%d", len[i]);
rxm->data_transferred = (uint16_t)(len[i]);
reqs[nb_rx++] = rxm;
}
return nb_rx;
}
uint16_t
virtio_xmit_pkts(void *tx_queue, struct virtio_req *req)
{
struct virtnet_tx *txvq = tx_queue;
struct virtqueue *vq = txvq->vq;
struct virtio_hw *hw = vq->hw;
if (unlikely(hw->started == 0))
return 0;
virtio_rmb();
virtqueue_enqueue_xmit(txvq, req);
vq_update_avail_idx(vq);
if (unlikely(virtqueue_kick_prepare(vq))) {
virtqueue_notify(vq);
PMD_TX_LOG(DEBUG, "Notified backend after xmit");
}
return 1;
}


@ -0,0 +1,46 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _VIRTIO_RXTX_H_
#define _VIRTIO_RXTX_H_
struct virtnet_tx {
struct virtqueue *vq;
uint16_t queue_id; /**< DPDK queue index. */
uint8_t port_id; /**< Device port identifier. */
const struct rte_memzone *mz; /**< mem zone to populate TX ring. */
};
#endif /* _VIRTIO_RXTX_H_ */


@ -0,0 +1,120 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _VHOST_NET_USER_H
#define _VHOST_NET_USER_H
#include <stdint.h>
#include <linux/types.h>
#include <linux/ioctl.h>
#include "../virtio_pci.h"
#include "../virtio_logs.h"
#include "../virtqueue.h"
struct vhost_vring_state {
unsigned int index;
unsigned int num;
};
struct vhost_vring_file {
unsigned int index;
int fd;
};
struct vhost_vring_addr {
unsigned int index;
/* Option flags. */
unsigned int flags;
/* Flag values: */
/* Whether log address is valid. If set enables logging. */
#define VHOST_VRING_F_LOG 0
/* Start of array of descriptors (virtually contiguous) */
uint64_t desc_user_addr;
/* Used structure address. Must be 32 bit aligned */
uint64_t used_user_addr;
/* Available structure address. Must be 16 bit aligned */
uint64_t avail_user_addr;
/* Logging support. */
/* Log writes to used structure, at offset calculated from specified
* address. Address must be 32 bit aligned.
*/
uint64_t log_guest_addr;
};
enum vhost_user_request {
VHOST_USER_NONE = 0,
VHOST_USER_GET_FEATURES = 1,
VHOST_USER_SET_FEATURES = 2,
VHOST_USER_SET_OWNER = 3,
VHOST_USER_RESET_OWNER = 4,
VHOST_USER_SET_MEM_TABLE = 5,
VHOST_USER_SET_LOG_BASE = 6,
VHOST_USER_SET_LOG_FD = 7,
VHOST_USER_SET_VRING_NUM = 8,
VHOST_USER_SET_VRING_ADDR = 9,
VHOST_USER_SET_VRING_BASE = 10,
VHOST_USER_GET_VRING_BASE = 11,
VHOST_USER_SET_VRING_KICK = 12,
VHOST_USER_SET_VRING_CALL = 13,
VHOST_USER_SET_VRING_ERR = 14,
VHOST_USER_GET_PROTOCOL_FEATURES = 15,
VHOST_USER_SET_PROTOCOL_FEATURES = 16,
VHOST_USER_GET_QUEUE_NUM = 17,
VHOST_USER_SET_VRING_ENABLE = 18,
VHOST_USER_MAX
};
extern const char * const vhost_msg_strings[VHOST_USER_MAX];
struct vhost_memory_region {
uint64_t guest_phys_addr;
uint64_t memory_size; /* bytes */
uint64_t userspace_addr;
uint64_t mmap_offset;
};
struct virtio_user_dev;
struct virtio_user_backend_ops {
int (*setup)(struct virtio_user_dev *dev);
int (*send_request)(struct virtio_user_dev *dev,
enum vhost_user_request req,
void *arg);
};
extern struct virtio_user_backend_ops ops_user;
extern struct virtio_user_backend_ops ops_kernel;
#endif


@ -0,0 +1,453 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/un.h>
#include <string.h>
#include <errno.h>
#include "vhost.h"
#include "virtio_user_dev.h"
/* The version of the protocol we support */
#define VHOST_USER_VERSION 0x1
#define VHOST_MEMORY_MAX_NREGIONS 8
struct vhost_memory {
uint32_t nregions;
uint32_t padding;
struct vhost_memory_region regions[VHOST_MEMORY_MAX_NREGIONS];
};
struct vhost_user_msg {
enum vhost_user_request request;
#define VHOST_USER_VERSION_MASK 0x3
#define VHOST_USER_REPLY_MASK (0x1 << 2)
uint32_t flags;
uint32_t size; /* the following payload size */
union {
#define VHOST_USER_VRING_IDX_MASK 0xff
#define VHOST_USER_VRING_NOFD_MASK (0x1 << 8)
uint64_t u64;
struct vhost_vring_state state;
struct vhost_vring_addr addr;
struct vhost_memory memory;
} payload;
int fds[VHOST_MEMORY_MAX_NREGIONS];
} __attribute((packed));
#define VHOST_USER_HDR_SIZE offsetof(struct vhost_user_msg, payload.u64)
#define VHOST_USER_PAYLOAD_SIZE \
(sizeof(struct vhost_user_msg) - VHOST_USER_HDR_SIZE)
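/*
 * Wire-format sketch, as implied by the struct above: each message is a
 * 12-byte header (request, flags, size) followed by "size" bytes of
 * payload; any file descriptors travel out of band as SCM_RIGHTS
 * ancillary data, not in the byte stream itself.
 */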
static int
vhost_user_write(int fd, void *buf, int len, int *fds, int fd_num)
{
int r;
struct msghdr msgh;
struct iovec iov;
size_t fd_size = fd_num * sizeof(int);
char control[CMSG_SPACE(fd_size)];
struct cmsghdr *cmsg;
memset(&msgh, 0, sizeof(msgh));
memset(control, 0, sizeof(control));
iov.iov_base = (uint8_t *)buf;
iov.iov_len = len;
msgh.msg_iov = &iov;
msgh.msg_iovlen = 1;
if (fds && fd_num > 0) {
msgh.msg_control = control;
msgh.msg_controllen = sizeof(control);
cmsg = CMSG_FIRSTHDR(&msgh);
cmsg->cmsg_len = CMSG_LEN(fd_size);
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_RIGHTS;
memcpy(CMSG_DATA(cmsg), fds, fd_size);
} else {
msgh.msg_control = NULL;
msgh.msg_controllen = 0;
}
do {
r = sendmsg(fd, &msgh, 0);
} while (r < 0 && errno == EINTR);
return r;
}
static int
vhost_user_read(int fd, struct vhost_user_msg *msg)
{
uint32_t valid_flags = VHOST_USER_REPLY_MASK | VHOST_USER_VERSION;
int ret, sz_hdr = VHOST_USER_HDR_SIZE, sz_payload;
ret = recv(fd, (void *)msg, sz_hdr, 0);
if (ret < sz_hdr) {
PMD_DRV_LOG(ERR, "Failed to recv msg hdr: %d instead of %d.",
ret, sz_hdr);
goto fail;
}
/* validate msg flags */
if (msg->flags != (valid_flags)) {
PMD_DRV_LOG(ERR, "Failed to recv msg: flags %x instead of %x.",
msg->flags, valid_flags);
goto fail;
}
sz_payload = msg->size;
if (sz_payload) {
ret = recv(fd, (void *)((char *)msg + sz_hdr), sz_payload, 0);
if (ret < sz_payload) {
PMD_DRV_LOG(ERR,
"Failed to recv msg payload: %d instead of %d.",
ret, msg->size);
goto fail;
}
}
return 0;
fail:
return -1;
}
struct hugepage_file_info {
uint64_t addr; /**< virtual addr */
size_t size; /**< the file size */
char path[PATH_MAX]; /**< path to backing file */
};
/* Two possible options:
 * 1. Match HUGEPAGE_INFO_FMT to find the file storing the struct
 *    hugepage_file array. This is simple but cannot be used in a secondary
 *    process, because the secondary process will close and munmap that file.
 * 2. Match HUGEFILE_FMT to find hugepage files directly.
 *
 * We choose option 2.
 */
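/*
 * For illustration, a matching /proc/self/maps line looks roughly like
 * (hypothetical path):
 *   7f1c40000000-7f1c80000000 rw-s 00000000 00:2c 55 /dev/hugepages/rtemap_0
 * The leading address range supplies addr and size, and the trailing
 * path must end in "map_<N>" to be treated as a hugepage backing file.
 */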
static int
get_hugepage_file_info(struct hugepage_file_info huges[], int max)
{
int idx;
FILE *f;
char buf[BUFSIZ], *tmp, *tail;
char *str_underline, *str_start;
int huge_index;
uint64_t v_start, v_end;
f = fopen("/proc/self/maps", "r");
if (!f) {
PMD_DRV_LOG(ERR, "cannot open /proc/self/maps");
return -1;
}
idx = 0;
while (fgets(buf, sizeof(buf), f) != NULL) {
if (sscanf(buf, "%" PRIx64 "-%" PRIx64, &v_start, &v_end) < 2) {
PMD_DRV_LOG(ERR, "Failed to parse address");
goto error;
}
tmp = strchr(buf, ' ') + 1; /** skip address */
tmp = strchr(tmp, ' ') + 1; /** skip perm */
tmp = strchr(tmp, ' ') + 1; /** skip offset */
tmp = strchr(tmp, ' ') + 1; /** skip dev */
tmp = strchr(tmp, ' ') + 1; /** skip inode */
while (*tmp == ' ') /** skip spaces */
tmp++;
tail = strrchr(tmp, '\n'); /** remove newline if exists */
if (tail)
*tail = '\0';
/* Match HUGEFILE_FMT, aka "%s/%smap_%d",
* which is defined in eal_filesystem.h
*/
str_underline = strrchr(tmp, '_');
if (!str_underline)
continue;
str_start = str_underline - strlen("map");
if (str_start < tmp)
continue;
if (sscanf(str_start, "map_%d", &huge_index) != 1)
continue;
if (idx >= max) {
PMD_DRV_LOG(ERR, "Exceed maximum of %d", max);
goto error;
}
huges[idx].addr = v_start;
huges[idx].size = v_end - v_start;
snprintf(huges[idx].path, PATH_MAX, "%s", tmp);
idx++;
}
fclose(f);
return idx;
error:
fclose(f);
return -1;
}
static int
prepare_vhost_memory_user(struct vhost_user_msg *msg, int fds[])
{
int i, num;
struct hugepage_file_info huges[VHOST_MEMORY_MAX_NREGIONS];
struct vhost_memory_region *mr;
num = get_hugepage_file_info(huges, VHOST_MEMORY_MAX_NREGIONS);
if (num < 0) {
PMD_INIT_LOG(ERR, "Failed to prepare memory for vhost-user");
return -1;
}
for (i = 0; i < num; ++i) {
mr = &msg->payload.memory.regions[i];
mr->guest_phys_addr = huges[i].addr; /* use vaddr, since virtio-user shares this process's address space */
mr->userspace_addr = huges[i].addr;
mr->memory_size = huges[i].size;
mr->mmap_offset = 0;
fds[i] = open(huges[i].path, O_RDWR);
}
msg->payload.memory.nregions = num;
msg->payload.memory.padding = 0;
return 0;
}
static struct vhost_user_msg m;
const char * const vhost_msg_strings[] = {
[VHOST_USER_SET_OWNER] = "VHOST_SET_OWNER",
[VHOST_USER_RESET_OWNER] = "VHOST_RESET_OWNER",
[VHOST_USER_SET_FEATURES] = "VHOST_SET_FEATURES",
[VHOST_USER_GET_FEATURES] = "VHOST_GET_FEATURES",
[VHOST_USER_SET_VRING_CALL] = "VHOST_SET_VRING_CALL",
[VHOST_USER_SET_VRING_NUM] = "VHOST_SET_VRING_NUM",
[VHOST_USER_SET_VRING_BASE] = "VHOST_SET_VRING_BASE",
[VHOST_USER_GET_VRING_BASE] = "VHOST_GET_VRING_BASE",
[VHOST_USER_SET_VRING_ADDR] = "VHOST_SET_VRING_ADDR",
[VHOST_USER_SET_VRING_KICK] = "VHOST_SET_VRING_KICK",
[VHOST_USER_SET_MEM_TABLE] = "VHOST_SET_MEM_TABLE",
[VHOST_USER_SET_VRING_ENABLE] = "VHOST_SET_VRING_ENABLE",
};
static int
vhost_user_sock(struct virtio_user_dev *dev,
enum vhost_user_request req,
void *arg)
{
struct vhost_user_msg msg;
struct vhost_vring_file *file = 0;
int need_reply = 0;
int fds[VHOST_MEMORY_MAX_NREGIONS];
int fd_num = 0;
int i, len;
int vhostfd = dev->vhostfd;
RTE_SET_USED(m);
PMD_DRV_LOG(INFO, "%s", vhost_msg_strings[req]);
msg.request = req;
msg.flags = VHOST_USER_VERSION;
msg.size = 0;
switch (req) {
case VHOST_USER_GET_FEATURES:
case VHOST_USER_GET_QUEUE_NUM:
need_reply = 1;
break;
case VHOST_USER_SET_FEATURES:
case VHOST_USER_SET_LOG_BASE:
msg.payload.u64 = *((__u64 *)arg);
msg.size = sizeof(m.payload.u64);
break;
case VHOST_USER_SET_OWNER:
case VHOST_USER_RESET_OWNER:
break;
case VHOST_USER_SET_MEM_TABLE:
if (prepare_vhost_memory_user(&msg, fds) < 0)
return -1;
fd_num = msg.payload.memory.nregions;
msg.size = sizeof(m.payload.memory.nregions);
msg.size += sizeof(m.payload.memory.padding);
msg.size += fd_num * sizeof(struct vhost_memory_region);
break;
case VHOST_USER_SET_LOG_FD:
fds[fd_num++] = *((int *)arg);
break;
case VHOST_USER_SET_VRING_NUM:
case VHOST_USER_SET_VRING_BASE:
case VHOST_USER_SET_VRING_ENABLE:
memcpy(&msg.payload.state, arg, sizeof(msg.payload.state));
msg.size = sizeof(m.payload.state);
break;
case VHOST_USER_GET_VRING_BASE:
memcpy(&msg.payload.state, arg, sizeof(msg.payload.state));
msg.size = sizeof(m.payload.state);
need_reply = 1;
break;
case VHOST_USER_SET_VRING_ADDR:
memcpy(&msg.payload.addr, arg, sizeof(msg.payload.addr));
msg.size = sizeof(m.payload.addr);
break;
case VHOST_USER_SET_VRING_KICK:
case VHOST_USER_SET_VRING_CALL:
case VHOST_USER_SET_VRING_ERR:
file = arg;
msg.payload.u64 = file->index & VHOST_USER_VRING_IDX_MASK;
msg.size = sizeof(m.payload.u64);
if (file->fd > 0)
fds[fd_num++] = file->fd;
else
msg.payload.u64 |= VHOST_USER_VRING_NOFD_MASK;
break;
default:
PMD_DRV_LOG(ERR, "trying to send unhandled msg type");
return -1;
}
len = VHOST_USER_HDR_SIZE + msg.size;
if (vhost_user_write(vhostfd, &msg, len, fds, fd_num) < 0) {
PMD_DRV_LOG(ERR, "%s failed: %s",
vhost_msg_strings[req], strerror(errno));
return -1;
}
if (req == VHOST_USER_SET_MEM_TABLE)
for (i = 0; i < fd_num; ++i)
close(fds[i]);
if (need_reply) {
if (vhost_user_read(vhostfd, &msg) < 0) {
PMD_DRV_LOG(ERR, "Received msg failed: %s",
strerror(errno));
return -1;
}
if (req != msg.request) {
PMD_DRV_LOG(ERR, "Received unexpected msg type");
return -1;
}
switch (req) {
case VHOST_USER_GET_FEATURES:
case VHOST_USER_GET_QUEUE_NUM:
if (msg.size != sizeof(m.payload.u64)) {
PMD_DRV_LOG(ERR, "Received bad msg size");
return -1;
}
*((__u64 *)arg) = msg.payload.u64;
break;
case VHOST_USER_GET_VRING_BASE:
if (msg.size != sizeof(m.payload.state)) {
PMD_DRV_LOG(ERR, "Received bad msg size");
return -1;
}
memcpy(arg, &msg.payload.state,
sizeof(struct vhost_vring_state));
break;
default:
PMD_DRV_LOG(ERR, "Received unexpected msg type");
return -1;
}
}
return 0;
}
/**
* Set up environment to talk with a vhost user backend.
*
* @return
* - (-1) on failure
* - (0) on success
*/
static int
vhost_user_setup(struct virtio_user_dev *dev)
{
int fd;
int flag;
struct sockaddr_un un;
fd = socket(AF_UNIX, SOCK_STREAM, 0);
if (fd < 0) {
PMD_DRV_LOG(ERR, "socket() error, %s", strerror(errno));
return -1;
}
flag = fcntl(fd, F_GETFD);
if (fcntl(fd, F_SETFD, flag | FD_CLOEXEC) < 0)
PMD_DRV_LOG(WARNING, "fcntl failed, %s", strerror(errno));
memset(&un, 0, sizeof(un));
un.sun_family = AF_UNIX;
snprintf(un.sun_path, sizeof(un.sun_path), "%s", dev->path);
if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
PMD_DRV_LOG(ERR, "connect error, %s", strerror(errno));
close(fd);
return -1;
}
dev->vhostfd = fd;
return 0;
}
struct virtio_user_backend_ops ops_user = {
.setup = vhost_user_setup,
.send_request = vhost_user_sock,
};


@ -0,0 +1,315 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
#include <stdio.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include "vhost.h"
#include "virtio_user_dev.h"
#include "../virtio_ethdev.h"
static int
virtio_user_create_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
{
/* Of all the per-virtqueue messages, make sure VHOST_USER_SET_VRING_CALL
 * is sent first, because vhost depends on this message to allocate the
 * virtqueue pair.
 */
struct vhost_vring_file file;
file.index = queue_sel;
file.fd = dev->callfds[queue_sel];
dev->ops->send_request(dev, VHOST_USER_SET_VRING_CALL, &file);
return 0;
}
static int
virtio_user_kick_queue(struct virtio_user_dev *dev, uint32_t queue_sel)
{
struct vhost_vring_file file;
struct vhost_vring_state state;
struct vring *vring = &dev->vrings[queue_sel];
struct vhost_vring_addr addr = {
.index = queue_sel,
.desc_user_addr = (uint64_t)(uintptr_t)vring->desc,
.avail_user_addr = (uint64_t)(uintptr_t)vring->avail,
.used_user_addr = (uint64_t)(uintptr_t)vring->used,
.log_guest_addr = 0,
.flags = 0, /* disable log */
};
state.index = queue_sel;
state.num = vring->num;
dev->ops->send_request(dev, VHOST_USER_SET_VRING_NUM, &state);
state.index = queue_sel;
state.num = 0; /* no reservation */
dev->ops->send_request(dev, VHOST_USER_SET_VRING_BASE, &state);
dev->ops->send_request(dev, VHOST_USER_SET_VRING_ADDR, &addr);
/* Of all the per-virtqueue messages, make sure VHOST_USER_SET_VRING_KICK
 * is sent last, because vhost uses this message to judge whether
 * virtio is ready.
 */
file.index = queue_sel;
file.fd = dev->kickfds[queue_sel];
dev->ops->send_request(dev, VHOST_USER_SET_VRING_KICK, &file);
return 0;
}
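/*
 * Taken together, the two helpers above produce this per-queue message
 * sequence: SET_VRING_CALL (so the backend allocates the queue), then
 * SET_VRING_NUM, SET_VRING_BASE and SET_VRING_ADDR, and finally
 * SET_VRING_KICK, which marks the queue ready on the backend side.
 */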
static int
virtio_user_queue_setup(struct virtio_user_dev *dev,
int (*fn)(struct virtio_user_dev *, uint32_t))
{
uint32_t i;
for (i = 0; i < dev->max_queues; ++i) {
if (fn(dev, i) < 0) {
PMD_DRV_LOG(INFO, "queue setup failed: %u", i);
return -1;
}
}
return 0;
}
int
virtio_user_start_device(struct virtio_user_dev *dev)
{
uint64_t features;
int ret;
/* Step 0: tell vhost to create queues */
if (virtio_user_queue_setup(dev, virtio_user_create_queue) < 0)
goto error;
/* Step 1: set features */
features = dev->features;
ret = dev->ops->send_request(dev, VHOST_USER_SET_FEATURES, &features);
if (ret < 0)
goto error;
PMD_DRV_LOG(INFO, "set features: %" PRIx64, features);
/* Step 2: share memory regions */
ret = dev->ops->send_request(dev, VHOST_USER_SET_MEM_TABLE, NULL);
if (ret < 0)
goto error;
/* Step 3: kick queues */
if (virtio_user_queue_setup(dev, virtio_user_kick_queue) < 0)
goto error;
return 0;
error:
/* TODO: free resources here, or have the caller check */
return -1;
}
int virtio_user_stop_device(struct virtio_user_dev *dev)
{
return 0;
}
int
is_vhost_user_by_type(const char *path)
{
struct stat sb;
if (stat(path, &sb) == -1)
return 0;
return S_ISSOCK(sb.st_mode);
}
static int
virtio_user_dev_init_notify(struct virtio_user_dev *dev)
{
uint32_t i, j;
int callfd;
int kickfd;
for (i = 0; i < VIRTIO_MAX_VIRTQUEUES; ++i) {
if (i >= dev->max_queues) {
dev->kickfds[i] = -1;
dev->callfds[i] = -1;
continue;
}
/* We could pass an invalid fd here, but some backends use the kickfd
 * and callfd as criteria to judge whether the device is alive, so we
 * use real eventfds.
 */
callfd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
if (callfd < 0) {
PMD_DRV_LOG(ERR, "callfd error, %s", strerror(errno));
break;
}
kickfd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
if (kickfd < 0) {
close(callfd);
PMD_DRV_LOG(ERR, "kickfd error, %s", strerror(errno));
break;
}
dev->callfds[i] = callfd;
dev->kickfds[i] = kickfd;
}
if (i < VIRTIO_MAX_VIRTQUEUES) {
for (j = 0; j < i; ++j) {
close(dev->callfds[j]);
close(dev->kickfds[j]);
}
return -1;
}
return 0;
}
static int
virtio_user_dev_setup(struct virtio_user_dev *dev)
{
dev->vhostfd = -1;
dev->vhostfds = NULL;
dev->tapfds = NULL;
dev->ops = &ops_user;
if (dev->ops->setup(dev) < 0)
return -1;
if (virtio_user_dev_init_notify(dev) < 0)
return -1;
return 0;
}
/* Use below macro to filter features from vhost backend */
#define VIRTIO_USER_SUPPORTED_FEATURES \
(1ULL << VIRTIO_SCSI_F_INOUT | \
1ULL << VIRTIO_F_VERSION_1)
struct virtio_hw *
virtio_user_dev_init(char *path, int queues, int queue_size)
{
struct virtio_hw *hw;
struct virtio_user_dev *dev;
uint64_t max_queues;
hw = calloc(1, sizeof(*hw));
dev = calloc(1, sizeof(struct virtio_user_dev));
hw->virtio_user_dev = dev;
virtio_hw_internal[hw->port_id].vtpci_ops = &virtio_user_ops;
snprintf(dev->path, PATH_MAX, "%s", path);
/* Account for the virtio-scsi control queue and event queue. */
dev->max_queues = queues + 2;
dev->queue_size = queue_size;
if (virtio_user_dev_setup(dev) < 0) {
PMD_INIT_LOG(ERR, "backend set up fails");
free(hw);
free(dev);
return NULL;
}
if (dev->ops->send_request(dev, VHOST_USER_GET_QUEUE_NUM, &max_queues) < 0) {
PMD_INIT_LOG(ERR, "get_queue_num fails: %s", strerror(errno));
free(hw);
free(dev);
return NULL;
}
if (dev->max_queues > max_queues) {
PMD_INIT_LOG(ERR, "%d queues requested but only %d supported", dev->max_queues, max_queues);
free(hw);
free(dev);
return NULL;
}
if (dev->ops->send_request(dev, VHOST_USER_SET_OWNER, NULL) < 0) {
PMD_INIT_LOG(ERR, "set_owner fails: %s", strerror(errno));
free(hw);
free(dev);
return NULL;
}
if (dev->ops->send_request(dev, VHOST_USER_GET_FEATURES,
&dev->device_features) < 0) {
PMD_INIT_LOG(ERR, "get_features failed: %s", strerror(errno));
free(hw);
free(dev);
return NULL;
}
dev->device_features &= VIRTIO_USER_SUPPORTED_FEATURES;
return hw;
}
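/*
 * Summary of the vhost-user handshake above: connect to the socket,
 * query GET_QUEUE_NUM to validate the requested queue count, claim the
 * device with SET_OWNER, then fetch GET_FEATURES and mask it down to
 * VIRTIO_USER_SUPPORTED_FEATURES. Feature negotiation completes later
 * via virtio_user_set_features()/virtio_user_start_device().
 */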
void
virtio_user_dev_uninit(struct virtio_user_dev *dev)
{
uint32_t i;
virtio_user_stop_device(dev);
for (i = 0; i < dev->max_queues; ++i) {
close(dev->callfds[i]);
close(dev->kickfds[i]);
}
close(dev->vhostfd);
if (dev->vhostfds) {
for (i = 0; i < dev->max_queues; ++i)
close(dev->vhostfds[i]);
free(dev->vhostfds);
free(dev->tapfds);
}
}


@ -0,0 +1,76 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _VIRTIO_USER_DEV_H
#define _VIRTIO_USER_DEV_H
#include <limits.h>
#include "../virtio_pci.h"
#include "../virtio_ring.h"
#include "vhost.h"
#define VIRTIO_MAX_VIRTQUEUES 0x100
struct virtio_user_dev {
/* for vhost_user backend */
int vhostfd;
/* for vhost_kernel backend */
char *ifname;
int *vhostfds;
int *tapfds;
/* for both vhost_user and vhost_kernel */
int callfds[VIRTIO_MAX_VIRTQUEUES];
int kickfds[VIRTIO_MAX_VIRTQUEUES];
uint32_t max_queues;
uint32_t num_queues;
uint32_t queue_size;
uint64_t features; /* the features negotiated with the driver,
* to be synced with the device
*/
uint64_t device_features; /* supported features by device */
uint8_t status;
uint8_t port_id;
char path[PATH_MAX];
struct vring vrings[VIRTIO_MAX_VIRTQUEUES];
struct virtio_user_backend_ops *ops;
};
int is_vhost_user_by_type(const char *path);
int virtio_user_start_device(struct virtio_user_dev *dev);
int virtio_user_stop_device(struct virtio_user_dev *dev);
struct virtio_hw *virtio_user_dev_init(char *path, int queues, int queue_size);
void virtio_user_dev_uninit(struct virtio_user_dev *dev);
void virtio_user_handle_cq(struct virtio_user_dev *dev, uint16_t queue_idx);
#endif


@ -0,0 +1,223 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/virtio_scsi.h>
#include <rte_malloc.h>
#include <rte_vdev.h>
#include <rte_alarm.h>
#include "virtio_ethdev.h"
#include "virtio_logs.h"
#include "virtio_pci.h"
#include "virtqueue.h"
#include "virtio_rxtx.h"
#include "virtio_user/virtio_user_dev.h"
#define virtio_user_get_dev(hw) \
((struct virtio_user_dev *)(hw)->virtio_user_dev)
static void
virtio_user_read_dev_config(struct virtio_hw *hw, size_t offset,
void *dst, int length)
{
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
if (offset == offsetof(struct virtio_scsi_config, num_queues))
*(uint16_t *)dst = dev->max_queues;
}
static void
virtio_user_write_dev_config(struct virtio_hw *hw, size_t offset,
const void *src, int length)
{
PMD_DRV_LOG(ERR, "not supported offset=%zu, len=%d", offset, length);
}
static void
virtio_user_reset(struct virtio_hw *hw)
{
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
if (dev->status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
virtio_user_stop_device(dev);
}
static void
virtio_user_set_status(struct virtio_hw *hw, uint8_t status)
{
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
if (status & VIRTIO_CONFIG_STATUS_DRIVER_OK)
virtio_user_start_device(dev);
else if (status == VIRTIO_CONFIG_STATUS_RESET)
virtio_user_reset(hw);
dev->status = status;
}
static uint8_t
virtio_user_get_status(struct virtio_hw *hw)
{
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
return dev->status;
}
static uint64_t
virtio_user_get_features(struct virtio_hw *hw)
{
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
/* unmask feature bits defined in vhost user protocol */
return dev->device_features & VIRTIO_PMD_SUPPORTED_GUEST_FEATURES;
}
static void
virtio_user_set_features(struct virtio_hw *hw, uint64_t features)
{
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
dev->features = features & dev->device_features;
}
static uint8_t
virtio_user_get_isr(struct virtio_hw *hw __rte_unused)
{
/* rxq interrupts and config interrupt are separated in virtio-user,
* here we only report config change.
*/
return VIRTIO_PCI_ISR_CONFIG;
}
static uint16_t
virtio_user_set_config_irq(struct virtio_hw *hw __rte_unused,
uint16_t vec __rte_unused)
{
return 0;
}
static uint16_t
virtio_user_set_queue_irq(struct virtio_hw *hw __rte_unused,
struct virtqueue *vq __rte_unused,
uint16_t vec)
{
/* pretend we have done that */
return vec;
}
/* This function returns the queue size, i.e. the number of descriptors, of a
 * specified queue. Different from VHOST_USER_GET_QUEUE_NUM, which returns the
 * maximum number of supported queues.
 */
static uint16_t
virtio_user_get_queue_num(struct virtio_hw *hw, uint16_t queue_id __rte_unused)
{
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
/* Currently, every queue has the same queue size */
return dev->queue_size;
}
static int
virtio_user_setup_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
uint16_t queue_idx = vq->vq_queue_index;
uint64_t desc_addr, avail_addr, used_addr;
desc_addr = (uintptr_t)vq->vq_ring_virt_mem;
avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
ring[vq->vq_nentries]),
VIRTIO_PCI_VRING_ALIGN);
dev->vrings[queue_idx].num = vq->vq_nentries;
dev->vrings[queue_idx].desc = (void *)(uintptr_t)desc_addr;
dev->vrings[queue_idx].avail = (void *)(uintptr_t)avail_addr;
dev->vrings[queue_idx].used = (void *)(uintptr_t)used_addr;
return 0;
}
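/* The desc/avail/used layout computed above mirrors vring_init() in
 * virtio_ring.h: descriptors first, the avail ring immediately after,
 * and the used ring aligned up to VIRTIO_PCI_VRING_ALIGN.
 */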
static void
virtio_user_del_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
/* For legacy devices, writing 0 to the VIRTIO_PCI_QUEUE_PFN port makes
 * QEMU stop the corresponding ioeventfds and reset the device status.
 * For modern devices, setting the queue desc/avail/used addresses in the
 * PCI bar to 0 triggers no further behavior in QEMU.
 *
 * Here we only care about what information to deliver to vhost-user or
 * vhost-kernel, so we just close the ioeventfds for now.
 */
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
close(dev->callfds[vq->vq_queue_index]);
close(dev->kickfds[vq->vq_queue_index]);
}
static void
virtio_user_notify_queue(struct virtio_hw *hw, struct virtqueue *vq)
{
uint64_t buf = 1;
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
if (write(dev->kickfds[vq->vq_queue_index], &buf, sizeof(buf)) < 0)
PMD_DRV_LOG(ERR, "failed to kick backend: %s",
strerror(errno));
}
const struct virtio_pci_ops virtio_user_ops = {
.read_dev_cfg = virtio_user_read_dev_config,
.write_dev_cfg = virtio_user_write_dev_config,
.reset = virtio_user_reset,
.get_status = virtio_user_get_status,
.set_status = virtio_user_set_status,
.get_features = virtio_user_get_features,
.set_features = virtio_user_set_features,
.get_isr = virtio_user_get_isr,
.set_config_irq = virtio_user_set_config_irq,
.set_queue_irq = virtio_user_set_queue_irq,
.get_queue_num = virtio_user_get_queue_num,
.setup_queue = virtio_user_setup_queue,
.del_queue = virtio_user_del_queue,
.notify_queue = virtio_user_notify_queue,
};


@ -0,0 +1,200 @@
/*-
* BSD LICENSE
*
* Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* * Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _VIRTQUEUE_H_
#define _VIRTQUEUE_H_
#include <stdint.h>
#include <rte_atomic.h>
#include <rte_memory.h>
#include <rte_memzone.h>
#include <rte_mempool.h>
#include "virtio_pci.h"
#include "virtio_ring.h"
#include "virtio_logs.h"
#include "virtio_rxtx.h"
/*
 * Per virtio_config.h in Linux.
 * For virtio_pci on SMP, we don't need to order with respect to MMIO
 * accesses through relaxed memory I/O windows, so smp_mb() et al are
 * sufficient.
 */
#define virtio_mb() rte_smp_mb()
#define virtio_rmb() rte_smp_rmb()
#define virtio_wmb() rte_smp_wmb()
#ifdef RTE_PMD_PACKET_PREFETCH
#define rte_packet_prefetch(p) rte_prefetch1(p)
#else
#define rte_packet_prefetch(p) do {} while(0)
#endif
#define VIRTQUEUE_MAX_NAME_SZ 32
/**
* The maximum virtqueue size is 2^15. Use that value as the end of
* descriptor chain terminator since it will never be a valid index
* in the descriptor table. This is used to verify we are correctly
* handling vq_free_cnt.
*/
#define VQ_RING_DESC_CHAIN_END 32768
struct vq_desc_extra {
void *cookie;
uint16_t ndescs;
};
struct virtqueue {
struct virtio_hw *hw; /**< virtio_hw structure pointer. */
struct vring vq_ring; /**< vring keeping desc, used and avail */
/**
* Last consumed descriptor in the used table,
* trails vq_ring.used->idx.
*/
uint16_t vq_used_cons_idx;
uint16_t vq_nentries; /**< vring desc numbers */
uint16_t vq_free_cnt; /**< num of desc available */
uint16_t vq_avail_idx; /**< sync until needed */
uint16_t vq_free_thresh; /**< free threshold */
void *vq_ring_virt_mem; /**< linear address of vring*/
unsigned int vq_ring_size;
union {
struct virtnet_tx txq;
};
phys_addr_t vq_ring_mem; /**< physical address of vring,
* or virtual address for virtio_user. */
/**
* Head of the free chain in the descriptor table. If
* there are no free descriptors, this will be set to
* VQ_RING_DESC_CHAIN_END.
*/
uint16_t vq_desc_head_idx;
uint16_t vq_desc_tail_idx;
uint16_t vq_queue_index; /**< PCI queue index */
uint16_t offset; /**< relative offset to obtain addr in mbuf */
uint16_t *notify_addr;
struct vq_desc_extra vq_descx[0];
};
/* Chain all the descriptors in the ring with an END */
static inline void
vring_desc_init(struct vring_desc *dp, uint16_t n)
{
uint16_t i;
for (i = 0; i < n - 1; i++)
dp[i].next = (uint16_t)(i + 1);
dp[i].next = VQ_RING_DESC_CHAIN_END;
}
/**
* Tell the backend not to interrupt us.
*/
static inline void
virtqueue_disable_intr(struct virtqueue *vq)
{
vq->vq_ring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
}
/**
* Tell the backend to interrupt us.
*/
static inline void
virtqueue_enable_intr(struct virtqueue *vq)
{
vq->vq_ring.avail->flags &= (~VRING_AVAIL_F_NO_INTERRUPT);
}
/**
* Dump virtqueue internal structures, for debug purpose only.
*/
void virtqueue_dump(struct virtqueue *vq);
static inline int
virtqueue_full(const struct virtqueue *vq)
{
return vq->vq_free_cnt == 0;
}
#define VIRTQUEUE_NUSED(vq) ((uint16_t)((vq)->vq_ring.used->idx - (vq)->vq_used_cons_idx))
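/* The used index is free-running, so the uint16_t arithmetic handles
 * wraparound: e.g. used->idx = 2 with vq_used_cons_idx = 65535 yields
 * (uint16_t)(2 - 65535) = 3 entries pending.
 */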
static inline void
vq_update_avail_idx(struct virtqueue *vq)
{
virtio_wmb();
vq->vq_ring.avail->idx = vq->vq_avail_idx;
}
static inline void
vq_update_avail_ring(struct virtqueue *vq, uint16_t desc_idx)
{
uint16_t avail_idx;
/*
* Place the head of the descriptor chain into the next slot and make
* it usable to the host. The chain is made available now rather than
* deferring to virtqueue_notify() in the hopes that if the host is
* currently running on another CPU, we can keep it processing the new
* descriptor.
*/
avail_idx = (uint16_t)(vq->vq_avail_idx & (vq->vq_nentries - 1));
if (unlikely(vq->vq_ring.avail->ring[avail_idx] != desc_idx))
vq->vq_ring.avail->ring[avail_idx] = desc_idx;
vq->vq_avail_idx++;
}
static inline int
virtqueue_kick_prepare(struct virtqueue *vq)
{
return !(vq->vq_ring.used->flags & VRING_USED_F_NO_NOTIFY);
}
static inline void
virtqueue_notify(struct virtqueue *vq)
{
/*
 * Ensure the updated avail->idx is visible to the host.
 * For virtio on IA, the notification is through an I/O port operation,
 * which is itself a serializing instruction.
 */
VTPCI_OPS(vq->hw)->notify_queue(vq->hw, vq);
}
#endif /* _VIRTQUEUE_H_ */


@ -38,7 +38,7 @@ BLOCKDEV_MODULES_DEPS += -libverbs -lrdmacm
endif
ifeq ($(OS),Linux)
BLOCKDEV_MODULES_LIST += bdev_aio
BLOCKDEV_MODULES_LIST += bdev_aio bdev_virtio
BLOCKDEV_MODULES_DEPS += -laio
endif


@ -16,7 +16,8 @@ if hash astyle; then
# as-is to enable ongoing work to synch with a generic upstream DPDK vhost library,
# rather than making diffs more complicated by a lot of changes to follow SPDK
# coding standards.
git ls-files '*.[ch]' '*.cpp' '*.cc' '*.cxx' '*.hh' '*.hpp' | grep -v rte_vhost | grep -v cpp_headers | \
git ls-files '*.[ch]' '*.cpp' '*.cc' '*.cxx' '*.hh' '*.hpp' | \
grep -v rte_vhost | grep -v rte_virtio | grep -v cpp_headers | \
xargs astyle --options=.astylerc >> astyle.log
if grep -q "^Formatted" astyle.log; then
echo " errors detected"
@ -38,7 +39,7 @@ fi
echo -n "Checking comment style..."
git grep --line-number -e '/[*][^ *-]' -- '*.[ch]' > comment.log || true
git grep --line-number -e '[^ ][*]/' -- '*.[ch]' ':!lib/vhost/rte_vhost*/*' >> comment.log || true
git grep --line-number -e '[^ ][*]/' -- '*.[ch]' ':!lib/vhost/rte_vhost*/*' ':!lib/bdev/virtio/rte_virtio*/*' >> comment.log || true
git grep --line-number -e '^[*]' -- '*.[ch]' >> comment.log || true
if [ -s comment.log ]; then
@ -63,7 +64,7 @@ fi
rm -f eofnl.log
echo -n "Checking for POSIX includes..."
git grep -I -i -f scripts/posix.txt -- './*' ':!include/spdk/stdinc.h' ':!lib/vhost/rte_vhost*/**' ':!scripts/posix.txt' > scripts/posix.log || true
git grep -I -i -f scripts/posix.txt -- './*' ':!include/spdk/stdinc.h' ':!lib/vhost/rte_vhost*/**' ':!lib/bdev/virtio/rte_virtio*/**' ':!scripts/posix.txt' > scripts/posix.log || true
if [ -s scripts/posix.log ]; then
echo "POSIX includes detected. Please include spdk/stdinc.h instead."
cat scripts/posix.log


@ -76,6 +76,19 @@ function configure_linux {
done
rm $TMP
# virtio-scsi
TMP=`mktemp`
# Collect all the device_id info of virtio-scsi devices.
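# For illustration (hypothetical value): a header line such as
#   #define VIRTIO_PCI_DEVICEID_SCSI 0x1004
# yields "1004", which is then matched against vendor 1af4 below.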
grep "VIRTIO_PCI_DEVICEID_SCSI" $rootdir/lib/bdev/virtio/rte_virtio/virtio_pci.h \
| awk -F"x" '{print $2}' > $TMP
for dev_id in `cat $TMP`; do
for bdf in $(linux_iter_pci_dev_id 1af4 $dev_id); do
linux_bind_driver "$bdf" "$driver_name"
done
done
rm $TMP
echo "1" > "/sys/bus/pci/rescan"
hugetlbfs_mount=$(linux_hugetlbfs_mount)
@ -137,6 +150,20 @@ function reset_linux {
done
rm $TMP
# virtio-scsi
TMP=`mktemp`
# Collect all the device_id info of virtio-scsi devices.
grep "VIRTIO_PCI_DEVICEID_SCSI" $rootdir/lib/bdev/virtio/rte_virtio/virtio_pci.h \
| awk -F"x" '{print $2}' > $TMP
modprobe virtio-pci || true
for dev_id in `cat $TMP`; do
for bdf in $(linux_iter_pci_dev_id 1af4 $dev_id); do
linux_bind_driver "$bdf" virtio-pci
done
done
rm $TMP
echo "1" > "/sys/bus/pci/rescan"
hugetlbfs_mount=$(linux_hugetlbfs_mount)
@ -171,6 +198,20 @@ function status_linux {
echo -e "$bdf\t$node\t\t$driver"
done
done
echo "virtio"
# Collect all the device_id info of virtio-scsi devices.
TMP=`grep "VIRTIO_PCI_DEVICEID_SCSI" $rootdir/lib/bdev/virtio/rte_virtio/virtio_pci.h \
| awk -F"x" '{print $2}'`
echo -e "BDF\t\tNuma Node\tDriver Name"
for dev_id in $TMP; do
for bdf in $(linux_iter_pci_dev_id 1af4 $dev_id); do
driver=`grep DRIVER /sys/bus/pci/devices/$bdf/uevent |awk -F"=" '{print $2}'`
node=`cat /sys/bus/pci/devices/$bdf/numa_node`;
echo -e "$bdf\t$node\t\t$driver"
done
done
}
function configure_freebsd {