320 Commits

Author SHA1 Message Date
Darek Stojaczyk
3759b87082 env_dpdk/pci: remove driver->is_registered
Now that we support only DPDK 18.11+ and always have
to register pci drivers to DPDK on initialization we
don't need that flag - it's always true.

Change-Id: Ibf1d79155595609fe9093f58e056bea25db6fdb2
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3446
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-07-23 20:48:47 +00:00
Darek Stojaczyk
45528bfef6 env: add spdk_pci_id->class_id
This follows struct rte_pci_id which had class_id as well.
We'll need it to make some additional DPDK APIs public through
the env abstraction.

Change-Id: I794a6cd6b17e48daf53b48fa5abe3d3dcfeaa403
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3182
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-07-23 20:48:47 +00:00
Darek Stojaczyk
e8e46cb615 env_dpdk/pci: remove device detach callback
You don't get notified when someone starts using your hooked
device, so there's not much gain from knowing when someone
stops.

Remove that callback and also move DPDK device detach under
the same lock which sets the pending_removal flag. This eliminates
a data race window when hotremove notification could arrive
after device was detached, but before it was scheduled to be
removed.

vmd and ioat nest the spdk_pci_device struct and abigail complains
even though the parent structs only have forward declarations in
public headers. Adding those two structs to the suppression list
doesn't help though. Abidiff still complains about the pci device
struct being changed, probably because ioat.h and vmd.h both include
env.h. Abidiff suppression list should eventually be split per-lib,
but for now ignore struct spdk_pci_device changes globally.

$ abidiff [...]/libspdk_ioat.so [...]

'struct spdk_pci_device at env.h:652:1' changed:
  type size changed from 1024 to 960 (in bits)
  1 data member deletion:
    <SNIP>

Change-Id: I9b113572c661f0e0786b6d625e16dc07fe77e778
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2939
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-07-23 20:48:47 +00:00
Darek Stojaczyk
814072fa4e env_dpdk/pci: delay device initialization on hotplug
A workaround for kernel deadlocks surfaced in #1275.

DPDK basically offers two APIs for hotplugging all PCI devices:
rte_bus_scan() and rte_bus_probe(). Scan iterates through
/sys/bus/pci/devices/* and creates corresponding rte_pci_device-s,
then rte_bus_probe() tries to initialize each device with the
supporting driver.

Previously we did scan and probe together, one after another, now
we'll have an intermediate step. After scanning the bus, we'll
iterate through all rte_pci_device-s and temporarily blacklist any
newly detected devices. We'll use devargs->data field to store
a timeout value (integer) after which the device can be un-blacklisted
and initialized. devargs->data is documented in DPDK as "Device
string storage" and it's a char*, but it's not referenced anywhere
in DPDK. rte_bus_probe() respects the blacklist and does nothing
at all with blacklisted devices.

The timeout value is 2 seconds, which should be plenty of time
for an NVMe device to reset, leave the critical lock sections in
kernel, and let us initialize it safely.

Note that direct attach by BDF doesn't respect the blacklist,
so an NVMe attach RPC won't be delayed in any way, it will continue
to work as it always did. Only the automatic discovery & enumeration
is deferred.
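
A minimal sketch of the scan / block / probe sequence described above. It
assumes a hypothetical struct hotplug_dev in place of rte_pci_device and its
devargs, and a caller-supplied millisecond clock. None of the names below are
DPDK or SPDK APIs; only the 2 second delay comes from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HOTPLUG_DELAY_MS 2000 /* the 2 second timeout from the commit message */

/* Hypothetical stand-in for an rte_pci_device plus its devargs. */
struct hotplug_dev {
	bool     initialized;   /* a driver has already probed this device */
	bool     blocked;       /* temporarily blacklisted */
	uint64_t allowed_at_ms; /* 0 = never seen; otherwise when the block is lifted */
};

/* Step 1, run right after the bus scan: block every newly detected device. */
static void
block_new_devices(struct hotplug_dev *devs, size_t count, uint64_t now_ms)
{
	assert(devs != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (!devs[i].initialized && devs[i].allowed_at_ms == 0) {
			devs[i].blocked = true;
			devs[i].allowed_at_ms = now_ms + HOTPLUG_DELAY_MS;
		}
	}
}

/* Step 2, the probe: lift expired blocks, then initialize unblocked devices.
 * Returns the number of devices initialized by this call. */
static size_t
probe_devices(struct hotplug_dev *devs, size_t count, uint64_t now_ms)
{
	size_t probed = 0;

	assert(devs != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (devs[i].blocked && now_ms >= devs[i].allowed_at_ms) {
			devs[i].blocked = false;
		}
		if (!devs[i].blocked && !devs[i].initialized) {
			devs[i].initialized = true;
			probed++;
		}
	}
	return probed;
}
```

Calling block_new_devices() and then probe_devices() on every enumeration
pass means a freshly plugged device is skipped until a pass that runs at
least HOTPLUG_DELAY_MS after it was first seen.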

Change-Id: I62b719271bd0755bc2882331ea33f69897b1e5e5
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1733
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-07-23 20:48:47 +00:00
Darek Stojaczyk
701d17f6d6 env_dpdk/pci: ignore rte_bus_scan() errors
Extensive testing showed it can fail:
> EAL: eal_parse_sysfs_value(): cannot open sysfs value
> /sys/bus/pci/devices/0000:02:00.0/vendor
> EAL: Scan for (pci) bus failed.

spdk_pci_enumerate() would previously return with error because
of this and e.g. the test nvme hotplug app could immediately exit
with failure. A mis-timed scan shouldn't cause this kind of failure,
so ignore its return code. This shouldn't cause any issues.

Change-Id: I9253219c218981a747774a8632335963cfb0db53
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2941
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
2020-07-23 20:48:47 +00:00
Darek Stojaczyk
3554970375 env_dpdk: drop DPDK 18.08 support
DPDK versions 17.11 to 18.08 reached EOL.

Change-Id: Icfec27b0099f53d6ab00ec3aed63e5d30d94ee4d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2940
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
2020-07-23 20:48:47 +00:00
Darek Stojaczyk
7c6f0ef001 env_dpdk/pci: fix segfault on simultaneous VFIO hotremove and user detach
There was a chance we scheduled a device removal to the DPDK thread
while that thread was already removing the device from a VFIO hotremove
notification (on the DPDK interrupt thread). The second hotremove
attempt touches some freed memory and segfaults.

The VFIO hotremove notification already checks pending_removal flag
under a mutex and sets it to true, so do the same in spdk_detach_rte()
(called from the SPDK init thread).
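
The check-and-set pattern described above could look roughly like this. It
is only a sketch: struct removable_dev and claim_removal() are hypothetical
names, and only the pending_removal flag is taken from this message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device; the real spdk_pci_device has many more fields. */
struct removable_dev {
	pthread_mutex_t lock;
	bool            pending_removal;
};

static void
removable_dev_init(struct removable_dev *dev)
{
	assert(dev != NULL);
	pthread_mutex_init(&dev->lock, NULL);
	dev->pending_removal = false;
}

/* Test and set the flag in one critical section. Returns true only for the
 * first caller, which then owns the removal; later callers must back off. */
static bool
claim_removal(struct removable_dev *dev)
{
	bool claimed;

	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	claimed = !dev->pending_removal;
	dev->pending_removal = true;
	pthread_mutex_unlock(&dev->lock);
	return claimed;
}
```

If both the hotremove notification and the user detach path call
claim_removal() before touching the device, only one of them proceeds, so
the device cannot be removed twice.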

Change-Id: Ib3f0eb7c0c5c6e1ab8cf253b7711fd149925a143
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1730
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
2020-07-23 20:48:47 +00:00
Darek Stojaczyk
d3bcd1ca5b env_dpdk/pci: split dpdk device detach and removal
Simplify the code path a bit. VFIO notification is the only
place where detach callback is called from the dpdk intr thread.
Detach checks the current thread and behaves differently in this
case, but it could be the VFIO notification that simply calls
a different function.

So instead of carrying the VFIO notification through the generic
detach routine, carry it just through the DPDK-thread specific
subset. This lets us remove some ifs in the generic routine.

Change-Id: I5e8866e4643ef08fb3cd12621e2d262b5e827c74
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1731
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-07-23 20:48:47 +00:00
Darek Stojaczyk
b71ee92e3b Revert "pci: fix the hotplug issue"
This reverts commit 301c5aeec9.

The patch doesn't fix anything as the hotremoval could still be
called twice and the second call would cause a use-after-free.

Change-Id: I78a1120707dbdf36c871ec378a312c4a058fc76b
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1729
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-07-23 20:48:47 +00:00
Jin Yu
eb76afe78b event: add iova-mode parameter
Export the iova-mode parameter in SPDK, which is useful in
VM environments.

Change-Id: I3f4756b2c3b6cf5d1964a50bbf63f9c596997696
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2910
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2020-06-24 08:22:24 +00:00
Jacek Kalwas
b3767a239d env_dpdk: expose base virtaddr as an option
This might be helpful if secondary processes cannot start due to
conflicts in address map.

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I180dc09b4cad3b0064f009b0f553f5929de6566c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2776
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2020-06-16 07:45:07 +00:00
Tomasz Zawadzki
7b8964c5c8 lib/env_dpdk: rename pci_init/fini() to pci_env_init/fini()
Patch below removed spdk_* prefix from functions in env_dpdk:
(15d0ae62) lib/env_dpdk: remove spdk prefix from internal functions.

This resulted in name conflict with libpci PCI Utilities library
pci_init() function.

Fixes #1407

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie6d6eea3a7b8a0f0223bd14bbe258061460a81dd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2611
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-05-28 07:12:42 +00:00
Seth Howell
3456377b45 lib: accel, bdev, blob, env_dpdk remove spdk_ prefix.
Hitting only the static functions from the above libraries
with the spdk_ prefix.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic6df38dfbeb53f0b1c30d350921f7216acba3170
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2362
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-05-21 09:19:00 +00:00
Seth Howell
d18e63206a mk/lib: add a check that major and minor version is set for libs.
Also, while we are here, consolidate setting SO_SUFFIX to one spot.

Previously, it was possible for a library to slip through
without an SO version.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I4db5fa5839502d266c6259892e5719b05134518c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2361
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2020-05-21 09:19:00 +00:00
Ben Walker
76aed8e4ff Revert "env: Use rte_malloc in spdk_mem_register code path when possible"
This reverts commit 6d6052ac96.

This approach is no longer necessary given the patch immediately
preceding this one.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I5aab14346fa5a14dbf33c94ffcf88b045cdb4999
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2512
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-05-20 14:14:21 +00:00
Ben Walker
cf450c0d7c env: Add spdk_mem_reserve
The spdk_mem_reserve() function reserves a memory region in SPDK's
memory maps. This pre-allocates all of the required data structures
to hold memory address translations for that region without actually
populating the region.

After a region is reserved, calls to spdk_mem_register() for
addresses in that range will not require any internal memory
allocations. This is useful when overlaying a custom memory allocator
on top of SPDK's hugepage memory, such as tcmalloc.
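
To illustrate the idea, here is a toy two-level table in which reserving a
range allocates the second-level tables up front, so that a later register
call in that range allocates nothing. The struct names, sizes and the
allocations counter are all assumptions made for this sketch; they are not
SPDK's actual data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SHIFT   21u  /* illustrative 2 MiB granularity */
#define CHUNKS_PER_L2 512u
#define L1_ENTRIES    64u  /* toy address space only */

struct l2_table { uint64_t translation[CHUNKS_PER_L2]; };

struct toy_map {
	struct l2_table *l1[L1_ENTRIES];
	unsigned         allocations; /* counts calloc() calls, for demonstration */
};

/* Return the second-level table covering vaddr, allocating it if missing. */
static struct l2_table *
get_l2(struct toy_map *map, uint64_t vaddr)
{
	uint64_t idx = (vaddr >> CHUNK_SHIFT) / CHUNKS_PER_L2;

	assert(map != NULL && idx < L1_ENTRIES);
	if (map->l1[idx] == NULL) {
		map->l1[idx] = calloc(1, sizeof(struct l2_table));
		if (map->l1[idx] != NULL) {
			map->allocations++;
		}
	}
	return map->l1[idx];
}

/* Reserve: create the tables for [vaddr, vaddr + len) without filling them. */
static int
toy_reserve(struct toy_map *map, uint64_t vaddr, uint64_t len)
{
	for (uint64_t a = vaddr; a < vaddr + len; a += UINT64_C(1) << CHUNK_SHIFT) {
		if (get_l2(map, a) == NULL) {
			return -1;
		}
	}
	return 0;
}

/* Register: store a translation for one chunk. Inside a reserved range the
 * table already exists, so this path performs no allocation. */
static int
toy_register(struct toy_map *map, uint64_t vaddr, uint64_t translation)
{
	struct l2_table *l2 = get_l2(map, vaddr);

	if (l2 == NULL) {
		return -1;
	}
	l2->translation[(vaddr >> CHUNK_SHIFT) % CHUNKS_PER_L2] = translation;
	return 0;
}
```

The allocation-free register path is what matters when the caller is itself
running inside an allocator hook.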

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ia4e8a770e8b5c956814aa90e9119013356dfab46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2511
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-05-20 14:14:21 +00:00
Jin Yu
132fffd4fd env: add the device ID of virtio device
From the QEMU doc:
1af4:1000  network device (legacy)
1af4:1001  block device (legacy)
1af4:1002  balloon device (legacy)
1af4:1003  console device (legacy)
1af4:1004  SCSI host bus adapter device (legacy)
1af4:1005  entropy generator device (legacy)
1af4:1009  9p filesystem device (legacy)

1af4:1041  network device (modern)
1af4:1042  block device (modern)
1af4:1043  console device (modern)
1af4:1044  entropy generator device (modern)
1af4:1045  balloon device (modern)
1af4:1048  SCSI host bus adapter device (modern)
1af4:1049  9p filesystem device (modern)
1af4:1050  virtio gpu device (modern)
1af4:1052  virtio input device (modern)
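
For illustration, the IDs listed above can be expressed as a lookup table.
The table contents come from the list; the struct and function names are
hypothetical and are not the identifiers this patch adds to SPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_VENDOR_ID 0x1af4

struct virtio_id_entry {
	uint16_t    device_id;
	const char *description;
};

static const struct virtio_id_entry g_virtio_ids[] = {
	{ 0x1000, "network device (legacy)" },
	{ 0x1001, "block device (legacy)" },
	{ 0x1002, "balloon device (legacy)" },
	{ 0x1003, "console device (legacy)" },
	{ 0x1004, "SCSI host bus adapter device (legacy)" },
	{ 0x1005, "entropy generator device (legacy)" },
	{ 0x1009, "9p filesystem device (legacy)" },
	{ 0x1041, "network device (modern)" },
	{ 0x1042, "block device (modern)" },
	{ 0x1043, "console device (modern)" },
	{ 0x1044, "entropy generator device (modern)" },
	{ 0x1045, "balloon device (modern)" },
	{ 0x1048, "SCSI host bus adapter device (modern)" },
	{ 0x1049, "9p filesystem device (modern)" },
	{ 0x1050, "virtio gpu device (modern)" },
	{ 0x1052, "virtio input device (modern)" },
};

/* Return a description for a vendor:device pair, or NULL if it is not a
 * virtio device from the list above. */
static const char *
virtio_describe(uint16_t vendor_id, uint16_t device_id)
{
	if (vendor_id != VIRTIO_VENDOR_ID) {
		return NULL;
	}
	for (size_t i = 0; i < sizeof(g_virtio_ids) / sizeof(g_virtio_ids[0]); i++) {
		assert(g_virtio_ids[i].description != NULL); /* table sanity check */
		if (g_virtio_ids[i].device_id == device_id) {
			return g_virtio_ids[i].description;
		}
	}
	return NULL;
}
```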

Change-Id: I75b04c331aebae9f20fecbca85d64415114ae5ff
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2369
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2020-05-12 08:12:22 +00:00
paul luse
5b03dd938c module/idxd: accel framework plug-in for idxd
Docs, RPC, unit tests, etc., will follow.  Notes:

* The current implementation will only work with VFIO.

* The current implementation supports only the existing accel
framework API. The API will be expanded for DSA exclusive features
in a subsequent patch.

* SW is required to manage flow control, to not over-run the work queues.
This is provided in the accel plug-in module. The upper layers use public
API to manage this.

* As we need to support any number of channels (we can't limit ourselves
to the number of work queues) we need to dynamically size/resize our
per channel descriptor rings based on the number of current channels. This
is done from upper layers via public API into the lib.

* As channels are created, the total number of work queue slots is divided
across the channels evenly.  Same thing when they are destroyed, remaining
channels will see the ring sizes increase. This is done from upper layers
via public API into the lib.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ifaa39935107206a2d990cec992854675e5502057
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1722
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-04-23 15:48:32 +00:00
paul luse
e58e9fbda8 lib/idxd: add low level idxd library
Module, etc., will follow. Notes:

* IDXD is an Intel silicon feature available in future Intel CPUs.
Initial development is being done on a simulator. Once HW is
available and the code fully tested the experimental label will be
lifted. Spec can be found here: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

* The current implementation will only work with VFIO.

* DSA has a number of engines that can be grouped based on application
need such as type of memory being served or QoS. Engines are processing
units and are assigned to groups. Work queues are on device structures
that act as front-end groups for queueing descriptors. Full details on
what is configurable & how will come in later doc patches.

* There is a finite number of work queue slots that are divided amongst
the number of desired work queues in some fashion (e.g. evenly).

* SW (outside of the idxd lib) is required to manage flow control, to not
over-run the work queues. This is provided in the accel plug-in module.
The upper layers use public API to manage this.

* Work queue submissions are done with a 64 byte atomic instruction

* The design here creates a set of descriptor rings per channel that match
the size of the work queues. Then, an spdk_bit_array is used to make sure
we don't overrun a queue.  If there are no slots available, the operation
is put on a linked list to be retried later from the poller.

* As we need to support any number of channels (we can't limit ourselves
to the number of work queues) we need to dynamically size/resize our
per channel descriptor rings based on the number of current channels. This
is done from upper layers via public API into the lib.

* As channels are created, the total number of work queue slots is divided
across the channels evenly. Same thing when they are destroyed, remaining
channels will see the ring sizes increase. This is done from upper layers
via public API into the lib.

* The sim has 64 total work queue entries (WQE) that get doled out to the
work queues (WQ) evenly.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I899bbeda3cef3db05bea4197b8757e89dddb579d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1809
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-04-23 15:48:32 +00:00
Seth Howell
15d0ae628d lib/env_dpdk: remove spdk prefix from internal functions.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I20fddc974cdbd7763e7f148f060ddb76d59e0923
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1709
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-04-22 09:21:55 +00:00
Seth Howell
229ef16bb9 lib/env_dpdk: add map file and rev so major version.
There were 9 function symbols removed from the global list
of the library. They were all symbols declared in env_internal.h

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I23210f27dc2bf23ae9e9cf76babb54e623fbc917
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1708
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-04-22 09:21:55 +00:00
Darek Stojaczyk
23a7e4b94f env/pci: cleanup spdk_detach_rte by moving code
Change-Id: I1dee015e44ecfbd7fcb1680cce0e6d527d083c99
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1728
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-04-10 07:07:05 +00:00
Michael Haeuptle
55df83ceb6 ENV_DPDK/VFIO: Increase PCI tear down timeout
When removing a large number of devices (>8) in parallel,
the 20ms timeout is not long enough.

As part of spdk_detach_cb, DPDK calls into the VFIO driver
which may get delayed due to multiple hot removes being
processed by pciehp driver (pciehp IRQ thread function
is handling the actual removal of a device in parallel but
all of the IRQ thread functions compete for a global mutex
increasing processing time and race conditions).

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Change-Id: I470fbbee92dac9677082c873781efe41e2941cd5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1588
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-04-03 06:31:40 +00:00
Darek Stojaczyk
e03861f138 memory.h: move to public headers
There's no reason not to publish those. Especially if
they're needed in other public headers.

Change-Id: I7dfc6922fcc0dfc46822ad8a16a375f997b98e84
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1041
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-03-19 08:50:45 +00:00
Darek Stojaczyk
046bdc4abb dpdkbuild: build and link with rte_hash if RAID5 is built
Compiling raid5 has a direct dependency on rte_hash,
which was only built if vhost was built.

The following didn't work:
./configure --with-raid5 --without-vhost

Change-Id: Id36a7d4a21c2e0db00b0641581542e244c4cbbb4
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1013
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-03-18 08:02:43 +00:00
Darek Stojaczyk
87e7eed129 env_dpdk: link rte_vhost always when it's requested
We used to link with rte_vhost only if it was present as a file,
which doesn't make much sense - we should try to link with it
when we think it's needed.

Change-Id: I9609972d419fdf6e8d3b4644eff3f5dba83abe42
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-03-18 08:02:43 +00:00
Seth Howell
193927830d make: rev SO versions individually for libraries.
This will allow us to keep track of compatibility issues on a
per-library basis.

Change-Id: Ib0c796adb1efe1570212a503ed660bef6f142b6e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1067
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-18 08:02:30 +00:00
Richael Zhuang
a9c79c3337 env_dpdk: Fix error when using vfio with noiommu
When using vfio with enable_unsafe_noiommu_mode=Y, force iova-mode
to "pa" here because DPDK guesses it's "va", which causes the following
error: "EAL:   Expecting 'PA' IOVA mode but current mode is 'VA',
not initializing".

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Change-Id: I7c343498c5d6976a7c75d75438d6f9c35f1b6160
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1071
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-03-06 10:28:07 +00:00
Vitaliy Mysak
d4653a31e0 env_dpdk: dont treat NULL as error in spdk_map_bar_rte()
We use `spdk_map_bar_rte()` to read mapped addresses
from PCI BARs.
This function is currently checking for NULL in each pair.
But in PCI memory, some registers can be left unused,
in which case they are set to 0.
As a result, we may read some NULL pointers from BARs,
which is OK.
To check if a given address is indeed invalid, we should first
check if it is used.
So it is best to delegate such checks to the
user of this function.
In fact, users already do the NULL check where it is needed
(e.g. virtio_pci.c:390, nvme_pcie.c:589)
so this patch just removes them from `spdk_map_bar_rte()`.

This solves github issue #1206

Change-Id: I88021ceca1b9e9d503b224f790819999cd16da01
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1129
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-03-05 13:31:50 +00:00
Darek Stojaczyk
dcac8e9706 memory: reverse the order of calling mem_map unregister cb
Memory maps might depend on one another, so make sure
a map's dependencies are unregistered only after the maps
that depend on them.
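
A small sketch of the ordering, assuming maps are kept in registration order
and that a later map may depend on an earlier one. The registry and the
logging callback are hypothetical helpers for this example, not SPDK types.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAPS 8

typedef void (*unregister_cb)(void *ctx);

struct map_registry {
	unregister_cb cb[MAX_MAPS];
	void         *ctx[MAX_MAPS];
	size_t        count;
};

/* Append a map; registration order is preserved. */
static int
registry_add(struct map_registry *reg, unregister_cb cb, void *ctx)
{
	assert(reg != NULL && cb != NULL);
	if (reg->count == MAX_MAPS) {
		return -1;
	}
	reg->cb[reg->count] = cb;
	reg->ctx[reg->count] = ctx;
	reg->count++;
	return 0;
}

/* Call the unregister callbacks newest first, so a map is torn down before
 * the earlier maps it may depend on. */
static void
registry_unregister_all(struct map_registry *reg)
{
	assert(reg != NULL);
	for (size_t i = reg->count; i > 0; i--) {
		reg->cb[i - 1](reg->ctx[i - 1]);
	}
	reg->count = 0;
}

/* Example callback: records the map id so the call order can be inspected. */
struct call_log { int ids[MAX_MAPS]; size_t len; };
struct logged_map { int id; struct call_log *log; };

static void
log_unregister(void *ctx)
{
	struct logged_map *m = ctx;

	m->log->ids[m->log->len++] = m->id;
}
```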

Change-Id: I3853dfe51bacc70d0b27976a3df9c0ae9253ebac
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/833
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-02-17 10:05:10 +00:00
Darek Stojaczyk
3ac9ba25f3 env_dpdk: fix potential null dereference
Change-Id: Iff5cfa780506191b3a7fb218c6b5948df2ac16a4
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/523
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2020-02-06 09:45:42 +00:00
Seth Howell
f4a63bb8b3 env_dpdk: keep a memmap refcount of physical addresses
This allows us to avoid trying to map the same physical address to the
IOMMU in physical mode while still making sure that we don't
accidentally unmap that physical address before we are done referencing
it.
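
The refcounting idea can be sketched as follows. The fixed-size table and
the iommu_maps / iommu_unmaps counters are stand-ins invented for this
example; the real code would issue IOMMU map and unmap requests at the
points where the counters are bumped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 16

struct pa_ref { uint64_t paddr; uint32_t refs; };

struct pa_table {
	struct pa_ref entry[MAX_TRACKED];
	unsigned      iommu_maps;   /* how often a real mapping would be created */
	unsigned      iommu_unmaps; /* how often a real mapping would be removed */
};

static struct pa_ref *
pa_find(struct pa_table *t, uint64_t paddr)
{
	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (t->entry[i].refs > 0 && t->entry[i].paddr == paddr) {
			return &t->entry[i];
		}
	}
	return NULL;
}

/* Map paddr: only the first reference reaches the IOMMU. */
static int
pa_map(struct pa_table *t, uint64_t paddr)
{
	struct pa_ref *ref;

	assert(t != NULL);
	ref = pa_find(t, paddr);
	if (ref != NULL) {
		ref->refs++;
		return 0;
	}
	for (size_t i = 0; i < MAX_TRACKED; i++) {
		if (t->entry[i].refs == 0) {
			t->entry[i].paddr = paddr;
			t->entry[i].refs = 1;
			t->iommu_maps++;
			return 0;
		}
	}
	return -1; /* table full */
}

/* Unmap paddr: only dropping the last reference reaches the IOMMU. */
static int
pa_unmap(struct pa_table *t, uint64_t paddr)
{
	struct pa_ref *ref;

	assert(t != NULL);
	ref = pa_find(t, paddr);
	if (ref == NULL) {
		return -1;
	}
	if (--ref->refs == 0) {
		t->iommu_unmaps++;
	}
	return 0;
}
```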

Change-Id: I947408411538b921bdc5a89ce8d5e40fd826e971
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483133
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-29 14:15:21 +00:00
Ben Walker
dd7cd80c6d env/dpdk: Detect DPDK's iova mode
Match DPDK's iova assignment strategy so there are never
any conflicts.

Change-Id: I3863487f9bd247c40edbf0d0d3a8c880bdad1708
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477362
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-01-07 12:14:56 +00:00
Jim Harris
58938d09bf env_dpdk: fix DPDK 18.05 legacy-mem check
In this case, we want to add --legacy-mem if it was
not already specified.  This means we need to check
if strstr() returned NULL.
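
The corrected condition, shown as a standalone sketch. ensure_legacy_mem()
is a hypothetical helper, not the function this patch touches; the point is
only that a NULL return from strstr() means "not found", which is the case
where the flag has to be appended.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the final argument string into out, appending --legacy-mem when
 * env_context does not already contain it.
 * Returns 0 on success, -1 if out is too small. */
static int
ensure_legacy_mem(const char *env_context, char *out, size_t out_size)
{
	int n;

	assert(out != NULL && out_size > 0);
	if (env_context == NULL || env_context[0] == '\0') {
		n = snprintf(out, out_size, "--legacy-mem");
	} else if (strstr(env_context, "--legacy-mem") == NULL) {
		/* NULL means the flag is absent, so this branch must add it. */
		n = snprintf(out, out_size, "%s --legacy-mem", env_context);
	} else {
		n = snprintf(out, out_size, "%s", env_context);
	}
	return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```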

Reported-by: Alok Kataria <alok.kataria@nutanix.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib99dd015ce6e3ee824e4b543a8379d7291e2671e

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478634
Reviewed-by: <alok.kataria@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-30 11:44:54 +00:00
lradomsk
87f0dab8cf pci: map bar fix
spdk_map_bar_rte() did not return an error when the BAR was not mapped successfully.

Signed-off-by: Lukasz Radomski <lukasz.radomski@intel.com>
Change-Id: I662cc189d47c65af8f135a3ab4b27ff1785233d0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-12-18 09:37:37 +00:00
Seth Howell
a85f36b35e env: add a new function for printing memory layout
This is a useful utility function.

The end goal of this patch series is to create a python utility that can
be called upon to dump information about DPDK allocated memory in a
human readable way.

Change-Id: I18978732c9decbb39dce5b5151f5eff6b59f6591
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477510
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-13 11:05:57 +00:00
Jim Harris
b3d982036a env_dpdk: fix --legacy-mem checks
opts->env_context could have more options specified
than just --legacy-mem.  So strcmp() is not a valid
comparison operator - we need to use strstr instead.
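
A short illustration of the difference, using two hypothetical helpers. The
"--foo" option in the usage note is a placeholder, not a real EAL flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True only when env_context is the flag and nothing else. */
static bool
is_exactly(const char *env_context, const char *flag)
{
	assert(flag != NULL);
	return env_context != NULL && strcmp(env_context, flag) == 0;
}

/* True when the flag appears anywhere in env_context. */
static bool
has_flag(const char *env_context, const char *flag)
{
	assert(flag != NULL);
	return env_context != NULL && strstr(env_context, flag) != NULL;
}
```

With env_context set to "--foo --legacy-mem", is_exactly() reports false
while has_flag() reports true. Note that strstr() also matches longer
options that merely start with the same text, so this remains a substring
check rather than a full option parser.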

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie4c8cbcbe7c141693a07a11648d6673ec8c012e5

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477087
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-11 11:05:10 +00:00
alokkataria
6d6052ac96 env: Use rte_malloc in spdk_mem_register code path when possible
For TCMalloc regions which we register with spdk at runtime in the MMapHook, we
need to ensure that SPDK doesn't do any allocations in that path otherwise we
will hit a livelock situation. MmapHook is invoked when TCMalloc is out of free
memory and needs to get more memory from the system; in the hugepage case it
gets that memory via mmap.

In the current code, we could end up calling malloc in the spdk_mem_register
call via the following call path.

spdk_mem_register -> spdk_mem_map_set_translation -> spdk_mem_map_get_map_1gb

To avoid this livelock situation we call rte_malloc instead which shouldn't
invoke the system allocator. Note that in try_expand_heap_primary() which is
invoked in the rte_malloc code path, we can still call malloc, so we need to
only use this when dynamic memory allocation is disabled via --legacy-mem.
It is possible in the future we could work around even this limitation,
but for now this implementation will be much simpler.

Have verified this change fixes the livelock condition which I was hitting in
my setup without this fix.

Change-Id: I69d0813a70da1f26f8c4d9d8895e406c026be18b
Signed-off-by: Alok N Kataria <alok.kataria@nutanix.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475943
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-11 11:05:10 +00:00
Jim Harris
396c445cb1 env_dpdk: tell spdk_mem_map_init whether legacy_mem was specified
We will use this in a future patch to determine whether it's safe
to use DPDK allocated memory when allocating new 1gb page entries.

We could use it in this patch to decide whether or not to register
the memory hotplug handler, but there's really no harm registering
it even when it's not needed.

Ideally DPDK would provide some kind of API to query how DPDK was
configured.  In the normal case we know whether legacy-mem was
specified, but if users initialize DPDK themselves and then call
spdk_env_dpdk_post_init(), we won't know if legacy-mem was specified.
So in that case, we will just assume that it wasn't specified.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ied0e5ff777c8ee651043f46a37ce62e44bfcc5fe

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477086
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-11 11:05:10 +00:00
Jim Harris
d7b5ca749b env: reuse set_translation code for clear_translation
The code is exactly the same - we can just have
spdk_mem_map_clear_translation call spdk_mem_map_set_translation
with translation = map->default_translation.
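
As a rough illustration of the idea (not the actual SPDK code), the sketch
below uses a hypothetical flat array in place of the real multi-level map.
The toy_* names and TOY_MAP_ENTRIES are made up for this example; only the
"clear is set with the default translation" relationship comes from this
commit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_ENTRIES 8 /* hypothetical size; the real map is not a flat array */

struct toy_mem_map {
	uint64_t default_translation;
	uint64_t translation[TOY_MAP_ENTRIES];
};

/* Start with every entry holding the default translation. */
void
toy_map_init(struct toy_mem_map *map, uint64_t default_translation)
{
	assert(map != NULL);
	map->default_translation = default_translation;
	for (size_t i = 0; i < TOY_MAP_ENTRIES; i++) {
		map->translation[i] = default_translation;
	}
}

/* Store `translation` in `count` consecutive entries starting at `first`. */
int
toy_map_set_translation(struct toy_mem_map *map, size_t first, size_t count,
			uint64_t translation)
{
	if (first > TOY_MAP_ENTRIES || count > TOY_MAP_ENTRIES - first) {
		return -EINVAL;
	}
	for (size_t i = first; i < first + count; i++) {
		map->translation[i] = translation;
	}
	return 0;
}

/* Clearing is just "set" with the map's default translation. */
int
toy_map_clear_translation(struct toy_mem_map *map, size_t first, size_t count)
{
	return toy_map_set_translation(map, first, count, map->default_translation);
}
```

With this shape, range validation and the walk over entries live in one
place, so the clear path cannot drift from the set path.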

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6a2ce39b0397be9d29b1a4c1cdfba15025afba7a

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476529
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2019-12-04 15:29:57 +00:00
Karol Latecki
ebe62e1453 Revert "env_dpdk: Detect DPDK's iova mode when programming the IOMMU"
This reverts commit a68effe709.

Reason: introduces assertion:
vhost: rte_vhost_compat.c:103: vhost_session_mem_unregister: Assertion `false' failed.

Reported in GH issue 1085.

Change-Id: I00926844c1e00f19547f03487156f7b4238b446c
Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476133
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-11-29 09:31:39 +00:00
Ben Walker
a68effe709 env_dpdk: Detect DPDK's iova mode when programming the IOMMU
If DPDK is using virtual addresses, we should use virtual addresses.
If DPDK is using physical addresses, we should use physical addresses.
This way there can never be a conflict and everything is consistent.

Change-Id: Ie4b0e885e9a52dd6cbc81000a87908102a9771cb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475928
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-11-28 12:36:20 +00:00
Ben Walker
97b0f7733f env: Check supported iommu address width before using iova-mode=va
DPDK by default guesses that it should be using iova-mode=va
so that it can support running as an unprivileged user. However,
some systems (especially virtual machines) don't have an IOMMU capable
of handling the full virtual address space and DPDK doesn't
currently catch that. Add a check in SPDK and force iova-mode=pa
here.
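
A minimal sketch of what such a check can look like on Intel VT-d, where
the capability register reports the maximum guest address width (MGAW) in
bits 21:16, encoded as width minus one. The function names and the 48-bit
requirement are assumptions made for this example, and how the raw
register value is obtained is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* VT-d capability register: MGAW is bits 21:16 and encodes (width - 1). */
#define VTD_CAP_MGAW_SHIFT 16
#define VTD_CAP_MGAW_MASK  (UINT64_C(0x3F) << VTD_CAP_MGAW_SHIFT)

/* Assumption for this sketch: user virtual addresses need 48 bits. */
#define REQUIRED_VA_WIDTH 48

/* Decode the IOMMU address width, in bits, from a raw capability value. */
int
iommu_width_from_cap(uint64_t cap_reg)
{
	int width = (int)((cap_reg & VTD_CAP_MGAW_MASK) >> VTD_CAP_MGAW_SHIFT) + 1;

	assert(width >= 1 && width <= 64);
	return width;
}

/* True when the IOMMU cannot cover the VA space, so iova-mode=pa is needed. */
bool
must_force_iova_pa(uint64_t cap_reg)
{
	return iommu_width_from_cap(cap_reg) < REQUIRED_VA_WIDTH;
}
```

For example, an MGAW field of 38 decodes to a 39-bit width, which is
narrower than 48 bits and would therefore select iova-mode=pa.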

Change-Id: Ib3a5691a584190feaab4b9064b5a500e361328f2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475149
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-11-27 07:08:32 +00:00
Ben Walker
07ca02210a env: Force iova-mode=pa on ppc
In DPDK, the ppc iommu support does not currently allow for
iova-mode=va, but DPDK doesn't detect ppc and so still attempts
to guess iova-mode=va in some modes. Force iova-mode=pa from
SPDK to fix this.

Change-Id: I6a1ee25ab74873826ac211c3e0dfdf54afc74502
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475148
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: JinYu <jin.yu@intel.com>
2019-11-21 14:34:42 +00:00
Jin Yu
301c5aeec9 pci: fix the hotplug issue
The DPDK interrupt thread is designed so that a source callback cannot be
unregistered from within that same callback handler. So we can't detach
the PCI device in the hotremove callback: detaching needs to unregister
the VFIO notification callback, which will not succeed, yet the device
is still freed. At the next req notification, the handler function then
hits the freed device.

Fix #994

Change-Id: Id4b45a2d0fe6b45b132355d59471bc80240fad70
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473176
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-11-18 12:54:01 +00:00
Jim Harris
570d89a24a env_dpdk: modify error message when DPDK already initialized
Ideally we'd have a way to query if DPDK is already
initialized but we don't have that yet.  We want that
for the case where we have an SPDK application that's part
of a framework that may (or may not) have already initialized
DPDK.  If it's already been initialized, let's print an
error message that isn't quite as inflammatory.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifc095245dcdef24cdeeaab2dbe791ca4e840870e

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471422
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2019-10-24 17:15:55 +00:00
Jim Harris
37c0a02e1c env_dpdk: make spdk_env_init return real errnos
The header file already says it returns negative errnos,
but the env_dpdk implementation was just returning -1
on failure.
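
To illustrate the convention only, here is a small sketch with a
hypothetical init function; toy_env_init, toy_env_opts and the size limit
are invented for the example and are not the SPDK API. The point is that
the caller can tell failures apart and print them with strerror().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MEM_MB 1024 /* hypothetical limit for this sketch */

struct toy_env_opts {
	const char *name; /* hypothetical field */
	int mem_size_mb;  /* hypothetical field */
};

/* Return 0 on success or a negative errno that names the failure. */
int
toy_env_init(const struct toy_env_opts *opts)
{
	if (opts == NULL || opts->name == NULL || opts->mem_size_mb <= 0) {
		return -EINVAL;
	}
	if (opts->mem_size_mb > TOY_MAX_MEM_MB) {
		return -ENOMEM;
	}
	return 0;
}

/* Turn a return code following this convention into readable text. */
const char *
toy_env_strerror(int rc)
{
	assert(rc <= 0);
	return rc == 0 ? "success" : strerror(-rc);
}
```

A caller that previously saw only -1 can now log something like
toy_env_strerror(rc) and react differently to bad arguments versus
resource exhaustion.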

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie2236f83094672548327dba945b33e3f28fee338

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471421
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2019-10-24 17:15:55 +00:00
Konrad Sztyber
6caed6bac8 env: add spdk_pci_device_get_type
The function allows the user to get a string representation of the type of
a PCI device.

Change-Id: I02abcd9fc98ba912ca4d7936be22e9d5b4950ea2
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470648
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
2019-10-24 17:04:04 +00:00
Ziye Yang
d9561c444f env/dpdk: Exclude the orig cpuset in spdk_unaffinitize_thread
The purpose of this patch is to exclude the CPU cores
occupied by the DPDK threads. To handle the corner
case, we only do this when the number of online CPU cores
is larger than the number of CPU cores occupied by DPDK threads.

This is done to improve performance and avoid
contention between DPDK threads and the user's own threads.
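
The selection rule can be sketched as below. To keep the example
portable it models CPU sets as 64-bit masks rather than cpu_set_t, and
the function names are hypothetical; it is an illustration of the rule
described here, not the code in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Count the CPUs present in a mask (bit i set means CPU i is in the set). */
int
cpu_mask_count(uint64_t mask)
{
	int n = 0;

	while (mask != 0) {
		mask &= mask - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Pick the CPUs an unaffinitized thread may run on: all online CPUs minus
 * the ones used by DPDK, unless that exclusion could leave nothing to run on.
 */
uint64_t
unaffinitize_mask(uint64_t online, uint64_t dpdk)
{
	assert(online != 0); /* at least one CPU must be online */

	if (cpu_mask_count(online) > cpu_mask_count(dpdk)) {
		return online & ~dpdk;
	}
	/* Corner case: DPDK occupies as many cores as are online. */
	return online;
}
```

Because the exclusion only happens when there are strictly more online
CPUs than DPDK CPUs, the resulting mask is never empty.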

Change-Id: I1a4a28074df97c55ac531440aea41059a75543f6
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471000
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: JinYu <jin.yu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-10-15 16:31:33 +00:00
Jim Harris
916f3d1471 env_dpdk: add functions to define dpdk_env make variables
Create dpdk_lib_list_to_libs and dpdk_env_linker_args
functions to generate the library filename list and the
linker arguments respectively.  Use these functions
internally as well.

These will be useful as part of the Seastar work, where
Seastar pkg-config includes a bunch of the DPDK libraries,
and SPDK needs to just add a few more.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaa6b49a8e1defacf63b3f6b414cd2e947670f8eb

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469751
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-10-01 14:01:58 +00:00