Fully port the remaining uses of rte_ functions from DPDK to the SPDK
env library abstraction layer.
This also simplifies buffer allocation: each task only needs to be
allocated once during the initial submit_io() call, rather than using a
mempool to get/put the task on every I/O.
Change-Id: I39c8caff81bbb1467101ba3b24a389c437075c61
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/378220
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add a new struct spdk_nvme_io_qpair_opts to allow the user to override
controller options on a per-I/O qpair basis.
Existing callers with qprio == 0 can be updated to:
... = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, NULL, 0);
Callers that need to specify a non-default qprio should be updated to:
struct spdk_nvme_io_qpair_opts opts;
spdk_nvme_ctrlr_get_default_io_qpair_opts(ctrlr, &opts, sizeof(opts));
opts.qprio = SPDK_NVME_QPRIO_...;
... = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, &opts, sizeof(opts));
Change-Id: I8ac3ea369535cfde759abbe75e1d974b6450a800
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/369676
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1aee7aba522cc816f69709cfc95d12c50a5d0f4b
Reviewed-on: https://review.gerrithub.io/367279
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ibbe601489d16a9585e56de1c95fe31e9a602a7e0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/366387
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Remove dpdk_ prefix in spdk_app_opts and spdk_env_opts
Change-Id: I6f231f67072b808e84945d41b1fe31a180beb350
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/365787
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I7ccff922465195c7fe9836633196cd7a8816c11c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/365071
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
In this patch, we also update perf and identify
examples. If there is no local nvme device info
parsing, we will set dpdk initialization with no-pci
choice.
Change-Id: I58b2d291b7b53894aeb194a16798ff1c72cf25b4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/365361
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia0083365b2da63cb38aebb9f7bbc02f4dfd1ae94
Reviewed-on: https://review.gerrithub.io/365263
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
DPDK's use of getopt() needs special handling of the optind global
variable since we are passing it a separate array of arguments (not the
typical argv and argc). Set optind to 1 internally to env_dpdk so that
the apps don't need to know about it, and restore optind in case the
calling app is also using getopt().
Change-Id: Icbf07002c99fa9f94c866e8eff707124b0ef679b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/365062
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Also provide an option in perf tool let users to
disable it.
Change-Id: If4952513d77cecaa4f9403fbea811d86916ee87c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/363311
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Gracefully handle ns_count == 0 in print_performance() rather than
asserting.
Change-Id: If8f8d56a2dd4d21ddc61069555c2b90d027431f4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/363614
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: Ben Walker <benjamin.walker@intel.com>
- rename spdk_malloc_socket to spdk_dma_malloc_socket
- rename spdk_malloc to spdk_dma_malloc
- rename spdk_zmalloc to spdk_dma_zmalloc
- rename spdk_realloc to spdk_dma_realloc
- rename spdk_free to spdk_dma_free
Change-Id: I52a11b7a4243281f9c56f503e826fd7c4a1fd883
Signed-off-by: John Meneghini <johnm@netapp.com>
Reviewed-on: https://review.gerrithub.io/362604
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
A single -L can be used to get the latency summary.
Two -L's (or -LL) can be used to get both the latency
summary and the detailed histogram.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3fc0f4e2dfff7b041a665fe35aa33f11e4c3ebad
Reviewed-on: https://review.gerrithub.io/362270
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0c47f2086d4f895cd75f32efc7df30d7182adcb1
Reviewed-on: https://review.gerrithub.io/362269
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The latency tracking is done with ranges of bucket arrays.
The bucket for any given I/O is determined solely by TSC
deltas - any translation to microseconds is only done after
the test is finished and statistics are printed.
Each range has a number of buckets determined by a
NUM_BUCKETS_PER_RANGE value which is currently set to 128.
The buckets in ranges 0 and 1 each map to one specific TSC
delta. The buckets in subsequent ranges each map to twice
as many TSC deltas as buckets in the previous range:
Range 0: 1 TSC each - 128 buckets cover deltas 0 to 127
Range 1: 1 TSC each - 128 buckets cover deltas 128 to 255
Range 2: 2 TSC each - 128 buckets cover deltas 256 to 511
Range 3: 4 TSC each - 128 buckets cover deltas 512 to 1023
Range 4: 8 TSC each - 128 buckets cover deltas 1024 to 2047
Range 5: 16 TSC each - 128 buckets cover deltas 2048 to 4095
etc.
While here, change some variable names and usage
messages to differentiate between the existing latency
tracking via vendor-specific NVMe log pages on Intel
NVMe SSDs, and the newly added latency tracking done
in software.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I299f1c1f6dbfa7ea0e73085f7a685e71fc687a2b
Fix up all existing spacing errors in comments and add an automated
check for patterns like /*comment*/.
Change-Id: I28f61c93612dc0f8aed66bd509da78e91ea9737e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is required by libaio. Previously, buffers were aligned to 512
bytes, but 4K devices need 4K-aligned buffers.
Change-Id: I96080e72dc77e0e72f426f7c9fe98b6724f66e1b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Several examples have a function to associate workers
with threads. Simplify that algorithm.
This seems to just shift some of the complexity
from register_workers down to main, but in the long
run the DPDK threading will get abstracted into
env as well and greatly simplify that part.
Change-Id: Ic106dde58fa5351a1ce0a058161b08062e121d3b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Since we know the queue depth that we will be using during the test,
request that as the queue size when attaching to NVMe controllers.
Reserve one extra queue entry above the expected queue depth since NVMe
queues must always have one entry free to distinguish between queue
empty and queue full cases.
Change-Id: I809982207edb4894148aec09b10c4e2de4a040d3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
By default, all SPDK applications will not share memory.
To share memory, start the applications with the same
shared memory id.
Change-Id: Ib6180369ef0ed12d05983a21d7943e467402b21a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Map the DPDK hugepage virtual address space to an area that should not
interfere with randomized mmap() addresses.
Change-Id: Iffc657858f861fc1316f77b68f9f121167d604b1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows the user to connect to multiple remote NVMe-oF targets or to
specify multiple specific PCIe device addresses to test.
Change-Id: I05b2072b8aa1480891b37b17b5207369344b617d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The memory allocation is based on user specified queue depth,
number of attached active namespaces(aio files) and number of
cores involved in the IO operations.
Change-Id: I370b9fdacc1bb40d110bec7e96adac2424d39431
Signed-off-by: GangCao <gang.cao@intel.com>
They were very close to the same already, so finish the job.
Change-Id: Ifba9e3b2d11a3e70cbfbe46f57a67552db2757ed
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This isn't used yet in the NVMe library, but it will be necessary later
for supporting non-IPv4 addresses.
Change-Id: I167ce63ad25b0e0c9aa192b12d764c8d078e67f9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The probe_info was reduced to just containing a
transport_id, so remove probe_info entirely.
Change-Id: Ica9a22d126cd14e282decd3eea1a0afe0460f099
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This can be obtained by parsing traddr into a pci_addr,
then getting a handle to the pci_dev and asking for all
of the pci information.
Change-Id: I1948cbd3ec65611293192ef5558ace19dd444d4c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Instead of repeating the fields, just embed a transport_id.
Change-Id: I282704c9d59784abd5f7c93be4e47c673fcf6dde
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is a small step toward making discovery more like
scanning a local PCI bus.
Change-Id: Ie7149ad060f2eeb56939b1241187bdf09681f2aa
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
spdk_nvme_probe() will now provide a struct spdk_nvme_probe_info to the
probe and attach callbacks in place of the PCI device pointer.
This struct contains the useful information that could be retrieved from
the PCI device during probe.
The goal of this change is to allow expansion of the probe information
in the future when other transports (specifically, NVMe over Fabrics)
are added that do not necessarily use PCI addressing or device IDs.
Change-Id: I59a2a9e874e248ce5fa1d7f4b57c8056962ff3cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the env library to perform all memory allocations
that previously called DPDK directly.
Change-Id: I6d33e85bde99796e0c85277d6d4880521c34f10d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This allows users to swap their PCI library from
libpciaccess/dpdk to another mechanism using the standard
method for swapping out the env library.
Change-Id: Ib2248f8b43754a540de2ec01897e571f0302b667
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This patch also drops support for automatically unbinding
devices from the kernel - run scripts/setup.sh first.
Our generic pci interface is now hidden behind include/spdk/pci.h
and implemented in lib/util/pci.c. We no longer wrap the calls
in nvme_impl.h or ioat_impl.h. The implementation now only uses
DPDK and the libpciaccess dependency has been removed. If using
a version of DPDK earlier than 16.07, enumerating devices
by class code isn't available and only Intel SSDs will be
discovered. DPDK 16.07 adds enumeration by class code and all
NVMe devices will be correctly discovered.
Change-Id: I0e8bac36b5ca57df604a2b310c47342c67dc9f3c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Writing 0's hits SSD firmware special cases and gives
unrealistically high performance numbers.
Change-Id: I73c72ee52494075e354dcddd067e3ce49c156204
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This will allow removal notifications to be propagated to the library
user (e.g. for hotplug).
The callback is currently unused, but this at least prepares the API for
the future hotplug support.
Based on a patch by Dave Jiang <dave.jiang@intel.com>
Change-Id: I20b1c2dbf5e084e0b45a7e51205aba4514ee9a95
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use O_WRONLY flag for write IO
Cleanup io_context_t and io_event when perf exits
Change-Id: Iefa1d8be5e017a1ca5719489c1ec4b868df94722
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Free memory of worker_thread, ns_entry and ns_worker_ctx when perf
exits.
Change-Id: I4707eea31ca1a1c4a9ce6ded857c4576e57b4532
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Provide a new structure, spdk_nvme_ctrlr_opts, to let the user modify
the default controller initialization options during probe/attach.
Currently, only the number of queue pairs can be modified in this way;
other options will be added later.
Change-Id: Ie27b9429291d93a9353c0d820f0ad467d3b0e7cb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The previous method for registering I/O queues did not allow the user
to specify queue priority for weighted round robin arbitration, and it
limited the application to one queue per controller per thread.
Change the API to require explicit allocation of each queue for each
controller using the new function spdk_nvme_ctrlr_alloc_io_qpair().
Each function that submits a command on an I/O queue now takes an
explicit qpair parameter rather than implicitly using the thread-local
queue.
This also allows the application to allocate different numbers of
threads per controller; previously, the number of queues was capped at
the smallest value supported by any attached controller.
Weighted round robin arbitration is not supported yet; additional
changes to the controller startup process are required to enable
alternate arbitration methods.
Change-Id: Ia33be1050a6953bc5a3cca9284aefcd95b01116e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This function returns true if the namespace is active or false if it is
inactive (e.g. no namespace has been attached to the specified namespace
ID yet).
Also use the new function to add checks in the examples and tests where
applicable.
Change-Id: I35465b315ae1a1677c5a82191ad9b1da1c216d50
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rename all functions with a spdk_ prefix, and provide enough of an API
to avoid apps needing to #include <pciaccess.h>.
The opaque type used in the public API for a PCI device is now
struct spdk_pci_device *.
Change-Id: I1e7a09bbc5328c624bec8cf5c8a69ab0ea8e8254
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The new probing API will find all NVMe devices on the system and ask the
caller whether to attach to each one. The caller will then receive a
callback once each controller has finished initializing and has been
attached to the driver.
This will enable cleanup of the PCI abstraction layer (enabling us to
use DPDK PCI functionality) as well as allowing future work on parallel
NVMe controller startup and PCIe hotplug support.
Change-Id: I3cdde7bfab0bc0bea1993dd549b9b0e8d36db9be
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Optionally enable and display the I/O latency histograms as reported by
the hardware if supported (e.g. Intel DC P3x00 NVMe devices).
Change-Id: I5c0138d51a282138b74f36fe8e1461c9444e6d0f
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Also add a space between Copyright and (c).
The copyright year can be determined using git metadata.
Also remove the duplicated "All rights reserved." - every instance of
this line already has a corresponding "All rights reserved" immediately
below it, except for examples/ioat/kperf/kmod/dma_perf.c, where I have
added it manually.
Performed using this command:
git ls-files | xargs sed -i -e 's/Copyright(c) \(.*\) Intel Corporation. All rights reserved./Copyright (c) Intel Corporation./'
Change-Id: I3779f404966800709024eb1eb66a50068af2716c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Support for the Force Unit Access and Limited Retry
bits on reads and writes.
Change-Id: I9860848358377d63a967a4ba6ee9c061faf284d4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This more accurately represents what function it performs.
Also remove pci_device_has_uio_driver() from the public API. Callers
should use pci_device_has_non_uio_driver() instead.
Change-Id: I9623fe1345b43e981d5823804e33d01ac0d3bb1c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This doesn't really matter, since the program will be exiting
immediately if associate_workers_with_ns() fails, but it makes static
analyzers happy.
Change-Id: Ic21d234dec50bd2b6684b5fe2caa78d616f93052
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If the I/O size is larger than the total namespace size or smaller than
the block size, ignore that namespace in the perf utility.
Change-Id: I297303d8c41ceb36eef91c6c33da809a35758f4e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch is used to remove the unused pci_dev parameter from the nvme
perf utility functions that no longer need it.
Change-Id: Ib139b080b7668aed712b4489c5ee95bd2fa2b350
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvme_ctrlr_process_io_completions() now takes a second parameter,
max_completions, to let the user limit the number of I/Os completed on
each poll.
If there are many I/Os waiting to be completed, the
nvme_ctrlr_process_io_completions() function could run for a long time
before returning control to the user, so the max_completions parameter
lets the user have more control of latency.
Change-Id: I3173059d94ec1cc5dbb636fc0ffd3dc09f3bfe4b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Previously, as soon as a worker thread failed, the program would exit
before printing results.
Also add a message at exit time if any errors occurred during test
execution.
Change-Id: I7b3920f0acb8ce364e2bc5cbb78bbe88f3fa7146
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This matches the size of the request pool and enables running with a
higher concurrency level.
Ideally, these limits should be calculated from the requested queue
depth and number of workers, but for now, just increase the hardcoded
limit.
Change-Id: I6e890efc78a1336dddc0ab61db20c68004b30f54
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Previously, if the work function on the master lcore failed, the perf
program would still return with a successful exit code.
Change-Id: Iec91c1f60824759ae62476ef6dce670fd402ddc5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Do not continue running the thread work function if that thread could
not get an I/O queue.
Change-Id: I89033250bde0663f073ff35c76d1558d55b72ece
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Intelligently allocate cores and devices to handle
the following cases:
1) Equal cores and devices
2) More cores than devices by using multiple cores per device
3) More devices than cores by using multiple devices from a single core
Change-Id: I3703f5c523268539bd00d399fe104c474a8e8c99
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This helps weed out functions that should be static, functions that are
not declared in public header files, and .c files that don't include
their .h interface headers.
Change-Id: Ie39f83ad4b320847e4a938bd1d4d0b4fa21c2ffa
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Fix all of the uses of __thread so they are at the beginning (similar to
e.g. static).
Don't actually enable -Wold-style-declaration, since clang doesn't
understand that.
Change-Id: I0dcbb758143eab90fc978334c8f256c6602cc4cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows comparing the Linux kernel driver's performance to the SPDK
user-mode NVMe driver.
Change-Id: I71c70163a4133c2f237c8c57b3c698ec261455f5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Like sprintf() with automatic buffer allocation.
This should help to avoid fixed-size buffers in
non-performance-sensitive code that formats strings.
Change-Id: I35209ae84014ed5daf41baa5b03af8a5f6b02b8e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This helps enable FreeBSD, where pciaccess pci_device_has_kernel_driver()
is not functional. The function will return 0 if there is no driver
attached, or the Linux uio or FreeBSD nic_uio driver is attached. It will
return 1 otherwise.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0921e61c9040b1e0411b5dc40b36fc7f2721c8c5