This is a new feature for the NVMe-oF RDMA target that is intended to save resource allocations (by sharing them) and to exploit locality (of completions and memory) to get the best performance with Shared Receive Queues (SRQs). We create one SRQ per core (poll group) per device and associate each created QP/CQ with an appropriate SRQ.

Our testing environment has two hosts.

Host 1:
- CPU: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz, dual socket (8 cores total)
- Network: ConnectX-5, ConnectX-5 VPI, 100GbE, single-port QSFP28, PCIe3.0 x16
- Disk: Intel Optane SSD 900P Series
- OS: Fedora 27 x86_64

Host 2:
- CPU: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, dual socket (24 cores total)
- Network: ConnectX-4 VPI, 100GbE, dual-port QSFP28
- Disk: Intel Optane SSD 900P Series
- OS: CentOS 7.5.1804 x86_64

The hosts are connected via a Spectrum switch. Host 1 runs the SPDK NVMe-oF target. Host 2 is used as the initiator, running fio with the SPDK plugin.

Configuration:
- SPDK NVMe-oF target: CPU mask 0x0F (4 cores), max queue depth 128, max SRQ depth 1024, max QPs per controller 1024
- Single NVMe-oF subsystem with a single namespace backed by a physical SSD
- fio with SPDK plugin: randread pattern, 1-256 jobs, block size 4k, IO depth 16, CPU mask 0xFFF0, IO rate 10k, rate process "poisson"

The full fio command line:

fio --name=Job --stats=1 --group_reporting=1 --idle-prof=percpu \
    --loops=1 --numjobs=1 --thread=1 --time_based=1 --runtime=30s \
    --ramp_time=5s --bs=4k --size=4G --iodepth=16 --readwrite=randread \
    --rwmixread=75 --randrepeat=1 --ioengine=spdk --direct=1 \
    --gtod_reduce=0 --cpumask=0xFFF0 --rate_iops=10k \
    --rate_process=poisson \
    --filename='trtype=RDMA adrfam=IPv4 traddr=1.1.79.1 trsvcid=4420 ns=1'

SPDK allocates the following entities for every work request in a receive queue (shared or not): reqs (1024 bytes), recvs (96 bytes), cmds (64 bytes), cpls (16 bytes), and an in-capsule data buffer. All except the last one are fixed size; the in-capsule data size is configured to 4096 bytes.

Memory consumption calculation (target), ignoring admin queues for simplicity:
- Multiple SRQ: core_num * ib_devs_num * SRQ_depth * (1200 + in_capsule_data_size)
- Multiple RQ: queue_num * RQ_depth * (1200 + in_capsule_data_size)

Cases (reproduced in the shell arithmetic sketch after this commit message):
1. Multiple SRQ with 1024 entries: Mem = 4 * 1 * 1024 * (1200 + 4096) = 20.7 MiB (a constant value that does not depend on the number of initiators)
2. RQ with 128 entries for 64 initiators: Mem = 64 * 128 * (1200 + 4096) = 41.4 MiB

Results:

| FIO jobs | kIOPS (RQ) | kIOPS (SRQ) | Bandwidth, MiB/s (RQ) | Bandwidth, MiB/s (SRQ) | Avg latency, us (RQ) | Avg latency, us (SRQ) | Max resident size, kiB (RQ) | Max resident size, kiB (SRQ) |
|---|---|---|---|---|---|---|---|---|
| 1 | 8.623 | 8.623 | 33.7 | 33.7 | 13.89 | 14.03 | 144376 | 155624 |
| 2 | 17.3 | 17.3 | 67.4 | 67.4 | 14.03 | 14.1 | 145776 | 155700 |
| 4 | 34.5 | 34.5 | 135 | 135 | 14.15 | 14.23 | 146540 | 156184 |
| 8 | 69.1 | 69.1 | 270 | 270 | 14.64 | 14.49 | 148116 | 156960 |
| 16 | 138 | 138 | 540 | 540 | 14.84 | 15.38 | 151216 | 158668 |
| 32 | 276 | 276 | 1079 | 1079 | 16.5 | 16.61 | 157560 | 161936 |
| 64 | 513 | 502 | 2005 | 1960 | 1673 | 1612 | 170408 | 168440 |
| 128 | 535 | 526 | 2092 | 2054 | 3329 | 3344 | 195796 | 181524 |
| 256 | 571 | 571 | 2232 | 2233 | 6854 | 6873 | 246484 | 207856 |

We can see the benefit in memory consumption.

Change-Id: I40c70f6ccbad7754918bcc6cb397e955b09d1033
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/428458
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
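As a quick cross-check of the two memory figures above, the same arithmetic can be reproduced in shell (a sketch; the 1200-byte fixed per-WR overhead and the 4096-byte in-capsule size are the values used in the calculation above):

# Multiple SRQ: core_num * ib_devs_num * SRQ_depth * (1200 + in_capsule_data_size)
echo $(( 4 * 1 * 1024 * (1200 + 4096) ))   # 21692416 bytes, about 20.7 MiB
# Multiple RQ: queue_num * RQ_depth * (1200 + in_capsule_data_size)
echo $(( 64 * 128 * (1200 + 4096) ))       # 43384832 bytes, about 41.4 MiB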
Storage Performance Development Kit
The Storage Performance Development Kit (SPDK) provides a set of tools and libraries for writing high performance, scalable, user-mode storage applications. It achieves high performance by moving all of the necessary drivers into userspace and operating in a polled mode instead of relying on interrupts, which avoids kernel context switches and eliminates interrupt handling overhead.
The development kit currently includes:
- NVMe driver
- I/OAT (DMA engine) driver
- NVMe over Fabrics target
- iSCSI target
- vhost target
- Virtio-SCSI driver
In this readme:
- Documentation
- Prerequisites
- Source Code
- Build
- Unit Tests
- Vagrant
- Advanced Build Options
- Shared libraries
- Hugepages and Device Binding
- Example Code
- Contributing
Documentation
Doxygen API documentation is available, as well as a Porting Guide for porting SPDK to different frameworks and operating systems.
Source Code
git clone https://github.com/spdk/spdk
cd spdk
git submodule update --init
Prerequisites
The dependencies can be installed automatically by scripts/pkgdep.sh.
./scripts/pkgdep.sh
Build
Linux:
./configure
make
FreeBSD:
Note: Make sure you have the matching kernel source in /usr/src/, and also note that the CONFIG_COVERAGE option is not currently available for FreeBSD builds.
./configure
gmake
Unit Tests
./test/unit/unittest.sh
You will see several error messages when running the unit tests, but they are part of the test suite. The final message at the end of the script indicates success or failure.
Vagrant
A Vagrant setup is also provided to create a Linux VM with a virtual NVMe controller to get up and running quickly. Currently this has only been tested on MacOS and Ubuntu 16.04.2 LTS with the VirtualBox provider. The VirtualBox Extension Pack must also be installed in order to get the required NVMe support.
Details on the Vagrant setup can be found in the SPDK Vagrant documentation.
Advanced Build Options
Optional components and other build-time configuration are controlled by settings in the Makefile configuration file in the root of the repository. CONFIG contains the base settings for the configure script. This script generates a new file, mk/config.mk, that contains final build settings. For advanced configuration, there are a number of additional options to configure that may be used, or mk/config.mk can simply be created and edited by hand. A description of all possible options is located in CONFIG.
Boolean (on/off) options are configured with a 'y' (yes) or 'n' (no). For example, this line of CONFIG controls whether the optional RDMA (libibverbs) support is enabled:
CONFIG_RDMA?=n
To enable RDMA, this line may be added to mk/config.mk with a 'y' instead of 'n'. For the majority of options this can be done using the configure script. For example:
./configure --with-rdma
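The same line can also be added to mk/config.mk by hand instead of rerunning configure (a minimal sketch; it assumes mk/config.mk does not already set CONFIG_RDMA):
echo 'CONFIG_RDMA?=y' >> mk/config.mk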
Additionally, CONFIG options may also be overridden on the make command line:
make CONFIG_RDMA=y
Users may wish to use a version of DPDK different from the submodule included in the SPDK repository. Note that this includes the ability to build not only from DPDK sources, but also against just the headers and libraries installed via the dpdk and dpdk-devel packages. To specify an alternate DPDK installation, run configure with the --with-dpdk option. For example:
Linux:
./configure --with-dpdk=/path/to/dpdk/x86_64-native-linuxapp-gcc
make
FreeBSD:
./configure --with-dpdk=/path/to/dpdk/x86_64-native-bsdapp-clang
gmake
The options specified on the make command line take precedence over the values in mk/config.mk. This can be useful if you, for example, generate a mk/config.mk using the configure script and then have one or two options (e.g. debug builds) that you wish to turn on and off frequently.
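For instance, if the generated mk/config.mk leaves debugging disabled, a one-off debug build can be produced without editing the file (a sketch; CONFIG_DEBUG is one of the options described in CONFIG):
make CONFIG_DEBUG=y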
Shared libraries
By default, the SPDK build yields static libraries against which the SPDK applications and examples are linked. The configure option --with-shared provides the ability to produce SPDK shared libraries in addition to the default static ones. Use of this flag also results in the SPDK executables being linked against the shared versions of the libraries. By default, the SPDK shared libraries are located in ./build/lib. This includes a single SPDK shared library encompassing all of the SPDK static libraries (libspdk.so), as well as individual SPDK shared libraries corresponding to each of the static ones.
In order to start an SPDK app linked with SPDK shared libraries, make sure to perform the following steps:
- run ldconfig specifying the directory containing the SPDK shared libraries
- provide a proper LD_LIBRARY_PATH
Linux:
./configure --with-shared
make
ldconfig -v -n ./build/lib
LD_LIBRARY_PATH=./build/lib/ ./app/spdk_tgt/spdk_tgt
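To confirm that the resulting binary really uses the shared SPDK libraries, its dynamic dependencies can be inspected (a sketch reusing the paths from the example above):
LD_LIBRARY_PATH=./build/lib/ ldd ./app/spdk_tgt/spdk_tgt | grep libspdk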
Hugepages and Device Binding
Before running an SPDK application, some hugepages must be allocated and any NVMe and I/OAT devices must be unbound from the native kernel drivers. SPDK includes a script to automate this process on both Linux and FreeBSD. This script should be run as root.
sudo scripts/setup.sh
Users may wish to configure a specific memory size. Below is an example of configuring 8192 MB of memory.
sudo HUGEMEM=8192 scripts/setup.sh
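When finished, the same script can rebind the devices back to their native kernel drivers (a sketch; reset is an argument handled by setup.sh):
sudo scripts/setup.sh reset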
Example Code
Example code is located in the examples directory. The examples are compiled automatically as part of the build process. Simply call any of the examples with no arguments to see the help output. You'll likely need to run the examples as a privileged user (root) unless you've done additional configuration to grant your user permission to allocate huge pages and map devices through vfio.
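For instance, after a build and a run of setup.sh, the NVMe hello_world example can be started directly (a sketch; the path assumes the in-tree build layout used by this SPDK version):
sudo ./examples/nvme/hello_world/hello_world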
Contributing
For additional details on how to get more involved in the community, including contributing code and participating in discussions and other activities, please refer to spdk.io