Spdk/include/spdk
Evgeniy Kochetov ed0b611fc5 nvmf/rdma: Add shared receive queue support
This is a new feature for the NVMeoF RDMA target. It reduces resource
allocation by sharing receive resources across connections and uses
locality (completions and memory) to get the best performance with
Shared Receive Queues (SRQs). We create one SRQ per core (poll group)
per device and associate each created QP/CQ with the appropriate SRQ.

Our testing environment has 2 hosts.
Host 1:
  CPU: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz dual-socket (8 cores total)
  Network: ConnectX-5, ConnectX-5 VPI, 100GbE, single-port QSFP28, PCIe3.0 x16
  Disk: Intel Optane SSD 900P Series
  OS: Fedora 27 x86_64
Host 2:
  CPU: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz dual-socket (24 cores total)
  Network: ConnectX-4 VPI, 100GbE, dual-port QSFP28
  Disk: Intel Optane SSD 900P Series
  OS: CentOS 7.5.1804 x86_64
The hosts are connected via a Spectrum switch.
Host 1 runs the SPDK NVMeoF target.
Host 2 is the initiator and runs fio with the SPDK plugin.

Configuration:
- SPDK NVMeoF target: cpu mask 0x0F (4 cores), max queue depth 128,
  max SRQ depth 1024, max QPs per controller 1024
- Single NVMf subsystem with single namespace backed by physical SSD disk
- fio with SPDK plugin: randread pattern, 1-256 jobs, block size 4k,
  IO depth 16, cpu_mask 0xFFF0, IO rate 10k, rate process “poisson”

Here is a full fio command line:
fio  --name=Job --stats=1 --group_reporting=1 --idle-prof=percpu \
--loops=1 --numjobs=1 --thread=1 --time_based=1 --runtime=30s \
--ramp_time=5s --bs=4k --size=4G --iodepth=16 --readwrite=randread \
--rwmixread=75 --randrepeat=1 --ioengine=spdk --direct=1 \
--gtod_reduce=0 --cpumask=0xFFF0 --rate_iops=10k \
--rate_process=poisson \
--filename='trtype=RDMA adrfam=IPv4 traddr=1.1.79.1 trsvcid=4420 ns=1'
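
The configuration lists 1-256 jobs, while the command line above shows
numjobs=1. A minimal sketch of how the sweep could be generated,
assuming only --numjobs changes between runs and that the job counts
are the powers of two that appear in the results table. The helper
names (fio_argv, job_counts) are hypothetical and not part of fio or
SPDK:

```python
import shlex

# Options copied from the command line above; --numjobs is added per run.
BASE_OPTS = [
    "--name=Job", "--stats=1", "--group_reporting=1", "--idle-prof=percpu",
    "--loops=1", "--thread=1", "--time_based=1", "--runtime=30s",
    "--ramp_time=5s", "--bs=4k", "--size=4G", "--iodepth=16",
    "--readwrite=randread", "--rwmixread=75", "--randrepeat=1",
    "--ioengine=spdk", "--direct=1", "--gtod_reduce=0", "--cpumask=0xFFF0",
    "--rate_iops=10k", "--rate_process=poisson",
]
FILENAME = "trtype=RDMA adrfam=IPv4 traddr=1.1.79.1 trsvcid=4420 ns=1"


def fio_argv(numjobs):
    """Build the fio argument vector for one run (hypothetical helper)."""
    return ["fio", *BASE_OPTS, f"--numjobs={numjobs}", f"--filename={FILENAME}"]


def job_counts(max_jobs=256):
    """Yield 1, 2, 4, ... up to max_jobs (assumed sweep, see lead-in)."""
    n = 1
    while n <= max_jobs:
        yield n
        n *= 2


for n in job_counts():
    # shlex.join quotes the --filename value so the line is shell-safe.
    print(shlex.join(fio_argv(n)))
```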

SPDK allocates the following entities for every work request in a
receive queue (shared or not): reqs (1024 bytes), recvs (96 bytes),
cmds (64 bytes), cpls (16 bytes), in_capsule_buffer. All except the
last one are fixed size and total 1200 bytes. In-capsule data size is
configured to 4096 bytes.
Memory consumption calculation (target):
- Multiple SRQ: core_num * ib_devs_num * SRQ_depth * (1200 +
  in_capsule_data_size)
- Multiple RQ: queue_num * RQ_depth * (1200 + in_capsule_data_size)
We ignore admin queues in calculations for simplicity.

Cases:
1. Multiple SRQ with 1024 entries:
   - Mem = 4 * 1 * 1024 * (1200 + 4096) = 20.7 MiB
     (constant; does not depend on the number of initiators)
2. RQ with 128 entries for 64 initiators:
   - Mem = 64 * 128 * (1200 + 4096) = 41.4 MiB
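
The two cases above can be reproduced with a short calculation. This is
a sketch of the formulas as stated in this message, not code from SPDK;
the function names are hypothetical, and the break-even helper is an
added illustration that follows from the same formulas:

```python
# Fixed per-entry cost from the message: reqs + recvs + cmds + cpls.
FIXED_BYTES = 1024 + 96 + 64 + 16  # 1200 bytes
MIB = 1024 * 1024


def srq_bytes(core_num, ib_devs_num, srq_depth, in_capsule_data_size):
    """Memory for one SRQ per core per device (admin queues ignored)."""
    return core_num * ib_devs_num * srq_depth * (FIXED_BYTES + in_capsule_data_size)


def rq_bytes(queue_num, rq_depth, in_capsule_data_size):
    """Memory for one private RQ per queue pair (admin queues ignored)."""
    return queue_num * rq_depth * (FIXED_BYTES + in_capsule_data_size)


def break_even_queues(core_num, ib_devs_num, srq_depth, rq_depth):
    """Smallest queue count at which per-queue RQs use at least as much as SRQs."""
    srq_entries = core_num * ib_devs_num * srq_depth
    return -(-srq_entries // rq_depth)  # ceiling division


case1 = srq_bytes(core_num=4, ib_devs_num=1, srq_depth=1024, in_capsule_data_size=4096)
case2 = rq_bytes(queue_num=64, rq_depth=128, in_capsule_data_size=4096)
print(f"SRQ: {case1 / MIB:.1f} MiB, RQ: {case2 / MIB:.1f} MiB")
print("break-even queue count:", break_even_queues(4, 1, 1024, 128))
```

With these settings the per-entry cost cancels out, so the break-even
point depends only on the total SRQ entries (4 * 1 * 1024) divided by
the RQ depth (128).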

Results:
FIO_JOBS  kIOPS           Bandwidth,MiB/s  AvgLatency,us   MaxResidentSize,kiB
          RQ      SRQ     RQ      SRQ      RQ      SRQ     RQ       SRQ
1         8.623   8.623   33.7    33.7     13.89   14.03   144376   155624
2         17.3    17.3    67.4    67.4     14.03   14.1    145776   155700
4         34.5    34.5    135     135      14.15   14.23   146540   156184
8         69.1    69.1    270     270      14.64   14.49   148116   156960
16        138     138     540     540      14.84   15.38   151216   158668
32        276     276     1079    1079     16.5    16.61   157560   161936
64        513     502     2005    1960     1673    1612    170408   168440
128       535     526     2092    2054     3329    3344    195796   181524
256       571     571     2232    2233     6854    6873    246484   207856

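We can see the benefit in memory consumption: SRQ uses slightly more
memory at low job counts, where its allocation is fixed, but less from
64 jobs upward. At 256 jobs the maximum resident size is 207856 kiB
with SRQ versus 246484 kiB with RQ (about 16% lower), while IOPS and
latency stay comparable.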
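
The memory trend can be read off the MaxResidentSize columns with a few
lines of arithmetic. This is a sketch over the numbers in the table
above; the helper names are hypothetical. Resident size covers the
whole target process, so it reflects more than receive queue memory
alone:

```python
# (fio_jobs, rq_kib, srq_kib) copied from the MaxResidentSize columns.
ROWS = [
    (1, 144376, 155624),
    (2, 145776, 155700),
    (4, 146540, 156184),
    (8, 148116, 156960),
    (16, 151216, 158668),
    (32, 157560, 161936),
    (64, 170408, 168440),
    (128, 195796, 181524),
    (256, 246484, 207856),
]


def srq_saving_pct(rq_kib, srq_kib):
    """Percent by which SRQ resident size is below RQ (negative if above)."""
    return 100.0 * (rq_kib - srq_kib) / rq_kib


def crossover_jobs(rows):
    """First job count at which SRQ resident size drops below RQ, or None."""
    for jobs, rq_kib, srq_kib in rows:
        if srq_kib < rq_kib:
            return jobs
    return None


for jobs, rq_kib, srq_kib in ROWS:
    print(f"{jobs:>4} jobs: SRQ saving {srq_saving_pct(rq_kib, srq_kib):6.2f}%")
print("SRQ becomes smaller at", crossover_jobs(ROWS), "jobs")
```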

Change-Id: I40c70f6ccbad7754918bcc6cb397e955b09d1033
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/428458
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-03-15 19:19:17 +00:00
assert.h include: move standard includes into spdk/stdinc.h 2017-05-08 10:11:01 -07:00
barrier.h barrier/x86: use lock+add for smp_mb() instead of mfence 2019-02-11 23:08:57 +00:00
base64.h util/base64: add base64 lib and unit tests 2018-07-19 00:50:54 +00:00
bdev_module.h bdev: Not assert but pass completion status to spdk_bdev_io_get_buf_cb 2019-02-27 01:59:11 +00:00
bdev.h bdev: Increase the size of small and large buffers to store DIF 2019-03-13 01:40:02 +00:00
bit_array.h util: added bit array bitmask load, store and clear 2018-12-14 15:34:53 +00:00
blob_bdev.h bdev: rename spdk_bdev_module_if -> spdk_bdev_module 2018-03-13 00:55:12 -04:00
blob.h lvol: add option to change clear method for lvol store creation 2019-02-28 20:50:27 +00:00
blobfs.h blobfs: add a new API to return file's unique ID 2018-08-29 16:29:22 +00:00
conf.h include/conf: add comments for public APIs 2018-02-26 11:59:09 -05:00
copy_engine.h include/copy_engine.h: add comments for callback functions 2018-05-28 01:45:03 +00:00
cpuset.h doc: fix a bunch of parameter-related Doxygen comments 2018-06-19 17:29:06 +00:00
crc16.h util/crc16: Add spdk_crc16_t10dif_copy to use in read strip and write insert 2018-12-20 17:52:29 +00:00
crc32.h util: Move architecture detection to crc32c.c 2019-02-04 19:14:22 +00:00
dif.h dif: Insert DIF into newly read data block by stream fashion 2019-03-13 01:40:02 +00:00
endian.h include: move standard includes into spdk/stdinc.h 2017-05-08 10:11:01 -07:00
env_dpdk.h env_dpdk: add spdk_env_dpdk_external_init() 2019-03-04 14:00:16 +00:00
env.h memory: add way of checking iommu usage 2019-03-05 06:45:11 +00:00
event.h event: Remove arg2 from spdk_app_start() 2019-03-05 08:43:12 +00:00
fd.h include/fd.h: add comments for pubclic APIs 2018-01-04 12:12:10 -05:00
ftl.h lib/ftl: Remove NULL pointer checks in external APIs 2019-02-08 16:35:34 +00:00
gpt_spec.h bdev/gpt: dump partition name 2017-07-12 18:12:52 -04:00
histogram_data.h histograms: add function to merge histograms 2018-11-15 23:03:26 +00:00
io_channel.h thread: Rename io_channel.h to thread.h 2018-06-12 15:24:07 +00:00
ioat_spec.h ioat: clear the internal channel error register on reset 2018-08-13 16:59:18 +00:00
ioat.h ioat: add APIs to only build descriptors 2019-02-18 07:44:17 +00:00
iscsi_spec.h iscsi: fix layout of logout request reason field 2017-09-22 16:11:11 -04:00
json.h json: add utilities function enabling itaration over JSON object 2018-10-18 16:07:37 +00:00
jsonrpc.h jsonrpc: add connection close callback 2019-01-10 14:31:37 +00:00
likely.h include: move standard includes into spdk/stdinc.h 2017-05-08 10:11:01 -07:00
log.h log: remove "trace" from public API 2018-12-03 19:50:15 +00:00
lvol.h lvol: ensure enum for lvol clear method is the same as blobstore 2019-02-28 20:50:27 +00:00
mmio.h mmio: add functions for 1 and 2 byte I/O accesses 2017-10-13 10:46:00 -04:00
nbd.h nbd: correct notes of spdk_nbd_start API 2019-02-20 01:14:18 +00:00
net.h net: make the net initialization in a correct way 2018-12-20 01:37:50 +00:00
nvme_intel.h include: move standard includes into spdk/stdinc.h 2017-05-08 10:11:01 -07:00
nvme_ocssd_spec.h ocssd: add chunk notification log struct 2018-09-27 01:30:45 +00:00
nvme_ocssd.h ocssd: add chunk notification log struct 2018-09-27 01:30:45 +00:00
nvme_spec.h nvme_spec: Add data structures for NVMe Telemetry Log page and Interrupt Coalescing Feature 2019-03-07 07:01:56 +00:00
nvme.h nvme: add spdk_nvme_connect_async() API 2019-03-14 22:37:02 +00:00
nvmf_fc_spec.h nvmf: FC-NVMe spec. header file 2018-07-06 22:49:20 +00:00
nvmf_spec.h nvme: Add the NVMe over fabrics TCP/IP transport support 2018-11-19 20:36:05 +00:00
nvmf.h nvmf/rdma: Add shared receive queue support 2019-03-15 19:19:17 +00:00
pci_ids.h nvme: add SHST_COMPLETE quirk for VMWare emulated SSDs 2019-02-27 01:46:32 +00:00
queue_extras.h scripts/check_format: check for spaces before tabs 2018-03-05 11:09:13 -05:00
queue.h check_format: Verify #include syntax 2019-01-29 00:12:07 +00:00
reduce.h reduce: remove close callback 2019-01-16 22:25:13 +00:00
rpc.h rpc: add spdk_rpc_is_method_allowed 2018-12-05 00:35:35 +00:00
scsi_spec.h scsi: add iSCSI initiator port TransportID 2018-12-05 16:04:06 +00:00
scsi.h scsi: Add an API to return DIF context of bdev and CDB 2019-03-08 01:21:26 +00:00
sock.h sock: Add spdk_sock_readv(sock, iov, iovcnt) 2019-03-08 01:21:26 +00:00
stdinc.h ftl: Added unit tests for FTL library 2019-01-22 23:22:16 +00:00
string.h string: spdk_strtol to delegate additional error checking 2019-01-29 00:10:57 +00:00
thread.h thread: add spdk_thread_is_idle() 2019-03-01 21:38:02 +00:00
trace.h lib/trace: add trace_record tool 2019-01-30 06:36:25 +00:00
util.h util: added spdk_divide_round_up() 2018-12-18 17:26:49 +00:00
uuid.h util/uuid: add a new uuid copy API. 2018-12-06 22:25:09 +00:00
version.h version: 19.04 pre 2019-02-01 09:29:12 +00:00
vhost.h vhost: remove vhost external events 2019-02-06 19:04:21 +00:00