Commit Graph

174 Commits

Author SHA1 Message Date
Ben Walker
d31eb732af nvmf/tcp: Allocate pdu pool out of hugepages
It is faster for the kernel to pin memory in hugepages, so allocate
the pdu pool from hugepages. This will help more
with upcoming changes to leverage MSG_ZEROCOPY.

Change-Id: I9ce581acca9c6edb71bd8119258966e3b405db77
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475801
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>
2020-01-08 15:47:08 +00:00
Ben Walker
053fa66b10 nvmf/tcp: Minimize the places where the tqpair state changes
All transitions to the EXITING state go through the disconnect function
now

Change-Id: Ia55816351b2998bfef26130b6ffdc4a1010567a1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470533
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-01-08 15:47:08 +00:00
Ben Walker
04a4aab2e0 nvmf/tcp: Simplify handling of spdk_nvmf_tcp_pdu_get failures
This function can't actually return NULL. It aborts if we get
our math wrong.

Change-Id: Iaf77112addc3c14c70755a56043c5dba3427890d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478911
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-01-08 15:47:08 +00:00
dongx.yi
f7e8827aa6 nvmf/tcp: Using spdk_min instead of multi-lines codes.
We can use spdk_min to get the copy_len in spdk_nvmf_tcp_send_c2h_term_req.
It confirms copy_len it's not larger than SPDK_NVME_TCP_TERM_REQ_ERROR_DATA_MAX_SIZE

Signed-off-by: dongx.yi <dongx.yi@intel.com>
Change-Id: Id343928e1911e4ab77fca7463f3f0cc55889db30
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479118
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-01-08 09:12:20 +00:00
Jacek Kalwas
5b87daa92f nvmf/tcp: remove redundant memset
Minor optimisation done by code analysis, both cmd and dif are
overridden in TCP_REQUEST_STATE_NEW.

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I6bae4ddae175035d029c0693f7e4351b95a296ab
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478604
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-01-03 08:31:52 +00:00
dongx.yi
cb7da325bb lib/nvmf: Remove unnecessary return.
It's not wrong, just to keep consistency with other functions.
So remove these.

Signed-off-by: dongx.yi <dongx.yi@intel.com>
Change-Id: I833211ea8ee6c6b02c874ea340a3f936a0c4c00f
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478684
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-24 08:12:40 +00:00
Ziye Yang
8d51277046 nvmf/tcp: remove the unnecessary error info.
It will be the expected behavior when the error message will
printed if we use asynchrounous I/O. And the real
error message for not getting the tcp_req is located in
spdk_nvmf_tcp_capsule_cmd_hdr_handle.

Change-Id: I1a608fbd3a04050eacb6cb68eafd50e5128925ab
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477872
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:42:11 +00:00
dongx.yi
6b5f764856 nvmf/tcp: fix wrong judgement of ipv6.
Here should check spdk_sock_is_ipv6.

Signed-off-by: dongx.yi <dongx.yi@intel.com>
Change-Id: I828c322b79f6d1ac3f9e004d6062358c1d567d4e
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478142
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-12-18 09:37:12 +00:00
Jacek Kalwas
94507133eb nvmf/tcp: rm set_state in spdk_nvmf_tcp_capsule_cmd_hdr_handle
TCP_REQUEST_STATE_NEW is already set in spdk_nvmf_tcp_req_get.

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ia835f3763cd74ef9b504901c719d9954317f49af
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476164
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-16 12:34:28 +00:00
Ben Walker
5d497f6cf5 nvmf/tcp: Use writev_async for sending data on sockets
This eliminates the flushing logic, simplifying the tcp
transport.

This also happens to greatly improve performance, especially
on random read tests. The batching done in spdk_sock_writev_async seems
to be more effectively than the previous batching logic in the tcp
transport.

Change-Id: Id980ac6073e380dc75f95df3f69cb224f50fb01b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470532
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-12-16 12:34:02 +00:00
Jacek Kalwas
f206551388 nvmf: fix status override in case parse_sgl fails
It is valuable to have more detail status instead
SPDK_NVME_SC_INTERNAL_DEVICE_ERROR.

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: Ifd003b490a7ae9af017645c97636ceaf2f93d4b0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476634
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-09 14:02:37 +00:00
Changpeng Liu
bc13d02237 nvmf: move transport spdk_nvmf_*_req_get_xfer() function into the common nvmf library
Change-Id: I1619cc9b3feea1feb16282dc6c9cc8d5a380282c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475952
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-12-06 14:43:41 +00:00
Jacek Kalwas
155c3babce nvmf/tcp: rm qpair destroy from poll_group_add
Destroy in poll_group_add results in heap-use-after-free because
upper layer calls qpair_fini in case poll_group_add returns
error.

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I3e921a21b7ab5f7c15c80bc5919cb97cbda0b5d2
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475858
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-11-28 12:36:36 +00:00
Ziye Yang
4579a16f30 lib/nvmf: Add a new state to wait for the req slot
Also need to update the spdk_nvmf_tcp_poll_group_poll.
Since if the tqpair recv state in wait_for_req,
we may already received the data, and there could be
not epoll event.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9c5a202e47e57aaba63da143f954a20c135a98ae
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473626
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-11-15 20:25:15 +00:00
Ziye Yang
08273e77de tcp: Fix no tcp_req issue while using async writev later.
Purpose: But if we use asynchronous writev
for pdu sending, the call_back of writev may occur
after the new data coming. So it means that the
free tcp request may not be available.
So we use the strategy to check the request status
in TCP_REQUEST_STATE_TRANSFERRING_CONTROLLER_TO_HOST.

So the strategy is checking the state_cntr of all the
reqs in TCP_REQUEST_STATE_TRANSFERRING_CONTROLLER_TO_HOST
state.
1 If the state_cntr > 0, we should queue
the new request.
2 If the statec_cntr == 0, it means that
there is no available slot for the new tcp request
, i.e., the new nvme command comming from the initiator.
If we receive this, it means that the initiator sends more
requests,and we should reject it.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ifbeb510e669082cb7b80faf2e7987075af31d176
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472912
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-11-08 22:17:42 +00:00
Ziye Yang
e19fd311fc nvmf/tcp: Add ttransport variable in spdk_nvmf_tcp_sock_process
To avoid the allocation of ttransport in the sub functions,
and it makes the code much efficient.

Change-Id: Ie4c5a1755ddbecf10dc364ff811f74a7af5f9c3b
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473003
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-11-08 22:17:42 +00:00
Ziye Yang
e9be9df45f nvmf/tcp: Fix the potential issue of connection construction.
When we use async writev (e.g., lib io_uring), we find that
the callback of writev is executed after recving the new
data from the initiator, and this is possible.

For example, if the NVMe-oF TCP target receives the ic_req from the
initiator, and sendout the ic_resp, the state  of tqpair will change from
invalid to running until the callback is executed. And the data of ic_resp
is already sent to the initiator, and we receive the new command later. However,
we may still not get the call back function executed
(i.e, spdk_nvmf_tcp_send_icresp_complete). And it is possible
for using lib io_uring, I faced this issue when using lib uring.

And this patch can fix this issue.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I7f4332522866d475e106ac6d36a8ec715133f0dc
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472770
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-11-07 23:08:17 +00:00
Jim Harris
262ecf0ec5 nvmf/tcp: stop trying to accept when no more socks
The loop is intended to accept multiple socks when
available, but once accept returns NULL, there's no
reason to keep trying.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I896908d276da35bc3fff172c1c17e22abd2a5343

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473234
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-11-06 14:47:05 +00:00
Ben Walker
34385d80a3 nvmf/tcp: Add pointer to qpair from PDU
It's important to be able to recover full context from just
the PDU in the future.

Change-Id: I3d1f3c326299b1237b42dbe33d340a282c3bc5bb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470531
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
2019-11-01 17:56:16 +00:00
Ben Walker
83ffb2075e nvme/tcp: Rename pdu->ctx to pdu->req
This is always the request pointer, so rename it for clarity.

Change-Id: Ifbda7db7787c65f0deb190a1e94f0676b2c0d99a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470530
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-11-01 17:56:16 +00:00
Ben Walker
78a11548da nvmf/tcp: Move duplicated disconnect code to a function
Change-Id: Ib3daec83ec518a0934911e04d771c19cb34b6167
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470529
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-11-01 17:56:16 +00:00
Ben Walker
811a66e97e nvmf/tcp: Use the new sock_is_connected function during shutdown
Change-Id: I3cf8765bbbcddaeda731188c7911b1966b953bc4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470514
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-11-01 17:56:16 +00:00
Ben Walker
5f856f4d65 nvmf/tcp: No longer set sndbuf size
Use whatever size the socket layer thinks is best. In practice,
this is the same size as before.

Change-Id: I4820e16d8da6e566d1f8f078a75d345399f64ab5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470511
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-11-01 17:56:16 +00:00
Ziye Yang
2ec99adad9 nvmf/tcp: fix the state machine issue if data is already read.
Since we use big buffer to read the data, so the incoming
data may already be read when the req is waiting for the buffer.
So if we use the orginalstatement machine, there will be no
read event will be generated again.

The quick solution is to restore the original code, since
for req which has incapsule data, we not need to wait for the
buffer from the shared buffer pool.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ib195d57cc2969235203c34664115c3322d1c9eae
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472047
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-24 18:00:00 +00:00
Jan Kryl
0a04c076ea nvmf: Add context parameter to new_qpair() callback
It can be useful for passing additional information about nvmf
target to a handler for new nvmf connections. Context can be
stored in globals as it is currently done in nvmf code. However
in case of multiple targets or languages where accessing global
state is challenging (i.e. Rust), this becomes inconvenient.

Change-Id: Ia6a2fdba4601531822b3e5fda7ac5ab89d46f6c5
Signed-off-by: Jan Kryl <jan.kryl@mayadata.io>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469263
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
2019-10-17 16:29:36 +00:00
Alexey Marchuk
fcd652f5e3 tcp: Use nvmf_request dif structure
Change-Id: I215da84d9f27fbc2614ce70ae36ed024ce107a4d
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470467
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-11 15:36:19 +00:00
Ziye Yang
fd98a83ce7 nvmf/tcp: re-organize spdk_nvmf_tcp_req
Run the following command:
pahole ./app/nvmf_tgt/nvmf_tgt -R -C spdk_nvmf_tcp_req

It tells me change the bool definition location
of dif_insert_or_strip.

Change-Id: Ia43ab62bcc223a07e6415b2c769fe4af2b097f18
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470401
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-10-08 01:45:47 +00:00
Shuhei Matsumoto
c8734543bc nvmf/tcp: Simplify spdk_nvmf_tcp_req_parse_sgl()
By passing the pointer to struct spdk_nvmf_transport_poll_group
to spdk_nvmf_tcp_req_parse_sgl(), we can remove spdk_nvmf_tcp_req_fill_iovs()
and inline spdk_nvmf_request_get_buffers() into spdk_nvmf_tcp_req_parse_sgl().

Pointers to struct spdk_nvmf_request are used in many lines of
spdk_nvmf_tcp_req_parse_sgl(). Caching and using them simplifies and
improves readability a little for spdk_nvmf_tcp_req_parse_sgl().

We can pass pointer to not struct spdk_nvmf_tcp_transport but struct
spdk_nvmf_transport to spdk_nvmf_tcp_req_parse_sgl().

Ordering the pointer to struct spdk_nvmf_tcp_req first in parameters
of spdk_nvmf_tcp_req_parse_sgl() matches the function name.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9f0d33b48383800c3b0a738eb24b11ffed7e6e60
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469640
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-10-01 14:04:19 +00:00
Shuhei Matsumoto
c0ee8ef7d5 nvmf: Merge each transport's fill_buffers() into spdk_nvmf_request_get_buffers()
This patch is close to the end of the effort to unify buffer allocation
among NVMe-oF transports.

Merge each transport's fill_buffers() into common
spdk_nvmf_request_get_buffers() of the generic NVMe-oF transport.

One noticeable change is to set req->data_from_pool to true not in
each specific transport but in the generic transport.

The next patch will add spdk_nvmf_request_get_multi_buffers() for
multi SGL case of RDMA transport.

This relatively long patch series is a preparation to support
zcopy APIs in NVMe-oF target.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Icb04e3a1fa4f5a360b1b26d2ab7c67606ca7c9a0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469205
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-09-30 21:11:52 +00:00
Shuhei Matsumoto
7c7a0c0a68 nvmf: Pass not num_buffers but length to spdk_nvmf_request_get_buffers()
The subsequent patches unifies getting buffers, filling iovecs, and
filling WRs in a single API. This is a preparation.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I077c4ea8957dcb3c7e4f4181f18b04b343e9927d
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468953
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
2019-09-26 16:12:28 +00:00
Shuhei Matsumoto
79945ef0ed nvmf: Hold number of allocated buffers in struct spdk_nvmf_request
This patch makes multi SGL case possible to call spdk_nvmf_request_get_buffers()
per WR.

This patch has an unrelated fix to clear req->iovcnt in
reset_nvmf_rdma_request() in UT. We can do the fix in a separate patch
but include it in this patch because it is very small.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If6e5af0505fb199c95ef5d0522b579242a7cef29
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468942
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-09-26 16:12:28 +00:00
Shuhei Matsumoto
34a0d851f6 nvmf/tcp: Return DIF error to initiator instead of severe disconnection
On a DIF verification error, fail the read command with a status code
of APPLICATION_TAG_CHECK_ERROR, GUARD_CHECK_ERROR, or
REFERENCE_TAG_CHECK_ERROR and a status code type of SCT_MEDIA_ERROR.

The state of the request is TCP_REQUEST_STATE_TRANSFERRING_CONTROLLER_TO_HOST
when a DIF verification error is detected. So dequeue the request
from C2H data queue, return the response PDU, and then send the command
response.

This was an item on the TODO list. RDMA transport do this right
behavior from the start and so TCP transport follows it by this patch.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I102bbd253cc8c1379d0937c9536bf2bfe04cbf6a
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468911
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-09-24 17:04:28 +00:00
Shuhei Matsumoto
ddd97a8b3b nvmf/tcp: Move setting orig_length to the location the value is fixed at
tcp_req->orig_length had been set just before I/O submission but
the value is already fixed in spdk_nvmf_tcp_req_parse_sgl().
Hence move setting tcp_req->orig_length accordingly.

This follows the good practice of RDMA transport.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I99f6e266d8f7027bce810864314f3ee24a1af10c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468910
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-09-24 17:04:28 +00:00
Michal Ben Haim
62615117f7 SPDK: changing TREQ value from 'not specified' to 'not required'.
Signed-off-by: Michal Ben Haim <michal.benhaim@kaminario.com>
Change-Id: Ia7bda5b18db24df97172d4500a499c4635d592d5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467499
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-09-10 17:51:26 +00:00
Ben Walker
59e34aa865 nvmf/tcp: Don't set socket recvbuf size anymore
The default behavior is to set it to 2MB, so this isn't
required anymore.

Change-Id: I62d7605cd4d5bc41347128f32f9a1aa373a15744
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466993
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-09-10 17:48:49 +00:00
Ziye Yang
24eb7a84b0 nvme/tcp: fix the iov vector count.
Since we use pdu->data_iovcnt to
build the iov in nvme_tcp_build_iovs, so
send out pdu has the maximal iov number
equals to: 2 + pdu->data_iovcnt,
so we change the comparison.

This makes sure that we can handle all the data
owned by one pdu.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I2b9258cc5716d706c0fa38af609726c439708768
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467207
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-09-09 02:08:31 +00:00
Shuhei Matsumoto
9796768132 nvmf: Move pending_data_buf_queue to common struct spdk_nvmf_transport_poll_group
This unifies buffer management among transports further and is a
preparation to make buffer allocation asynchronous.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I8c588eeac4081f50fe32605feb7352f72c628d95
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466847
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-09-09 00:42:22 +00:00
Shuhei Matsumoto
b4778363b4 nvmf/tcp: Pass nvmf_request to nvmf_tcp_req_fill_buffers
Most variables related with I/O buffer are in struct spdk_nvmf_request
now. So we can pass nvmf_request instead of nvmf_tcp_req to
nvmf_tcp_req_fill_buffers and do it in this patch.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I00eff578a98891e99fcb9a3aafa3d99126d6f1c1
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466089
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-08-30 16:56:46 +00:00
Shuhei Matsumoto
2bc819dd52 nvmf/tcp: Use STAILQ for queued_c2h_data_tcp_req and pending_data_buf_queue
This is a small performance optimization and an effort to unify
I/O buffer management further among transports.

It is ensured that the request is the first of STAILQ when
spdk_nvmf_tcp_send_c2h_data() is called or the case
TCP_REQUEST_STATE_NEED_BUFFER is executed in spdk_nvmf_tcp_req_process().

Hence change TAILQ_REMOVE to STAILQ_REMOVE_HEAD for these two cases.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0b195874ac22a8d5ecfb283a9865d2615b7d5912
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466637
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-08-30 16:56:46 +00:00
Ziye Yang
5e7b8d18f3 nvmf/tcp: Remove the potential pdu hdr memory copy.
In this patch, we directly point the hdr_p
to the memory owned by the pdu_recv_buf to avoid
memory copy.

Change-Id: Iee0dd98058928f429bf7ad22103cd4826226400f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465158
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-08-30 02:25:22 +00:00
Shuhei Matsumoto
8a80461ac6 nvmf/tcp: execute buffer allocation only if request is the first of pendings
RDMA transport executes spdk_nvmf_rdma_request_parse_sgl() only if
the request is the first of the pending requests in the case
RDMA_REQUEST_STATE_NEED_BUFFER in the state machine
spdk_nvmf_rdma_requests_process().

This made RDMA transport possible to use STAILQ for pending requests
because STAILQ_REMOVE parses from head and is slow when the target is in
the middle of STAILQ.

On the other hand, TCP transport executes spdk_nvmf_tcp_req_parse_sgl()
even if the request is in the middle of the pending request in the case
TCP_REQUEST_STATE_NEED_BUFFER in the state machine
spdk_nvmf_tcp_req_process() if the request has in-capsule data.

Hence TCP transport have used TAILQ for pending requests.

This patch removes the condition if the request has in-capsule data
from the case TCP_REQUEST_STATE_NEED_BUFFER.

The purpose of this patch is to unify I/O buffer management further.

Performance degradation was not observed even after this patch.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idc97fe20f7013ca66fd58587773edb81ef7cbbfc
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466636
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-08-29 18:17:38 +00:00
Shuhei Matsumoto
9968035884 nvmf/tcp: Replace TCP specific get/free_buffers by common APIs
Use spdk_nvmf_request_get_buffers() and spdk_nvmf_request_free_buffers(),
and then remove spdk_nvmf_tcp_request_free_buffers() and
spdk_nvmf_tcp_request_get_buffers().

Set tcp_req->data_from_pool to false after spdk_nvmf_request_free_buffers().

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I286b48149530c93784a4865b7215b5a33a4dd3c3
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465876
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-08-29 18:17:38 +00:00
Shuhei Matsumoto
005b053a02 nvmf: Move data_from_pool flag to common struct spdk_nvmf_request
This is a prepration to unify buffer management among transports.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6b1c208207ae3679619239db4e6e9a77b33291d0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466002
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-08-29 18:17:38 +00:00
Shuhei Matsumoto
04ae83ec93 nvmf: Move allocated buffer pointers to common struct spdk_nvmf_request
This is a preparation to unify buffer management among transports.
struct spdk_nvmf_request already has SPDK_NVMF_MAX_SGL_ENTRIES (16) * 2
iovecs. Hence incresing the number of buffers twice will be no problem.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idb525abbf35dc9f4b8547b785b5dfa77d106d8c9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465873
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-08-29 18:17:38 +00:00
Ziye Yang
d50736776c nvmf/tcp: Use a big buffer for PDU receving.
Purpose: Reduce the recv/readv system call.
Method: Use a big recv buffer to conduct the read.
Though it will introduce addtional buffer copy,
we hope that the overhead introduced by buffer copy will
be smaller compared with frequent recv/readv system call overhead.
And the design is to make a trade off between them.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9286fd9cec0b512cea8e3f2c335c5bf862b98573
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464842
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-08-28 15:38:02 +00:00
Ziye Yang
ea5ad0b286 nvme/tcp: Change hdr in nvme_tcp_pdu to pointer
Purpose: Prepare the further optimnization in the
target side whening receving pdu headers, we expect
to use zero copy.

Change-Id: Iae7f9106844736d7160d39d0af1f5941084422ec
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465380
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-08-28 15:38:02 +00:00
Shuhei Matsumoto
eab7360bcb nvmf/tcp: Factor out getting and filling buffers from nvmf_tcp_req_fill_iovs
This follows the practice of RDMA transport and is a preparation to
unify buffer allocation among transports.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib85625f2a0eca01ef4028685dd838d6c41faad7b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465872
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-08-26 19:04:24 +00:00
Shuhei Matsumoto
72c10f7094 nvmf/tcp: Use spdk_mempool_get_bulk in nvmf_tcp_req_fill_iovs
This follows the practice of RDMA transport and a preparation to
unify buffer management among transports.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4e9b81b2bec813935064a6d49109b6a0365cb950
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465871
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-08-26 19:04:24 +00:00
Shuhei Matsumoto
8aac212005 nvmf/tcp: Pass number of alloc buffers s as param to nvmf_tcp_request_free_buffers
This is a preparation to the next patch to use spdk_mempool_get_bulk.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I28a5ad941004f139c9032d85c2ef92680081f1ce
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465870
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-08-26 19:04:24 +00:00
Ziye Yang
1917d3b413 nvmf: move the assigment of pdu outside the switch
Purpose: To reduce the duplicated code.

And one minor fix: add an empty line between two functions

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I12c9ddba6526c094cd2bd945e14f9d8bf5209adf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464504
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-08-09 07:37:12 +00:00
Ziye Yang
73d9cef8c5 nvmf/tcp: add nvme_tcp_pdu_cal_psh function.
Purpose:

1 Do not caculated the psh_len every time.
2 Small fix, for ch_valid_bypes, and psh_valid_bytes,
we do not need to use uin32_t.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I9b643da4b0ebabdfe50f30e9e0a738fe95beb159
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464253
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-08-07 01:46:54 +00:00
Shuhei Matsumoto
cf95d4a24f sock: Fix return value of spdk_sock_group_poll to return number of events
spdk_sock_group_poll() and spdk_sock_group_poll_count() had returned
0 on success. The implementation didn't match the specification
described in the header file, and couldn't be used to collect stats
correctly because 0 means idle.

This patch fixes the return value of spdk_sock_group_poll() and
spdk_sock_group_poll_count() to return number of events and
the callers not to overwrite the return value by 0.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7e2a17187fc74ea44d3acf2f35d63f5e5a254eda
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463710
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-08-02 00:19:43 +00:00
Ziye Yang
6d4f580e79 nvmf/tcp: Remove spdk_nvmf_tcp_qpair_process_pending
Phenomenon:
Test case:  Using the following command to test
./test/nvmf/target/shutdown.sh --iso --transport=tcp
without this patch, it will cause coredump.
The error is that the NVMe/TCP request in data buffer
waiting list has "FREE" state.

We do not need call this function in
spdk_nvmf_tcp_qpair_flush_pdus_internal, it causes the
bug during shutdown test since it will call the function
recursively, and it does not work for the shutdown path.

There are two possible recursive calls:

(1)spdk_nvmf_tcp_qpair_flush_pdus_internal ->
spdk_nvmf_tcp_qpair_process_pending ->
spdk_nvmf_tcp_qpair_flush_pdus_internal ->
>..
(2) spdk_nvmf_tcp_qpair_flush_pdus_internal->
pdu completion (pdu->cb)
->..
-> spdk_nvmf_tcp_qpair_flush_pdus_internal.

And we need to move the processing for NVMe/TCP requests
which are waiting buffer in another function to handle
in order to avoid the complicated possbile recursive
function calls. (Previously, we found the simliar
issue in spdk_nvmf_tcp_qpair_flush_pdus_internal for
pdu sending handling)

But we cannot remove this feature,
otherwise, the initiator will hang for waiting the
I/O. So we add the same functionality in spdk_nvmf_tcp_poll_group_poll
function.

Purpose: To fix the NVMe/TCP shutdown issue.
And this patch also reables the test for shutdown and bdevio.

Change-Id: Ifa193faa3f685429dcba7557df5b311bd566e297
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462658
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-07-26 21:16:23 +00:00
Ziye Yang
9375616ae2 nvmf/tcp: code cleanup
move the staement location of TCP request setting and remove
the duplicated code.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ia659756185547ff4f8aa26c5bc01f63defe6c113
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/462589
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-22 02:40:35 +00:00
Ziye Yang
6ad6a1131b nvmf/tcp: Add a feature to allow set the sock priority of the connection.
This priority is used to differentiate the sock priority on the TCP connections
between  NVMe-oF TCP target and other TCP based applications.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I6ee294e647420b56d1d91a07c2e37bf34ce24e03
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461801
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-07-19 06:30:19 +00:00
Darek Stojaczyk
36ccca2c08 nvmf/tcp: switch to spdk_*malloc()
spdk_dma_*malloc() is about to be deprecated.

Change-Id: Ic42db528bbae4b3ca2e91cb9ac46def99ecb5f28
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459431
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-17 01:28:57 +00:00
Shuhei Matsumoto
7ee58b90e1 nvmf/tcp: Set DIF context to PDU when processing in-capsule, C2H, or H2C data
Set DIF context of the corresponding request to PDU when
- processing in-capsule data of the command,
- processing data of C2H PDU, or
- processing data of H2C PDU.

Change-Id: I3a668a55be21dbe2ee6ecf26476290670bd7b4a8
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458929
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
2019-07-11 05:30:28 +00:00
Shuhei Matsumoto
e3e023cfd3 nvmf/tcp: Increase in-capsule buffer size to fill DIF fields
When NVMe/TCP initiator transfers in-capsule data, NVMe/TCP has to
process it as in-capsule data. If DIF insert/strip is enabled,
in-capsule data size will be increased by NVMe/TCP target to insert
metadata. However size of in-capsule data buffer had not been
increased, and buffer overflow occurred when NVMe/TCP initiator
transfers in-capsule data to NVMe/TCP target with DIF insert/strip
being enabled.

This patch increases size of in-capsule data buffer size to store
metadata. 16 byte metadata per 512 byte data block is the current
maximum ratio of metadata per block.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I88b127efd7a945bde167a95df19a0b9175cb8cd0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461333
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-07-11 05:30:28 +00:00
Shuhei Matsumoto
9d4ee5f344 nvmf/tcp: Fix wrong data offset in nvmf_tcp_pdu_payload_insert_dif
We updated readv_offset before generating DIF to avoid adding
the temporary variable _rc in the previous patch, but that caused
write error when inserting DIF.

Fix the bug in this patch.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id0788280a83cbea2554c851db77751432fc00cba
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461116
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-11 05:30:28 +00:00
Shuhei Matsumoto
2c9b0af271 nvmf/tcp: Get DIF context when handling capsule command header
When handling the capsule command header, call spdk_nvmf_request_get_dif_ctx
by passing the NVMf request and the reference to the DIF context, and set
the flag dif_insert_or_strip of the NVMf/TCP request to true.

spdk_nvmf_request_get_dif_ctx returns false immediately when the
corresponding NVMf controller disables DIF insert/strip.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I16f6b322f2692d5f9653d011a490e7929ec37365
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458928
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-07-11 05:30:28 +00:00
Ziye Yang
960460f0d1 nvmf: add spdk_nvmf_transport_get_optimal_poll_group
Add the optimal poll group get function.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ia9e57c6924a6563d79269cf535814883e83698cd
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454549
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-10 02:30:41 +00:00
Shuhei Matsumoto
aa322721cb nvmf: Add dif_insert_or_strip to transport options
This is a place holder and subsequent patches will use the option
dif_insert_or_strip and provide JSON RPCs to configure it.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7e3fbb1d49c47647a9a0a1a2149152801591b283
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456452
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-10 00:43:02 +00:00
Shuhei Matsumoto
7bfbc388d7 nvmf/tcp: Pass extended LBA based length as I/O length to NVMf controller
When DIF is inserted or stripped,
- in the TCP transport layer, we can use LBA based length throughout, but
- in the NVMf controller layer and BDEV layer, extended LBA based
  length must be used, and NVMf controller gets the length from
  tcp_req->req.length.

Hence by adding and using two variables, elba_length and orig_length
to struct spdk_nvmf_tcp_req, set the extended LBA length to
tcp_req->req.length before calling spdk_nvmf_request_exec(), and then
restore the original LBA based length to tcp_req->req.length after
calling spdk_nvmf_tcp_req_complete().

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9309b8923c6386644c4fd8ef3ee83a19f5d21ce5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458926
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-07-09 03:39:25 +00:00
Shuhei Matsumoto
51b643648c nvmf/tcp: Increase buffer to insert/strip DIF in spdk_nvmf_tcp_req_parse_sgl
If tcp_req->dif_insert_or_strip, increase the length from LBA based
to extended LBA based by using its own DIF context.

Change-Id: Ie9f5cf757328dda795b43a7b6c70a72259865115
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458925
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-09 03:39:25 +00:00
Shuhei Matsumoto
536bd70eb4 nvmf/tcp: Use cached length variable in spdk_nvmf_tcp_req_parse_sgl
The next patch will extend the length from LBA based to extended
LBA based and use it as buffer length to insert or strip DIF.

So cache sgl.unkeyed.length at the top of spdk_nvmf_tcp_req_parse_sgl
and use it throughout.

Besides, one unrelated change-the-line to improve the readability
is included.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2a1dc9379bb5671ec80b5b478504c9879a4f0fff
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458924
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-09 03:39:25 +00:00
Shuhei Matsumoto
975239c29d nvmf/tcp: Insert DIF to the newly read data to create extended LBA payload
Generate and insert DIF to each data block when reading more than a single
byte.

This update is very similar with the use case of spdk_dif_generate_stream
in iSCSI target.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I063919a32153ac0daf6d6eb1836c0d5995b65d33
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459092
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-07-09 03:39:25 +00:00
Shuhei Matsumoto
8448adaefa nvmf/tcp: Verify DIF before sending C2H data in spdk_nvmf_tcp_send_c2h_data
If DIF mode is local and C2H data is extended LBA payload, DIF should
be verified just before sending the payload.

Add a helper function nvmf_tcp_pdu_verify_dif and call it in
spdk_nvmf_tcp_send_c2h_data after completing nvme_tcp_pdu_set_data_buf.

When nvmf_tcp_pdu_verify_dif returns error, treat the error as fatal
transport error because the error is caused by the target itself.

Handle the fatal NVMe/TCP transport error by terminating the connection
as described in the NVMe specification.

On the other hand, data digest error is treated as a non-fatal transport
error because the error is caused outside the target. This is reasonable.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9680af2556c08f5888aeaf0a772097e4744182be
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458921
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-08 03:33:07 +00:00
Ziye Yang
57efada508 nvmf/tcp: reorg the structure of struct spdk_nvmf_tcp_req
I used pahole to see whether the alignment of the structure
is reasonable. After reorgnization, we can saved 16 bytes and 1
cacheline according to the information by pahole.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I1347e7c582fe2b00707e2841690b87d53cc61e33
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460572
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-07-05 04:18:41 +00:00
Shuhei Matsumoto
3ff1ff004e nvme/tcp: Minor cleanups for SGL operations
Using naming rules consistent with other related libraries is helpful
to ensure the quality as verified by this patch series.

This patch changes a few parts to use iov and iovcnt for SGL operations.
Besides, name of an array points to the head of the array and is
constant. So copying name of array to an another pointer is not
necessary and can be removed.

Change-Id: I2324f28126b3088098c1c767cf6c060f22c175c3
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455629
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
2019-07-04 08:58:40 +00:00
Shuhei Matsumoto
127cfac020 nvmf/tcp: Use nvme_tcp_pdu_set_data_buf for incapsule data
Previously we had used nvme_tcp_pdu_set_data() for incapsule data.
This patch changes handling incapsule data to use
nvme_tcp_pdu_set_data_buf() as same as H2C and C2H.

This unification is necessary to support DIF insert and strip
in NVMe/TCP target later.

Change-Id: I02cae8db94e51cf79a354dd64ad45f0e491ec08e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455920
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
2019-07-04 08:58:40 +00:00
Shuhei Matsumoto
3184884f9d nvmf/tcp: Properly handle multiple iovecs in processing H2C and C2H
NVMe/TCP target had assumed the size of each iovec was io_unit_size.
Using nvme_tcp_pdu_set_data_buf() instead removes the assumption
and supports any alignment transparently.

Hence this patch moves nvme_tcp_pdu_set_data_buf() to
include/spdk_internal/nvme_tcp.h and replaces the current code to use it.

Besides, this patch simplifies spdk_nvmf_tcp_calc_c2h_data_pdu_num()
because sum of iov_len of iovecs is equal to the variable length now.

We cannot separate code movement (lib/nvme/nvme_tcp.c to include/
spdk_internal/nvme_tcp.h) and code replacement (lib/nvmf/tcp.c)
because moved functions are static and compiler give warning if
they are not referenced in lib/nvmf/tcp.c.

The next patch will add UT code.

Change-Id: Iaece5639c6d9a41bd35ee4eb2b75220682dcecd1
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455625
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-04 08:58:40 +00:00
Ziye Yang
b09bd95ad3 sock: update spdk_sock_group_add_sock
And also add spdk_sock_group_get_ctx function

Change-Id: I2a2a58b0588ff7d99d3538ea0a633a3b8c7a234b
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454538
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
2019-07-04 08:21:05 +00:00
Ziye Yang
cdc0170c1b nvmf/tcp: Add a maximal PDU loop number
In our previous code, we will handle all the PDU until there is
no incoming data from the network if we can continue the loop.
However this is not quite fair when we handling multiple connections
in a polling group.

And this change is setting a maximal NVME/TCP PDU we can handle
for each conneciton, it can improve the performance. After some
tuing, 32 should be a good loop number. Our iSCSI target uses
16.

The following shows some performance data:

Configuration:
1 Command used in the initiator side:
./examples/nvme/perf/perf -r 'trtype:TCP adrfam:IPv4 traddr:192.168.4.11 trsvcid:4420'
-q 128 -o 4096 -w randrw -M 50 -t 10

2 target side, export 4 malloc bdev in a same subsystem

Result:

Before patch:

Starting thread on core 0
========================================================
                                                                                                           Latency(us)
Device Information                                                    :       IOPS      MiB/s    Average        min        max
TCP  (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0:   51554.20     201.38    2483.07     462.31    4158.45
TCP  (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0:   51533.00     201.30    2484.12     508.06    4464.07
TCP  (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0:   51630.20     201.68    2479.30     481.19    4120.83
TCP  (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0:   51700.70     201.96    2475.85     442.61    4018.67
========================================================
Total                                                                 :  206418.10     806.32    2480.58     442.61    4464.07

After patch:
Starting thread on core 0
========================================================
                                                                                                           Latency(us)
Device Information                                                    :       IOPS      MiB/s    Average        min        max
TCP  (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0:   57445.30     224.40    2228.46     450.03    4231.23
TCP  (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0:   57529.50     224.72    2225.17     676.07    4251.76
TCP  (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0:   57524.80     224.71    2225.29     627.08    4193.28
TCP  (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0:   57476.50     224.52    2227.17     663.14    4205.12
========================================================
Total                                                                 :  229976.10     898.34    2226.52     450.03    4251.76

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I86b7af1b669169eee2225de2d28c2cc313e7d905
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459572
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-06-28 12:28:54 +00:00
Or Gerlitz
6629202cbd nvmf/tcp: Use the success optimization by default
By now (5.1 is released), the Linux kernel initiator supports the
success optimization and further, the version that doesn't support
it (5.0) was EOL-ed. As such, lets open it up @ spdk by default.

Doing so provides a notable performance improvement: running perf with
iodepth of 64, randread, two threads and block size of 512 bytes for 60s
("-q 64 -w randread -o 512 -c 0x5000 -t 60") over the VMA socket acceleration
library and null backing store, we got 730K IOPS with the success
optimization vs 550K without it.

IOPS           MiB/s    Average       min      max
549274.10     268.20     232.99      93.23 3256354.96
728117.57     355.53     175.76      85.93   14632.16

To allow for interop with older kernel initiators, we added
a config knob under which the success optimization can be
enabled or disabled.

Change-Id: Ia4c79f607f82c3563523ae3e07a67eac95b56dbb
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/457644
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-06-26 06:24:03 +00:00
Ziye Yang
0bb626672b nvmf/tcp: Support single r2t usage
According to the TP 8000 spec in Page 26:
Maximum Number of Outstanding R2T (MAXR2T): Specifies the maximum
number of outstanding R2T PDUs for a command at any point in time
on the connection.

Note that by the spec, the target may only support single r2t
(which is the minimum possible), it doesn't have to use multiple r2ts
even if the initiator supports that. So remove the maxr2t and
pending_r2t variable in the tcp qpair structure.

In the original design, we think that maxr2t is the maximal active
r2t numbers for each connection. So if the initiator sends out maxr2t=16,
it means that all the commands of a qpair can use such number of R2T pdus.
So we need to wait for the available R2Ts for the request when the maxr2t
reaches the maximal value. But it is the wrong understanding of the spec.

In fact, each command has its own number of maximal r2t numbers, then we
do not need to use the wait method for R2T method anymore. So we remove
the state TCP_REQUEST_STATE_DATA_PENDING_FOR_R2T. Futhermore, we adjust
the related SPDK_TPOINT_ID definition.

In current patch, the target will support one active R2T for each
write NVMe command. Thus, we remove the function spdk_nvmf_tcp_handle_queued_r2t_req.

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I7547b8facbc39139b4584637ccc51ba8b33ca285
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455763
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-06-05 16:46:55 +00:00
Jim Harris
f758598c44 nvmf: fix assert in spdk_nvmf_tcp_req_fill_iovs
It's OK for iovcnt to equal SPDK_NVMF_MAX_SGL_ENTRIES.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic95d04f5667858e7fbb025f469c027e2d47b8ba1

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456111
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-05-31 14:46:35 +00:00
Jim Harris
bf647c168a nvmf: increase default max num qps to 128
This matches the Linux kernel target.  Users can
still decrease this default when creating the
transport (i.e. -p option for nvmf_create_transport
in rpc.py).

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icad59350a2cd35cfc4ad76d06399345191680c05

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454820
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 14:50:05 +00:00
Jim Harris
b6206d657c trace: shorten max name from 44 to 24 characters
This restriction helps reduce the amount of padding when
printing out the event trace, allowing it to fit in a
small number of columns.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifa31e5a6967c7b9bc7028069effb71533f80596f

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452736
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-02 08:41:56 +00:00
Jim Harris
617184be3b trace: remove short_name
This was not used by any of the trace register descriptions.
Let's remove it rather keeping it around if we don't need it.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idda809e2911db5be555ff6aa13695484a14bf665

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452734
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2019-05-02 08:41:56 +00:00
Seth Howell
33f60621af lib: resize key mempools
Mempools are based off of a ring structure which allocates its elements
as a power of two. It also only exposes n-1 elements to the user. So
when we create a mempool with 2^n elements in it, we have to allocate a
ring with 2^n+1 entries. By decreasing the number of elements in these
key mempools by 1, we can save a decent amount of memory.

Change-Id: I942c9dd4cf59096969bc2559fb46fd2084a07f09
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448875
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-01 17:45:29 +00:00
Jim Harris
b92c3d412d nvmf: add tcp trace points for data read from socket
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib04abb64dd379dd73c7ff3c8318591124b4bb7dd

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451477
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-04-23 17:59:23 +00:00
Jim Harris
4ff7949893 nvmf: remove unused tcp trace point
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8f2e26f46f8c37312c3201df8210b449279640d0

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451476
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-04-22 15:25:37 +00:00
Ziye Yang
4ee4023a0d nvme/tcp: Replace the data with iov in pdu struct
Purpose: To support the multiple SGL later.

Change-Id: I133a451100b736353cf98a6aaca879d290ff5b67
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448259
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-04-04 14:28:09 +00:00
Ziye Yang
8f3b4a3a6d nvme/tcp: Add a helper function nvme_tcp_pdu_set_data
This function will be exteneded later for multiple SGL
support.

Change-Id: I1f6962ec03c72e335efaa311a12d3891312fcc53
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449968
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-04-04 04:50:04 +00:00
Ben Walker
527be2bf4e nvmf: Remove qpair_is_idle
This wasn't used anywhere.

Change-Id: I405af3c808be284d19218f3f04c1e90e33e31de8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446977
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
2019-03-15 19:19:17 +00:00
Ziye Yang
58739014a3 nvmf/tcp: use the nvme_tcp_readv_data
The purpose is to use the single readv to read both
the payload the digest(if there is a possible one).

And this patch will be prepared to support the
multiple SGL in NVMe tcp transport later.

Change-Id: Ia30a5e0080b041a65461d2be13db4e0592a70305
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447670
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-03-13 14:29:17 +00:00
Ziye Yang
791d89bfa7 nvme/tcp: optimize nvme_tcp_build_iovecs function.
Borrow the ideas from iSCSI and optimize
the nvme_tcp_build_iovecs function.

Change-Id: I19b165b5f6dc34b4bf655157170dec5c2ce3e19a
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446836
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-03-07 02:59:33 +00:00
Ziye Yang
5f3c92c2fd nvmf/tcp: fix the space alignment issue in spdk_nvmf_tcp_qpair
Change-Id: Ieedfb46cadc8610ca8a6c33372e3a82ae8052550
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/446477
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-01 04:43:40 +00:00
Ziye Yang
2da86de69f nvmf/tcp: fix error message printing in spdk_nvmf_tcp_qpair_set_recv_state
If the current recv_state of qpair is same with the state to be set,
we will print error message. And checked the current code,
we should add a check to avoid this.

Change-Id: I49334f637c48e565e785d1fe6d0f000e18b2048a
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/c/445653
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-21 18:04:10 +00:00
Ziye Yang
a1c5442d16 nvmf/tcp: remove the tqpair->group = NULL statement
Purpose: solve the coredump issue for the buffer
return later in spdk_nvmf_tcp_request_free_buffers.

If keep this statement, we cannot return the buffer
to the polling group.

Change-Id: Ib5c95ba54b37540950e654110fe6317cab507076
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/445435
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-02-21 03:37:47 +00:00
Ziye Yang
2d0ce5b48b nvmf/tcp: Implement correct behavior of timeout for C2Htermreq case
From TP8000 spec 7.4.7,

"In response to a C2HTermReq PDU, the host shall terminate the connection.
If the host does not terminate the connection in an implementation specific
period that does not exceed 30 seconds, the controller may terminate the
connection on its own".

It means that the timeout is designed for: when the target is
sending out C2hTermReq, if the host does not terminate the connection,
the target should terminate the connection.

PS: For detecting the malicous connection without sending response
(such as no response of R2T PDU) which should be another patch.

Change-Id: I586dbb235d99aeab5d748a19b9128cd8b0cef183
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/440831
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-13 18:20:28 +00:00
Seth Howell
b7651b681c NVMe-oF: add asserts for SGE counts
We should never be going over these limits in the respective transports,
but add asserts to check this during testing.

Change-Id: Ifcaa82ccf58546a38020b31df54ee5d1d9822b8b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442777
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 23:34:20 +00:00
Ben Walker
7a4d6af182 nvmf/tcp: Stay in AWAIT_PDU_READY state until atleast 1 byte arrives
This doesn't fix any bug, but it makes more sense to leave the qpair
in the NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY state until it
receives at least one byte.

Change-Id: Ic5f34a733a80b58f65a1334fae7e07dbded2b3d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-02-08 16:35:12 +00:00
Ben Walker
63de221bf6 nvmf/tcp: Eliminate management channel in favor of poll group
The management channel was used in the RDMA transport prior
to the introduction of poll groups and made its way over to
the TCP transport when it was written. Eliminate it in favor
of just using the poll group.

Change-Id: Icde631dd97a6a29190c4a4a6a10a0cb7c4f07a0e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442432
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
2019-02-06 16:02:43 +00:00
Ben Walker
2b59852b65 nvmf/tcp: Rename nvme_tcp_qpair to spdk_nvmf_tcp_qpair
Naming consistency.

Change-Id: Ia044a41fa9939c17b52d306c2a053ffc56f03d56
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442441
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-04 16:24:00 +00:00
Ben Walker
55e12a6cdb nvmf/tcp: Remove tqpair pointer from pdu
This was only used by the target, and it didn't actually need it.

Change-Id: Ibcef410165efdc16077da24419580ed51b087d70
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442440
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-04 16:24:00 +00:00
Ben Walker
c57bafed51 nvmf/tcp: Rename nvme_tcp_req to spdk_nvmf_tcp_req
Naming consistency.

Change-Id: I9a5ca6fb22fd80f818c4e2223a90af4257140fac
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442439
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-04 16:24:00 +00:00
Ben Walker
d3e3f7622b nvmf/tcp: Remove forward declaration of nvme_tcp_req from nvme_tcp.h
This type was actually two entirely different types for
the initiator and the target, so just make it void.

Change-Id: I15512d9d4efd790dce0fa4323b7230de66144bc6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442438
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-04 16:24:00 +00:00
Ben Walker
2d07fa1532 nvmf/tcp: Rename spdk_nvme_tcp_term_req_fes_str
Switch nvme to nvmf

Change-Id: Ibc2540018b7f6d062d2ad6c4ffa8337b94d22614
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442436
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-04 16:24:00 +00:00
Ziye Yang
81faea1b2d nvmf/tcp: remove the timeout handling code
Currently, the code does not comply with the spec,
so remove such code for 19.01 and will add the code
which complies with the spec for 19.04

Change-Id: Icd3b2573fbc46dc2fa7a00c6672c23ea01ffe0ee
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/441985
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-01-25 16:38:13 +00:00