This allows users to swap out SPDK's third party
libraries for an implementation based on their own
framework.
Change-Id: Ia0b7384ce5e31acba5ad0d7002dec9e95b759c52
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The offset variable is used to store the result of a uint64_t * uint32_t
multiplication; a signed integer is not the correct type for the result.
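A minimal sketch of the type issue, assuming a hypothetical helper named lba_to_byte_offset() (the name and signature are illustrative, not taken from the patch). The product of a uint64_t and a uint32_t is held in an unsigned 64-bit variable:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only. */
static uint64_t
lba_to_byte_offset(uint64_t lba, uint32_t block_size)
{
	/* block_size is promoted to 64 bits, so the product is computed in uint64_t. */
	uint64_t offset = lba * block_size;

	return offset;
}
```

Storing the same product in a signed int would discard the upper bits for any offset past 2 GiB.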
Change-Id: If1fb22314ba7e3cec91808cc051678f809c9e58b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This feature should only be used if clients are coordinating
with one another.
Change-Id: I89a437441a7e3fbcc1e5f6efa1c8e970ade7c2ec
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
We already require the assert header from the C standard library,
so use that instead of RTE_VERIFY to further isolate DPDK
dependencies.
Change-Id: I4a718af858c88aff6080e33e6c3dd533c077b8f4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
bdev and copy modules no longer have check_io functions;
all polling is done via pollers registered when
I/O channels are created.
Other default resources are also removed - for example,
a qpair is no longer allocated and assigned per bdev
exposed by the nvme driver - the qpairs are only allocated
via I/O channels. Similar principle also applies to the
aio driver.
ioat channels are no longer allocated and assigned to
lcores - they are dynamically allocated and assigned
to I/O channels when needed. If no ioat channel is
available for an I/O channel, the copy engine framework
will revert to using memcpy/memset instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I99435a75fe792a2b91ab08f25962dfd407d6402f
I/O channels are not actually used for I/O yet, however.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaa3774ecacc7ec206c7c0c66e6b2f2d10c8fa785
Instead of polling for only 1 completion at a time,
poll for batches of 32.
Change-Id: I5ef99a270489e7b3d2a58cb765915f187775a93e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Purpose: To make the function definition style consistent
Change-Id: I7ade943881aa5076fdd419958e386ae3c3661da6
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
1 In our nvmf tgt implementation, we use the async
mode to delete the nvmf subsystem. However, when
we parse the nvmf subsystem, we need to use the sync
function to delete it: if there is an error, we
call spdk_app_stop, so the async functions will
never be executed. Verified in local testing.
2 Add debug info in spdk_nvmf_delete_subsystem
Change-Id: Ia8ecd6eee1bbd25cb3e1ceeb0e2146f3f03be228
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
ibv_poll_cq is actually an expensive call to make, so take
steps to begin to minimize the number of times it is called.
Change-Id: I6fc64979604220eb8cacd612b46e3a3b1bca0924
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This matches the general order (LBA start then LBA count) for
the NVMe API.
While here, fix a copy/paste error in a debug message (write
instead of writev).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ice326af5d6025867dffed4d1f6c7b81fb9eba5eb
Set status code to invalid opcode when opcode is not supported
in nvmf_process_discovery_cmd.
Change-Id: Ibab8097e536f26f16c322d5f539277688906cfc3
Signed-off-by: Liang Yan <liang.z.yan@intel.com>
The spec does not define NQNs as case-insensitive, so replace the
strcasecmp() matching of NQNs with strcmp().
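As a small illustration of the difference, assuming a hypothetical helper nqn_equal() and made-up NQN values, two NQNs that differ only in case no longer match:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: NQNs match only if they are byte-for-byte equal. */
static bool
nqn_equal(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```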
Change-Id: I5946d9ee8e1d0aa5966e9b1b3c6f14f3f5119aec
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
1 Rename this function to make it more meaningful, since
we have spdk_nvmf_session_connect which is used to link a
connection to the session
2 Split spdk_nvmf_session_destruct.
Change-Id: I150df7ccdf4de3428d8cecbb286d5f7944510a8c
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Fix copy-and-paste errors - when polling the recv CQ, we should print
"Recv" instead of "Send" in log messages.
Signed-off-by: Roland Dreier <roland@purestorage.com>
This can just directly assign the completion instead
of calling memcpy.
Change-Id: I07819c824eba45245b00fa3538a99bc81bcb9fcc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This function always shows up as one of the hottest functions when
profiling. I believe it is the memset that is expensive, so instead
use default initialization when the wr is declared on the stack
and just set the members that need to be updated in the function.
Also make the function inline for good measure.
Change-Id: I29e24cdd375311fa033b5a6df772ff4f73e35302
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
We need to free the session resources if an error occurs
while creating a new session.
Change-Id: I7c4f3e779e0b30e213e02b8676d93bd2fe9bf851
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The application is now entirely responsible for scheduling subsystem
pollers and sending events between threads.
Change-Id: I88da1f53b5e8852c7c4acd6f0a7a1e2219fbed41
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reason: In acceptor_poller_unregistered_event, we
directly call spdk_nvmf_check_pools and spdk_app_stop,
which fails the memory check.
Also, nvmf_delete_subsystem_poller_unreg will
not be called, since we have already called spdk_app_stop.
Change-Id: I3ffa30c87b149a66cee1d87d1bb81d4dc8cc96b9
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The "+" is not correct; it should be "-". Currently,
the issue does not happen since the offset is 0,
so both + and - give the same result. But if we adjust
the location of spdk_nvmf_conn or spdk_nvmf_request,
the bug will appear.
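The expression itself is not quoted in this message. Assuming it is the usual "recover the enclosing structure from a pointer to an embedded member" computation, a sketch with hypothetical structures (struct inner, struct outer) shows why the sign matters once the offset is non-zero:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structures: an outer object that embeds an inner one. */
struct inner {
	int id;
};

struct outer {
	int pad;          /* Gives `in` a non-zero offset, so a wrong sign is visible. */
	struct inner in;
};

/* Recover the enclosing struct by subtracting the member offset. */
static struct outer *
outer_from_inner(struct inner *in)
{
	return (struct outer *)((char *)in - offsetof(struct outer, in));
}
```

With the member at offset 0, adding or subtracting the offset yields the same pointer, which is why the bug stays hidden.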
Change-Id: Ib358dc729da901a69442d0402a6089989f49b05c
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Check that the number of blocks/ranges in the command fits within the
length specified by the SGL.
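A sketch of such a check for Dataset Management, assuming a hypothetical helper dsm_ranges_fit(). Each range entry is 16 bytes in the NVMe spec and the Number of Ranges field is 0-based; the helper name and parameters are illustrative only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of one Dataset Management range entry as defined by the NVMe spec. */
#define DSM_RANGE_SIZE 16u

/*
 * Hypothetical check: nr is the 0-based Number of Ranges value from the
 * command, sgl_length is the data length described by the SGL.
 */
static bool
dsm_ranges_fit(uint8_t nr, uint32_t sgl_length)
{
	uint32_t num_ranges = (uint32_t)nr + 1;

	return num_ranges * DSM_RANGE_SIZE <= sgl_length;
}
```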
Change-Id: I21aded797dc1f1e752fe0bc9cec27310a4fb106a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The Dataset Management command allows several operations to be specified
at once; the virtual controller only supports deallocate for now, but it
should just ignore the other bits in order to be spec compliant: "If the
Dataset Management command is supported, all combinations of attributes
[...] may be set".
The spec also explicitly states that it is acceptable for controllers to
choose to take no action based on information provided, so not
implementing the other attributes is fine.
Change-Id: Ia989dc1faa9c852660bf1299ea18fa8e7bdf4053
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Also add a diagnostic message if the requested log page ID is not
supported.
Change-Id: I7551b5905d5ebc29356839f0f9153dc86f237106
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rather than comparing the bdev name against "NVMe", use the new I/O type
supported API to query whether the unmap operation is supported.
Change-Id: I62c7a1ea5529366ff2ae4723b62f24ea78aa8193
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move the NQN validation into the subsystem creation function, and fix the
allowed size to match the spec.
The spec is not clear about the allowed NQN size; for now, interpret it
as 223 bytes, including the null terminator (222 bytes of actual NQN
plus one terminator byte).
Change-Id: If9743ab2fe009d9d852e8b03317d9b38d8af18dc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
SUBNQN is a UTF-8 null terminated string according to the NVMe base
spec, so pad it with zeroes using strncpy().
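A minimal sketch of the padding behaviour, assuming a hypothetical helper fill_subnqn() and the 256-byte SUBNQN field of Identify Controller. strncpy() writes zero bytes after the copied string up to the given length:

```c
#include <assert.h>
#include <string.h>

#define SUBNQN_SIZE 256 /* Size of the SUBNQN field in Identify Controller. */

/* Hypothetical helper, for illustration only. */
static void
fill_subnqn(char subnqn[SUBNQN_SIZE], const char *nqn)
{
	/* strncpy() zero-fills the rest of the destination after the string. */
	strncpy(subnqn, nqn, SUBNQN_SIZE - 1);
	/* Guarantee termination even if nqn is too long for the field. */
	subnqn[SUBNQN_SIZE - 1] = '\0';
}
```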
Change-Id: I486161b26d91f3ea1fd17428e220b9f20a874732
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These are specified as "ASCII string", which means they should be
left-aligned and padded with spaces, according to the NVMe base
specification.
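A sketch of the space-padded form, assuming a hypothetical helper fill_ascii_field() and a made-up serial number value in the usage below. Unlike SUBNQN, these fields are not null terminated:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into a fixed-size ASCII field, left-aligned,
 * filling the remainder with spaces. The field is not null terminated.
 */
static void
fill_ascii_field(char *dst, size_t dst_size, const char *src)
{
	size_t len = strlen(src);

	if (len > dst_size) {
		len = dst_size; /* Truncate strings that do not fit. */
	}
	memcpy(dst, src, len);
	memset(dst + len, ' ', dst_size - len);
}
```

For example, fill_ascii_field(sn, 20, "SPDK01") leaves "SPDK01" followed by 14 spaces in a 20-byte field.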
Change-Id: I25babe0ca417c2e16137b0bfc41fc7834277114e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Clean up the poller and only then free the associated subsystem's
memory. This prepares for future dynamic subsystem creation/deletion.
Change-Id: I9e56cbf8822814930fdbb662095c51b6ad40fbc4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently the NVMf target listens for new connections on any address.
Instead, listen only on the addresses specified by the user.
Change-Id: Idb6d37c422e442fc70a8673bd3fcfb9c27b57828
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Use the event framework's new delay parameter to allow
for idle cores to sleep for up to 1ms at a time.
Change-Id: I665f38e590c07338418892afe0e75b0b2c79706e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
It is no longer needed, since the nvmf_tgt app handles initialization
and shutdown.
Change-Id: I051afe2b4fcbd09b32998386c63f591a0ab343c2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This will be used in future patches outside the library.
Change-Id: I1fcf5709944a884e161e5a6a9eaec033a995a812
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The NVMe over Fabrics target library now exposes a simple function call
that polls the acceptor once, and the application handles registration
of the poller.
Also rename the transport function pointers related to the acceptor so
they better reflect their purpose.
Change-Id: I5fa0d516586bf17e73afeb88ff3c2d5b0d46794d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This will become more important when other transports are added.
For now, it is also useful to be able to start nvmf_tgt on systems
without RDMA hardware.
Change-Id: I6b9002cc7711f928c4e6b73adcd9b677349ebdd6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
spdk_shutdown_nvmf_subsystems() was removing the subsystem from the
list, but nvmf_delete_subsystem() also wants to remove it, so drop the
extra removal.
Also rewrite the shutdown loop as a TAILQ_FOREACH_SAFE() to make the
static analyzer happy (and make it more obvious that the loop will
terminate).
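The loop shape can be sketched as follows, assuming a hypothetical list element type struct subsys and helper delete_all(). Some libc versions of sys/queue.h do not provide TAILQ_FOREACH_SAFE, so the sketch defines a fallback:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Fallback for sys/queue.h versions that lack the _SAFE variant. */
#ifndef TAILQ_FOREACH_SAFE
#define TAILQ_FOREACH_SAFE(var, head, field, tvar)           \
	for ((var) = TAILQ_FIRST((head));                        \
	     (var) && ((tvar) = TAILQ_NEXT((var), field), 1);    \
	     (var) = (tvar))
#endif

/* Hypothetical stand-in for a subsystem list entry. */
struct subsys {
	int id;
	TAILQ_ENTRY(subsys) entries;
};

TAILQ_HEAD(subsys_list, subsys);

/* Remove and free every entry; returns the number of entries deleted. */
static int
delete_all(struct subsys_list *list)
{
	struct subsys *s, *tmp;
	int count = 0;

	TAILQ_FOREACH_SAFE(s, list, entries, tmp) {
		/* Safe: the next pointer was saved before s is unlinked. */
		TAILQ_REMOVE(list, s, entries);
		free(s);
		count++;
	}
	return count;
}
```

Each element is visited exactly once, so termination does not depend on the loop body emptying the list.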
Change-Id: Iccadafa77d9cd3e26be21c0f11e62cfc1ef0197c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Verify that the record format is the one we support (only 0 is defined
by the spec for now).
Change-Id: Iddf038b381e540134abf572e0545c97a0ef71d5f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The spec requires that NQNs are null terminated and at most 223 bytes
long, despite the Connect command fields being larger (256 bytes), so
add checks for both subsystem NQN and host NQN before using them as null
terminated strings.
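A sketch of such a check, assuming a hypothetical helper nqn_field_is_valid() and treating the 223-byte limit as including the null terminator:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NQN_FIELD_SIZE 256 /* Size of the NQN fields in the Connect data. */
#define NQN_MAX_SIZE   223 /* Assumed maximum NQN size, including the terminator. */

/* Hypothetical check: true if a terminator appears within the first 223 bytes. */
static bool
nqn_field_is_valid(const char field[NQN_FIELD_SIZE])
{
	return memchr(field, '\0', NQN_MAX_SIZE) != NULL;
}
```

Only after this check passes is it safe to hand the field to functions such as strcmp() or strlen().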
Change-Id: I343d9e44a09ab4d0f6654feba460b31e976c4e56
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Users can specify the core for each subsystem and for the acceptor listen
routine, so that they can run on different cores for better performance.
Change-Id: I4bd1a96f39194c870863b4b778e6ea7cf8fc1a2d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
This is causing issues during shutdown because the poller removal is not
synchronized with the rest of the cleanup path.
This reverts commit 7dfc5e922d.
Change-Id: If95c4b72c5d120f18bdc3db6d7d532ad1aada642
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This should enhance performance, since the hardware admin queue poll
function takes a mutex and should not be in the performance path.
Change-Id: I7e4acde0337aaf7079811612cba5348acf0a467d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This leaves more flexibility for future changes to the poller
representation without requiring API changes (after this one).
It also prevents the user from accidentally using poller fields in a
non-thread-safe way, since they can't be accessed directly anymore.
Change-Id: I7677d5b93668665d29ae39c5e0ba74333ad3f878
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The NVMe submission queue head wraparound point can be determined in the
generic NVMe over Fabrics layer; it should not be using the RDMA
connection queue depth.
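A minimal sketch of the wraparound, assuming a hypothetical helper sq_head_advance() that takes the submission queue size as a parameter; which value the generic layer uses for that size is not stated here:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: advance the submission queue head, wrapping at sq_size. */
static uint16_t
sq_head_advance(uint16_t sq_head, uint16_t sq_size)
{
	sq_head++;
	if (sq_head == sq_size) {
		sq_head = 0;
	}
	return sq_head;
}
```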
Change-Id: I9da8f09e4f057f8fdc1ff4c6cc5f48cea7123e11
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Report the maximum admin queue size correctly.
Change-Id: I52cad654bf59806e0abb8d869c22973647056617
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the max_queue_depth parameter rather than rdma_conn->max_queue_depth
so that we can start to eliminate rdma_conn->max_queue_depth.
Change-Id: I1670c634e6d12aa004fb5a10338b7624850fbc4a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There were two unchecked allocations in the nvmf library. Check
for allocation failures.
Change-Id: Ic6b3104d825dba1ee6bd1748fa99e132702f300c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This fixes a static analysis warning for unsigned/signed
mismatch.
Change-Id: I49bd8d6d195f13b402e14a85503a5de6114f5b7f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The large buffer pool allocation was using the per-connection queue
depth, whereas the RDMA memory region registration was using the global
RDMA max queue depth. These sizes need to match, so use the global RDMA
max queue depth for both calls.
Change-Id: Iae161b719e09e19ca3e81df6593b68a4a2e86614
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the new timer-based poller functionality to replace rte_timer.
Change-Id: Ic40653306cc73b40139fe18e06bab29b35721a43
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Allow pollers to be scheduled to be run periodically every N
microseconds instead of every iteration of the reactor loop.
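The scheduling idea can be sketched as follows. struct poller and reactor_iteration() are hypothetical and do not reflect the event framework's actual data structures; time is passed in explicitly to keep the sketch self-contained:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical poller representation, for illustration only. */
struct poller {
	uint64_t period_us;   /* 0 means run on every reactor iteration. */
	uint64_t next_run_us; /* Earliest time of the next run. */
	unsigned runs;        /* Number of times the poller has run. */
};

/* One reactor iteration at time now_us: run the poller only if it is due. */
static void
reactor_iteration(struct poller *p, uint64_t now_us)
{
	if (p->period_us != 0 && now_us < p->next_run_us) {
		return; /* Timed poller that is not due yet. */
	}
	p->runs++;
	p->next_run_us = now_us + p->period_us;
}
```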
Change-Id: Iaea3e98965d81044e6dc5ce5f406bcb7a455289e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
We report virtualized NVMe devices through the NVMe over Fabrics
specification as NVMe version 1.2.1. In direct mode, the NVMe device may
implement a lower version, such as 1.0, which does not support the identify
namespace list, so add a helper function to emulate such commands from the
initiator.
Change-Id: I226f4f34bf61017f538d2dd80332f1d054a501f1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Allow higher queue depths by allowing many more send/recv
operations than read/write.
Change-Id: I66c424a6463e5e09be6d5463667241ce9271404b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The target can only provide updates to sq_head inside
of completions. Therefore, we must update sq_head prior
to sending the completion or we'll incorrectly get into
queue full scenarios.
Change-Id: If2925d39570bbc247801219f352e690d33132a2d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This allows the target to poll for internal completions
at higher priority.
Change-Id: I895c33a594a7d7c0545aa3a8405a296be3c106fb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This ensures that the data buffers are not in use
when we go to send the completion.
Change-Id: I30467b3e3964001150f81b21e5b695dcd0974b0c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is useful for holding session-wide buffer pools.
Change-Id: I7024da24b210a2205bf1e159d5935e0093b81120
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
For small SGLs, even if they are keyed and not inline, use the
buffer we allocated for inline data.
Change-Id: I5051c43aabacb20a4247b2feaf2af801dba5f5a9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Read/Write depth is much lower than Send/Recv depth.
Calculate them separately to prepare for supporting
a larger number of receives than read/writes.
Currently, the target still only exposes a queue depth
equal to the read/write depth.
Change-Id: I08a7434d4ace8d696ae7e1eee241047004de7cc5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These don't actually work quite yet, but pipe the
configuration file data through to where it will
be needed.
Change-Id: I95512d718d45b936fa85c03c0b80689ce3c866bc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
For each connection, allocate a single buffer each
of requests, inline data buffers, commands, and
completions.
Change-Id: Ie235a3c0c37a3242831311fa595c8135813ae49e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This can be used to release requests that don't
require a completion to be sent.
Change-Id: I8fb932ea8569bf3c45342d9fa4e270af5510c60c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
PORT IDs indicate hardware failure domains according
to the NVMf specification, which means they should
indicate which transport addresses are on the same
NIC. Unfortunately, that doesn't really make sense for
IP-based fabrics because IP addresses can move. The
safest way to present this is to show all IP addresses
as part of different subsystem ports.
Change-Id: I056a50c69be70b4fbf1f896e684ce65bd792241e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The NVMe over Fabrics 1.0 spec corresponds to the NVMe base spec version
1.2.1, so we should pretend to be at least that new.
Change-Id: I36fc44c780de01d6c666e87b803cd47dba0e74c5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These belong in nvme_spec.h anyway and are not used.
Change-Id: I889dfebee523dc5ae503fd0370bb800f1d17fb5d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is a leftover from a previous controller numbering scheme that is
no longer used.
Change-Id: I3058802f0324b0e38708111634ee993c6e884087
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move the ctrlr and io_qpair out of spdk_nvmf_subsystem, package them
as a new data structure. Union the direct and virtual mode namespaces.
Change-Id: I839aee3372c6c57aa03a0be76f8aaeb5045ecdaf
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
CAP.CQR indicates whether contiguous queues are required; this is
meaningless in NVMe over Fabrics, since queue creation is handled
implicitly for each connection, but the spec requires it to be set to 1.
Change-Id: I6b05954eefa6928beecd7a640bbbdbd835c6b69a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the size of the applicable structs directly.
Change-Id: I4a65de548d409c9962b11a75d3fde2bfe434a3ec
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvmf_create_subsystem() already copies the name, so the strdup() in the
caller is unnecessary.
Change-Id: I225f0f077fee30051b197a4b1d7276b113ec6b01
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It isn't actually necessary to drain the cq before
destroying it.
Change-Id: I6f77ae578176a14b5de935274a14cfd165229ec5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This logically belongs inside the session handling code, not
in the transport-specific layer.
Change-Id: I93b2271f38dbfc742162c98c40acb153c7e9022a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Track and print out the currently outstanding I/O in debug
mode with rdma tracing enabled.
Change-Id: I0a1f0cd6e22dbf21e18ca0ec7d0c2c6d194509e3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Instead of reimplementing handling for checking the
completion queue, nvmf_rdma_accept can now call
the general purpose poller.
Change-Id: Id2c899d1e500a8cb8491e51cc101a1bf0e167764
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
AER breaks our current model of requests/completion pairs.
Temporarily handle it by immediately re-posting the
capsule while we work on a real solution.
Change-Id: Ie7a4d88030b6fff5a11c4697eec0f024f9737f27
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Inline this code into the places that called it. These two
spots will be combined into a single path in a later patch.
Change-Id: Ice2f009ad56b783dc28ebbf1abbb877ce6000293
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is an RDMA-specific operation, so hide it inside
the transport-specific layer.
Change-Id: Iaa097e8dde78d820547b3a39e9717c992581340b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These can be done at the same time now that the queue depth
is known ahead of time.
Change-Id: I7ecef30ebb4311e0a1c88f37461d34534f8600bf
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Calculate queue depth into a local variable without
touching the rdma_conn.
Change-Id: Ie804ed39ddecbf59015a4e4f7aa127f1381d9080
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Make sure the trace history that is exported via shared memory is always
the same size, regardless of DPDK configuration.
Also removes the necessity of including DPDK headers from spdk/trace.h
(so we have to fix up other files to include what they use).
Change-Id: I32f88921fd95c64a9d1f4ba768ae75e2ca5d91da
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is not currently configurable, but this will allow us to make the
discovery subsystem have config options (e.g. which lcore to run on).
Change-Id: I788a64ba4462b023453191e509ce8de59fd90ae4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is a much simpler approach and is only slightly
less efficient.
Change-Id: I909de376d576a74156c1be447e90e7dbc240f025
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Drop the redundant controller ready check.
nvmf_process_io_cmd() was checking CSTS.RDY, but this is not necessary,
since its only caller, spdk_nvmf_request_exec(), is already checking
CC.EN, which always matches RDY in our virtual controller
implementation.
The initialization of status is a dead store -
nvmf_complete_cmd() always writes the full response, and the only other
branch is the return immediately below the call, which also sets status.
Change-Id: I1ec2b8a225a91c4b2997d8ab4f45d050cc216de3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
No reason to use DPDK in this file just for an equivalent to assert().
Change-Id: Ic6932a16d0a36cd1a3cb25c8cc5e295c59f3e2db
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Temporarily set the in-capsule data size to the maximum data transfer
length. This should actually be updated by the transport layer, but for
now, the only transport (RDMA) supports the full bounce buffer size.
Also drop the check that prevents admin connections from using
in-capsule data; the host may send in-capsule data for the Connect on an
I/O queue, and we don't know the type of connection until after Connect
is processed.
Fixes: 828dca7 ("nvmf: Move some stray session init code to the right place")
Change-Id: I369ee5497247d7e875ad0b6f0aaf6c47c1d3887c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make sure no response fields are left over from the previous command in
the spdk_nvmf_request.
Change-Id: I42937e991d9dd6550fd4bc9b6d0dd66b44c6b83e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
spdk_nvmf_request_complete() always sets CID to the value in the
command, so there is no need to set it in the command execution
functions.
Change-Id: Ibbe745b862e27fff7c55e553758ef093e3ef7f6d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the passthrough command for all Identify commands except Identify
Controller.
Also only check the CNS field of CDW10 and use the new enumerated names
instead of magic numbers.
Change-Id: Ia94f820ac85a2d6b2d0ae02659e73c53f1b1a4cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If a subsystem is connected twice from the initiator, the second
connection is rejected by the NVMf target. However, the previous
connection is also impacted because we destroy the connection id
before acknowledging the disconnect event.
Change-Id: Ib597cc68a7823524460693053898f4d6e5499eb4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
There is no need to handle Read and Write commands separately; the
generic raw I/O command case can handle them just as well.
Change-Id: I8475eed0a20bd809c447ed2ccac0b99f6c2a9b4d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Replace use of the newly-deprecated rte_mempool_count() with the new
name, rte_mempool_avail_count().
Also add a compatibility wrapper so that builds against older DPDK
versions still work.
Change-Id: If3c44bdef4bbcf7a456a1dfa272348ccc6f35261
Reported-by: Jay Sternberg <jay.e.sternberg@intel.com>
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The host is not allowed to send normal admin or I/O commands until the
controller is enabled (via the Fabric Property Set command).
Change-Id: Ib62be3a3792fc0b36bace28b4c9afdf78dad3bcd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Only allow Connect on a new connection (one that has no associated
session yet), and only allow Property Set/Get on admin queues.
Change-Id: Iae22379ee47b095333372e6d151a7a1509acf654
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The NVMe spec requires that the I/O queue entry size values in CC are
set before any I/O queues may be created.
Change-Id: I4f0c9a9c20411223d281993745c85a8431197961
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Track each individual bit in the Set Property handler for CC, and fail
the request if any unhandled bits are modified.
Also add handlers for IOSQES and IOCQES (I/O submission and completion
queue entry size).
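One way to express the "fail on unhandled bits" rule is to XOR the old and new CC values and mask off the handled fields. The field positions below follow the NVMe spec; the helper cc_update_allowed() and the exact set in CC_HANDLED are assumptions made for this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CC register fields, at the bit positions defined by the NVMe spec. */
#define CC_EN      (0x1u << 0)
#define CC_SHN     (0x3u << 14)
#define CC_IOSQES  (0xFu << 16)
#define CC_IOCQES  (0xFu << 20)

/* Assumption: the set of fields that have handlers in this sketch. */
#define CC_HANDLED (CC_EN | CC_SHN | CC_IOSQES | CC_IOCQES)

/* Returns true if every bit that differs between old_cc and new_cc is handled. */
static bool
cc_update_allowed(uint32_t old_cc, uint32_t new_cc)
{
	uint32_t diff = old_cc ^ new_cc;

	return (diff & ~CC_HANDLED) == 0;
}
```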
Change-Id: I374dc3c15197e029ba07fd9ee1cff0e38a0a884d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is not implemented yet, but add a message to remind us to write it
later.
Change-Id: Ic1c35a0d35f728bc63b38c334d9c622493bee967
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Property Set of CC.SHN is not supposed to terminate the session - remove
the commented-out code that was attempting to do this.
Change-Id: I1db230df9be549764287a8fd45ccdebea1d22a8b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Set CSTS.SHST = 10b to indicate that shutdown is complete, and
CSTS.RDY = 0 to match the state of CC.EN.
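In register terms this can be sketched as below. The bit positions (RDY in bit 0, SHST in bits 3:2) follow the NVMe spec; the helper csts_after_shutdown() is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* CSTS register fields, at the bit positions defined by the NVMe spec. */
#define CSTS_RDY           (0x1u << 0)
#define CSTS_SHST_MASK     (0x3u << 2)
#define CSTS_SHST_COMPLETE (0x2u << 2) /* SHST = 10b: shutdown complete. */

/* Hypothetical helper: CSTS value after a shutdown has been processed. */
static uint32_t
csts_after_shutdown(uint32_t csts)
{
	csts &= ~(CSTS_RDY | CSTS_SHST_MASK);
	csts |= CSTS_SHST_COMPLETE;
	return csts;
}
```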
Change-Id: Ia651c34427526a38f22cba3910df2cf7d4bedd92
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Explicitly include spdk.common.mk at the top of all lib Makefiles so
that CONFIG options and other predefined variables are set.
Change-Id: I1e560c294fe8242602e45191a280f4295533ae44
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There is no need to allocate ibv_sge structures within the RDMA request;
we can just fill them out on the stack right before submitting each
request.
Change-Id: I438ff0be2f6d07ffa933255c92c4ec964aa1b235
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Just return success or failure - the actual count was not used.
Change-Id: I26e7c4c6319af444d221d9b0f313fb7071733619
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
All of the WC events that we handle map back to a request, so look it up
before checking the opcode.
Change-Id: I1b70a773374f64387df0a21a4f7fd64b26534b14
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make sure all tracelogs in rdma.c use SPDK_TRACE_RDMA.
Change-Id: Idc3d3b6654215b5ab3ee84a106e46ffd3019cc7a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These NVMf spec structure definitions are the same as the equivalent
NVMe structs.
Change-Id: I21c45973b7843e3767c48f97ec42e7b446df296f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There was only one function and a structure declaration
left.
Change-Id: I63277b4182120e7a76a925ed0bf7378ec7c23f20
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These can be simplified and merged into the subsystem.
Remove the concept of mappings from subsystems and replace
it with a list of hosts and ports. The host is optional -
not specifying a host means any host can connect.
Change-Id: Ib3786acb40a34b7e10935af55f4b6756d40cc906
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Make the transport responsible for filling out the fabric-specific
details in the discovery log entry.
Change-Id: I41d871c605becd557dca18f8ef7e80da66950257
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make the core NVMf to transport interface generic and allow for multiple
transport types to be registered.
Change-Id: I0a2767a47d55999c45f788ae1318bb50af60ab4e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change the Port configuration file entries to a new format:
[Port1]
Listen <transport> <address>:<service>
Initially, this still only supports RDMA, but the new format will allow
specifying other transports once they are added.
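A rough sketch of how a Listen value in this format could be split into its parts. struct listen_addr and parse_listen() are hypothetical, as are the address and service values in the usage note; this is not the parser used by the target:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of "<transport> <address>:<service>". */
struct listen_addr {
	char transport[16];
	char address[64];
	char service[16];
};

/* Returns 0 on success, -1 if the value does not match the expected format. */
static int
parse_listen(const char *value, struct listen_addr *out)
{
	char addr_svc[96];
	char *colon;

	if (sscanf(value, "%15s %95s", out->transport, addr_svc) != 2) {
		return -1;
	}
	/* Split at the last ':' so the service is whatever follows it. */
	colon = strrchr(addr_svc, ':');
	if (colon == NULL || colon == addr_svc || colon[1] == '\0') {
		return -1;
	}
	*colon = '\0';
	if (strlen(addr_svc) >= sizeof(out->address) ||
	    strlen(colon + 1) >= sizeof(out->service)) {
		return -1;
	}
	strcpy(out->address, addr_svc);
	strcpy(out->service, colon + 1);
	return 0;
}
```

For example, the value "RDMA 192.168.100.8:4420" would yield transport "RDMA", address "192.168.100.8" and service "4420".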
Change-Id: Iadfd19b91db57b571064379368dbe77204ccecbb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Each subsystem will run on a single core, which is more than enough
to fully saturate a device and a NIC. For now, all subsystems
run on the master lcore.
Change-Id: I95340a262d70fd346fa81fe519e7d4190a369e64
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Instead of starting the connection poller immediately upon
the connect event, wait for the first connect capsule to
start the poller.
This builds toward associating all connections with the same
session with the same lcore.
Change-Id: I7f08b2dd34585d093ad36a4ebca63c5f782dcf14
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
It can be different per fabric interface within a single port.
Change-Id: If13590d7f12291499ccfd705efaf6d2b1b1d7003
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The type is already stored in the fabric_intf.
Change-Id: Icd33dd29f2fa1313329b4053892693c7ff90945d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
For now, it just contains RDMA, plus a raw byte array to allow generic
copying.
Change-Id: I02fe11f99dd8b49000de0dba991cd34c99fd7a4a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Pull out the duplicated min checks against the ibdev_attr values.
Change-Id: I774c355ba669486afde5c05c55a4ed653723db98
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Set a status code in the response capsule for each possible error case.
Also enforce CC.EN == 1 before I/O connect.
The NVMf spec requires that the controller is enabled before any I/O
queue Connect commands are allowed.
Change-Id: If56d6b4d6bedad00e9e845e77f05f715e3969f8b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Drop the debug print in conn.c that was the only user.
We still have the connect data structure when determining the connection
type, and after that point, the queue ID is not needed.
Change-Id: Ida9e170099f977ec6b84478874863c40d6f7d8a1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The NVMf target is being refactored to split the RDMA transport-specific
code into its own file. Once this is complete, we should be able to
plug in other transports and build the NVMf target without any RDMA
dependency if desired.
To enable this, change the CONFIG option to RDMA; it still controls
whether the whole NVMf target is built for now, but once the RDMA
dependency is actually made optional, we will be able to build the
generic NVMf target code without libibverbs installed.
Change-Id: I8cd90a9aaa85dcefcc9b0f8f2e7b6af21958b2a8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move the configuration file parsing for subsystems
into the configuration file parsing file.
Change-Id: Ie16e73cdc65fae7f2f3c3b22f9cba7f167024fa1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The code for parsing the configuration file still
referred to a host as an init_grp, so fix it.
Change-Id: Ifa250b09de495dd7d393ccc3557fd6d56a54e790
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This never really made sense, so replace it with a list of
subsystems.
Change-Id: Ie7a9400083c091ac7142d01c23948200f515bdf7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is just extra complication for no real benefit.
Change-Id: I528af98e799d0641e753390fe35ff561fa3d7d76
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Use the number of devices returned by ibv_get_device_list() instead of
stopping at 4.
While we're here, drop the unused MAX_SESSIONS_PER_DEVICE definition
too.
Change-Id: I21ca6c6c95b7f2cccc1de4d0a34b95217a522bfc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is the only file that calls it, so it can be static.
Change-Id: I47573b7b38b40ad37e758234245eedbe94ae0a12
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These were internal-only APIs; initialize just checks to see that the
pool was initialized (which is already checked internally), and shutdown
just called spdk_nvmf_shutdown_nvme(), which we can call directly.
Change-Id: I95e1b912d61a38fa9934f58df7b1512678303452
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These can be isolated in rdma.c rather than being part of the generic
transport API.
Change-Id: Idc2b969a2f7685420cda2f7c4aa12495ffc3fcbc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Just calculate the required number of requests once and store it in a
global variable.
Change-Id: Iffeb637a3ac5f69ec89989b84f03699bac483b6e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There can be only one session per subsystem.
Change-Id: I8ba85a5ebd11dd71fda2a4bafa97a0935609379f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is just a duplicate of the NVMe library request_mempool.
Change-Id: I2a5484e5d515b965503b2cfcd8d85ccfcb0dee05
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Clean up everything that isn't strictly necessary in rdma.h.
Change-Id: Ied9acbed5f5b64860eae39816cdcb74620009a79
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This essentially turns the current nesting (of RDMA conn inside NVMf
conn) inside out. Now the transport owns the connection structure and
allocates it when necessary.
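
The inverted nesting can be sketched roughly as follows. This is only an illustration of the embedding pattern; the struct names, fields, and helpers are hypothetical and are not the actual SPDK definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical generic connection, as seen by transport-agnostic code. */
struct example_nvmf_conn {
	int qid;
};

/* Hypothetical transport connection that owns the generic one by embedding it. */
struct example_rdma_conn {
	int rdma_queue_depth;
	struct example_nvmf_conn conn;
};

/* The transport allocates its own structure and hands out the embedded member. */
struct example_nvmf_conn *
example_rdma_conn_alloc(int queue_depth)
{
	struct example_rdma_conn *rdma_conn = calloc(1, sizeof(*rdma_conn));

	if (rdma_conn == NULL) {
		return NULL;
	}
	rdma_conn->rdma_queue_depth = queue_depth;
	return &rdma_conn->conn;
}

/* Recover the owning transport structure from the embedded generic pointer. */
struct example_rdma_conn *
example_get_rdma_conn(struct example_nvmf_conn *conn)
{
	assert(conn != NULL);
	return (struct example_rdma_conn *)((char *)conn -
					    offsetof(struct example_rdma_conn, conn));
}
```

Generic code only ever sees the embedded pointer, while the transport can get back to its own state with pointer arithmetic and free the whole allocation in one place.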
Change-Id: Ib5ca84e2a57b16741d84943a5b858e9c3297d44b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This sets up the RDMA layer to be able to embed the NVMf conn inside the
RDMA conn.
Change-Id: I5e3714ac8503826504d78d06fb5eaafabd025bb8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The whole cleanup process is now started by
spdk_shutdown_nvmf_subsystems(). Each subsystem will clean up its
session, if any, and each session will clean up its connections.
Change-Id: I9915d4547751ed4ffc4baa2c45c628698dd0b881
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The per-lcore connection counter was incremented and decremented, but it
is no longer actually read. The lcore allocation should happen at the
session level instead.
Change-Id: I7bdf1b521bfda4892304338d43fad3ed5123c494
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Nothing actually maps the shared memory region, so there is no need to
allocate the array of connections that way.
Change-Id: I3d5eca748f892e37fbb0ec52942f1c510e9f9dc8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There is only one controller per subsystem, so
there can be 0 or 1 sessions. Change the list of sessions
to a pointer that can be NULL if no session exists.
Change-Id: I2c0d042d9cecacae93da3e806093faf0155ddd6e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Subsystems only have one controller, so cntlid
is always 0.
Change-Id: I690a1793ad3a696adbaefca856e559dd0177b11a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This was intended to track the number of NVMe device
queues per session, but there is only one hardware
queue per session. It was conflated with the number
of RDMA queues in several places as well.
Change-Id: I74a1c56a5d395dea8bee4778882821e904cebcf9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Everything can be done when the session is created.
Change-Id: I7cb38c093b2b1b69460cabba465828eed0cec432
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The cntlid is inside the session, so no need for
duplicate data.
Change-Id: I5669ee6393807959506dfec36a7583af77386fc4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Since we only allocate workers to the master lcore,
remove the logic that places I/O conns on the same
lcore as the admin conn.
The "right" logic would be to place the I/O conn
on the same lcore as the whole session, and this
patch builds toward that.
Change-Id: I8983b56de41062ec834b0a169ba0fa61326c466d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Temporarily, only run on the master lcore. This makes
some temporary refactoring possible that is required
to move to a truly scalable threading model.
Change-Id: I13a2e03107a27f8ec18b023b15f653d374a137b5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
A connection function was initializing some session data, so
move that code to the function that initializes the session.
Change-Id: I5f2d4349585cb97985a7bbd9fb8d6c66eeaa7d4e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
There was an extra layer of indirection complicating
things for no reason. This removes it.
Change-Id: I8d4e654eb17f8f6ec028d775329794f0745fb0f7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The NVMf target now sets the maximum data transfer size (MDTS) to a default
value of 128KB. The initiator driver reads this value and passes it to the
block layer, so no command sent from the initiator will exceed 128KB.
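
As a worked example of the arithmetic: in the NVMe Identify Controller data, MDTS is an exponent, expressed in units of the minimum memory page size (CAP.MPSMIN). The helper below is a hypothetical sketch of that conversion, not code from this patch, and it assumes the target reports the limit through the standard MDTS field.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an MDTS exponent to a byte count (hypothetical helper).
 * mpsmin is CAP.MPSMIN; the minimum page size is 2^(12 + mpsmin) bytes.
 * Returns 0 when mdts is 0, which the NVMe specification defines as
 * "no maximum data transfer size".
 */
uint64_t
example_mdts_to_bytes(uint8_t mdts, uint8_t mpsmin)
{
	assert(12 + mpsmin + mdts < 64);
	if (mdts == 0) {
		return 0;
	}
	return UINT64_C(1) << (12 + mpsmin + mdts);
}
```

With a 4KB minimum page size (MPSMIN = 0), an MDTS value of 5 corresponds to 4KB * 2^5 = 128KB.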
Change-Id: I1d4f259e887b2fc70c7f1c5406c07c58f7fc9b8d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
If any completion indicates an error, we need to close the connection.
Change-Id: I50b30aa692ae121932f1baec32f713422ff415ed
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
NVMf does not have the concept of subsystem groups; the (former)
subsystem_grp files really contain structures and functions related to
individual subsystems.
Change-Id: I4b3a64de799fffb29f8685ea4908d754516815cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Create a list of valid properties with get and set callbacks (set is
optional to allow read-only fields).
Remove handling for fields declared as "reserved" in the NVMe over
Fabrics 1.0 specification.
Also simplify the vcprop structure to only contain the required fields.
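
A property table with optional set callbacks might look like the sketch below. All names here are hypothetical and the controller state is reduced to two registers; only the offsets 0x00 (CAP) and 0x14 (CC) are taken from the NVMe register map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical controller state; not the real vcprop structure. */
struct example_ctrlr {
	uint64_t cap;
	uint32_t cc;
};

/* One entry per valid property. A NULL set callback marks it read-only. */
struct example_prop {
	uint32_t offset;
	uint64_t (*get)(const struct example_ctrlr *ctrlr);
	int (*set)(struct example_ctrlr *ctrlr, uint64_t value);
};

static uint64_t
example_get_cap(const struct example_ctrlr *ctrlr)
{
	return ctrlr->cap;
}

static uint64_t
example_get_cc(const struct example_ctrlr *ctrlr)
{
	return ctrlr->cc;
}

static int
example_set_cc(struct example_ctrlr *ctrlr, uint64_t value)
{
	ctrlr->cc = (uint32_t)value;
	return 0;
}

static const struct example_prop g_example_props[] = {
	{ 0x00, example_get_cap, NULL },           /* CAP: read-only */
	{ 0x14, example_get_cc, example_set_cc },  /* CC: read-write */
};

static const struct example_prop *
example_find_prop(uint32_t offset)
{
	size_t i;

	for (i = 0; i < sizeof(g_example_props) / sizeof(g_example_props[0]); i++) {
		if (g_example_props[i].offset == offset) {
			return &g_example_props[i];
		}
	}
	return NULL;
}

/* Returns 0 on success, -1 if the offset is not a known property. */
int
example_prop_get(const struct example_ctrlr *ctrlr, uint32_t offset, uint64_t *value)
{
	const struct example_prop *prop = example_find_prop(offset);

	assert(ctrlr != NULL && value != NULL);
	if (prop == NULL) {
		return -1;
	}
	*value = prop->get(ctrlr);
	return 0;
}

/* Returns 0 on success, -1 if the property is unknown or read-only. */
int
example_prop_set(struct example_ctrlr *ctrlr, uint32_t offset, uint64_t value)
{
	const struct example_prop *prop = example_find_prop(offset);

	assert(ctrlr != NULL);
	if (prop == NULL || prop->set == NULL) {
		return -1;
	}
	return prop->set(ctrlr, value);
}
```

Any offset that is not in the table, including reserved fields, is rejected by the lookup without needing a special case.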
Change-Id: I14d3ddfd008c62b75fce8e64d193c87fb6f7b5ad
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Multiple NVMe controllers within a subsystem do not work correctly,
since we would need to virtualize the controller data, namespace IDs,
and so on. For now, only allow pass-through mapping of a single NVMe
controller per subsystem.
Change-Id: Ib2d3576d2856c46a086f38eb6bec56f3e7a73575
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Previously, we used cap_lo and cap_hi to represent the 32-bit halves of
the full CAP register. However, it is simpler to keep them in a single
64-bit structure, and is no less efficient on 64-bit platforms.
Also name the NSSRS field from NVMe 1.2, which was previously reserved.
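
To illustrate why a single 64-bit value is simpler, the sketch below joins the two halves and extracts fields by bit position. The helper names are hypothetical; the bit positions (MQES 15:0, DSTRD 35:32, NSSRS 36) are those of the NVMe CAP register. Note that DSTRD and NSSRS live in the upper half, so with separate cap_lo/cap_hi words their shifts would have to be rebased.

```c
#include <assert.h>
#include <stdint.h>

/* Join the former 32-bit halves into one 64-bit CAP value. */
uint64_t
example_cap_from_halves(uint32_t cap_lo, uint32_t cap_hi)
{
	return ((uint64_t)cap_hi << 32) | cap_lo;
}

/* Extract a field of the given width starting at bit position shift. */
uint64_t
example_cap_field(uint64_t cap, unsigned shift, unsigned width)
{
	assert(width > 0 && width < 64 && shift + width <= 64);
	return (cap >> shift) & ((UINT64_C(1) << width) - 1);
}
```

For example, example_cap_field(cap, 0, 16) reads MQES, example_cap_field(cap, 32, 4) reads DSTRD, and example_cap_field(cap, 36, 1) reads NSSRS.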
Change-Id: I1d5d9b0dccbb12373b4aed3db29c883881d43223
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Requiring bb_sgl to follow recv_sgl makes the logic obscure.
Change-Id: I8d47477986efd8f2d4ed964ab9373b7f157af274
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Admin commands technically don't allow inline data,
but there is nothing preventing us from posting
a recv buffer that could handle inline data. It just
won't be used for incoming admin capsules.
Change-Id: I3e7e4406e01ab870654a166d52221c11fc0ac683
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
We need to bind to each port declared in the config file; there is not a
single global port number.
Change-Id: I41c315588078d131c32cb145d22314047505c95c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The access to the NVMf IOCCSZ (I/O Queue Command Capsule Supported Size)
field in the Identify Controller data was incorrect.
Change-Id: I23b0aa175de8e5d8a0220e9c35e0cb6868121cb5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The maximum in-capsule data size is determined by the I/O queue bounce
buffer size, and there is no point in limiting it beyond that, so remove
the need to configure it.
Change-Id: I64806516b847e819f57ac9f62a162f7a04805b57
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
4420 is the officially assigned IP port from IANA for NVMe over Fabrics.
Change-Id: I433a5ed0780d1ffd7ca6512617759d59fa5e8def
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The queue type and queue depth are not known until
the connect capsule is processed. Delay allocating more
than 1 recv wqe until then.
Change-Id: I0e68c24bc3d6f37043946de6c2cbcb3198cd5d1b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Currently, the recv wqe is re-posted immediately. This
closes a small window where we could get more I/O
than we could handle.
Change-Id: I9b0b1f0cc526069033b9e04f170195c4fb130e37
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is going to be used elsewhere in the code, so
name it according to the public naming convention
and make it public.
Change-Id: Id5fd57e78e146f3235741a251bb30244d6530f2c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is going to be used elsewhere in the code, so
name it according to the public naming convention
and make it public.
Change-Id: I0dcb88e902c5e609fe6acd06ad06743203fcaa60
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Break out the code to allocate a single rdma request
to be used elsewhere.
Change-Id: I687ce5ec862831fed5300157bfb4bf980d22c782
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
When Debug is not defined, SPDK_TRACELOG does nothing,
so cmd_type becomes an unused variable and triggers a
compilation warning. This patch fixes the warning.
Change-Id: I821f7601a16c98e514227aee2e18fbfa61928bea
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The queue depth allowed for incoming commands is set
such that we can do the maximum number of RDMA reads
necessary. There is never a case where a READ will need
to be queued anymore.
Change-Id: I4f7e7f4a59f6358065b82f36a5e22744af210d07
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
There were 4 variables tracking queue depths. In reality,
only one is needed once the minimum is computed correctly.
Change-Id: I9bb890e92a33a3c7bd6e27cbd31d6bee7ca0cf3d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
NVMe over Fabrics defines its own NVMe Qualified Name (NQN) format; it
does not use iSCSI Qualified Names.
Also change the default node base for nvmf_tgt to "nqn.2016-06.io.spdk".
Change-Id: I2b73c1426ef1d8c83cc2df499d79228ea61257cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Fix the sizes of the UUID fields to match RFC 4122.
Change-Id: I1458a22579f455cde0a67ee3ce616e78d5c810c2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This will allow removal notifications to be propagated to the library
user (e.g. for hotplug).
The callback is currently unused, but this at least prepares the API for
the future hotplug support.
Based on a patch by Dave Jiang <dave.jiang@intel.com>
Change-Id: I20b1c2dbf5e084e0b45a7e51205aba4514ee9a95
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The call to spdk_nvmf_check_pools can be
placed directly in nvmf.c.
Reason: This pool is created by the nvmf subsystem,
so it should be recycled by that subsystem.
Change-Id: I49e49bcb56079fc25d26b1f5078a1808c2f8e189
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Drop the RDMA-specific fields from spdk_nvmf_request and get them
directly from the command SGL in the transport-specific read function.
Change-Id: Icd06a9018a8c341213fbc8d26d3d7cbf2fb32d30
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The connection will be closed in these cases anyway, so just let the
normal connection cleanup deal with the active tx_desc.
Change-Id: I96c68d5802e189bb82b180cc3c7d7c3f4135be1f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If the transport poll routine fails, we need to close the connection.
Change-Id: Ie534b0f05e6642c31e0450865e309a784abbe744
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If spdk_nvmf_request_exec() fails, the connection will be closed anyway,
so just leave the tx_desc in the active array; it will be cleaned up in
the normal connection cleanup path.
Change-Id: Ie4f60bd6001658403dd7e1c6a47d40be756ef6f2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If an invalid SGL is specified, send a response with a status code
indicating what the error was rather than silently dropping the command.
Change-Id: I12d1fd847d3bc0ea8de7698e934626c2586a7452
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make all command processing functions return a bool to indicate
asynchronous (false) or synchronous (true) completion.
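
The return convention could be used along these lines. The request structure, opcodes, and function names are hypothetical stand-ins; in particular, having the caller complete synchronous commands is an assumption of this sketch, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum example_opcode {
	EXAMPLE_OPC_SYNC,	/* can be answered immediately */
	EXAMPLE_OPC_ASYNC,	/* must wait for a device completion */
};

struct example_request {
	enum example_opcode opcode;
	bool completed;
};

void
example_request_complete(struct example_request *req)
{
	req->completed = true;
}

/*
 * Returns true if the command finished synchronously, or false if its
 * completion will be signaled later by a callback.
 */
bool
example_process_cmd(struct example_request *req)
{
	assert(req != NULL);
	if (req->opcode == EXAMPLE_OPC_ASYNC) {
		/* Submission to the device would happen here. */
		return false;
	}
	return true;
}

/* The caller completes only the commands that finished synchronously. */
void
example_request_exec(struct example_request *req)
{
	if (example_process_cmd(req)) {
		example_request_complete(req);
	}
}
```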
Change-Id: I7c2e4d28fa473b36ff26c902e4bb69f38b64d18d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Only use SPDK_TRACE_RDMA within the RDMA transport code.
Change-Id: Ie15fd24bb142a68f3661929267ebe396b556c351
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The error case could only be reached with tx_desc != NULL in one case,
so move the cleanup code there and drop the goto.
Change-Id: I7aace6b40dd75ef8d86fb173f9d58110e929b082
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Also split the generic nvmf_trace_command() function out of
the RDMA-specific handler and move it to request.c.
Change-Id: If29b89db33c5e080c9816977ae5b18b90884e775
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Also finish up the req_state -> req conversion.
Change-Id: I131dd52dcd36a790b942e06f0207a3274cc04ffc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The overflow condition can't happen unless there is a programming error
in the nvmf_tgt library; we can only possibly receive command capsules
(sq entries from the point of view of the host) if we have posted a RDMA
Recv for the command capsule memory region.
This means that we also don't need to track sq_tail in the NVMf library.
Change-Id: I101509080c744528871e72fa46d188e2850c928a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The "immediate completion" cases in spdk_nvmf_request_exec() already
call spdk_nvmf_request_complete(), so the ret == 1 case in nvmf_recv()
is bogus.
Also fix a couple of spdk_nvmf_request_complete() calls in
nvmf_process_admin_cmd() that should be handled by its caller.
Change-Id: I41b865d5e6e7fec08087faf9c6f3da3b057a5fb2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These are not supported (and did not actually function) in NVMe over
Fabrics. Queue creation is handled automatically when new connections
are initiated.
Change-Id: If3a10e5df2f0625537b2c453cd8c835e570fa31e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move toward making request.c transport agnostic.
Change-Id: I25fbe74fff21a5c23138e1a6e2d40bc6a4a984ec
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make nvmf_post_rdma_read() interface generic (don't require a tx_desc).
Change-Id: I331a93eed4bb1912a47a88bb904cf392fcc364c6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This fixes an oversight that allowed in-capsule data block SGLs to
potentially refer to more than the received in-capsule data size.
It also makes spdk_nvmf_request_prep_data() less dependent on the
RDMA-specific rx_desc/tx_desc structures.
Change-Id: I34d61aca4cf5ba033849673116d16ec90488dcd4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is the same as spdk_nvme_cpl, aside from reserved fields.
Change-Id: I62b0718dd58c998b4d26a0d1b44ee16d37eff25d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The RDMA read and write commands can determine the desired length based
on the nvmf_request length field.
Change-Id: I97b63289556e7de3c19c5a17ecbacbbbdfc10425
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Replace the generic "msg_buf" naming with command and response.
Change-Id: I19baff43b41a5eb7db9be9d7feec33d17112e320
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The mempool functionality is never used at runtime - all bounce buffers
were immediately assigned to a rx_desc.
Change-Id: Ie2195059858e34b30b07e104739f046c13abc335
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The RDMA tx_desc and rx_desc pools were only used at startup; all
descriptors are immediately allocated and put into a queue, and the
mempool functionality was never used at runtime.
Change-Id: I2882274962550191a555c8483b8f7be2854b32ec
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is an implementation detail of the RDMA layer.
Change-Id: Ib97d6fbd593789eed0b6e746972b8882a3320995
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This code is operating on a list owned by the RDMA
connection, so move it to rdma.c
Change-Id: I8b81f9d1ffc1df489c9b698969725ed0d1db6a06
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These are an implementation detail of RDMA, so move
them into the RDMA portion of the connection.
Change-Id: I68d146019c5d78fbf5e9968abfd7baed2a54a2ed
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Separate out the RDMA connection from the
NVMf connection. For now, the RDMA connection
is just embedded in the NVMf connection, but
eventually they will have different lifetimes.
Change-Id: I9407d94891e22090bff90b8415d88a4ac8c3e95e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This structure will be expanded in future patches.
Change-Id: Ibb04917134243560e09a2a255844739eb33fab65
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The request needs to know which connection specifically
it is associated with.
Change-Id: I492b9968b4d2e307b5af44edee0778478b32d2ba
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
They each only had 1 function left that belonged
in the session.c file.
Change-Id: I405902b02e9316d2dc02d3732d8bc085c2b84d31
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Only move nvmf_request definitions from nvmf_internal.h
for now. Subsequent patches will move more.
Change-Id: If47472542515fd050cc78d95540eb25beee59d2a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Fabric commands were skipping a step, so unify all
types of requests through the same completion path.
Change-Id: I5f38a7e1cdcdf33baf71486d5ddae9f5a6157fac
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The nvmf_request structure holds the pair of pointers
for rx_desc and tx_desc.
Change-Id: I3e735979bbdcdc0e70ad78762e289849d41158ba
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This moves some definitions from nvmf_spec.h to
nvme_spec.h based on the latest publication.
Change-Id: I51b0abd16f7d034696239894aea5089f8ac70c40
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The nvmf_request object is generic and is mapped 1:1 with rx_desc.
Change-Id: I397224a3859c3c93d6eca99f7ba7c53ce7963f57
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Instead of searching the global list of connections to find a matching
cm_id, we can just store the pointer back to the spdk_nvmf_conn in the
rdma_cm_id context field.
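
The back-pointer idea, in sketch form. struct example_cm_id is a stand-in for struct rdma_cm_id, which carries a user-owned void *context member; the connection type and helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct rdma_cm_id; only the context member matters here. */
struct example_cm_id {
	void *context;
};

/* Hypothetical connection object. */
struct example_conn {
	int id;
	struct example_cm_id *cm_id;
};

/* Link the two objects in both directions when the connection is set up. */
void
example_conn_attach(struct example_conn *conn, struct example_cm_id *cm_id)
{
	assert(conn != NULL && cm_id != NULL);
	conn->cm_id = cm_id;
	cm_id->context = conn;
}

/* Constant-time lookup that replaces a search of the global list. */
struct example_conn *
example_conn_from_cm_id(struct example_cm_id *cm_id)
{
	assert(cm_id != NULL);
	return cm_id->context;
}
```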
Change-Id: I39ea16be6a633a1136d65743747b63b600f20e63
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
All of the variables are private to conn.c, so they don't need global
visibility.
Change-Id: I7c24cfc6249a9f8164b162b4b8de0e24c452e0df
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is always set to nvmf_process_async_completion and is only used
within the library.
Also rename nvmf_process_async_completion to spdk_nvmf_request_complete
to clarify its purpose.
Change-Id: Ie737fb60688329bfe329a8553c4a40ff2e5f8f1d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Create spdk_nvmf_request_prep_data(), which handles SGL processing and
data transfer for all command types, and spdk_nvmf_request_exec(), which
executes a command after data transfer has completed.
Change-Id: I51c2196260dd0686db8acca4d8f7c93e17618c2f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The pending type can be determined based on the command opcode.
This also moves the "issue pending RDMA reads" case out of the I/O queue
handling into the generic continuation code; this should not make any
difference for the current case, since the Fabrics Connect command is
the only other continuation case currently, and there cannot be any
pending RDMA reads in that case.
Change-Id: Idddfa496b6e5b7e6da772aa3ab1b9d1a5344771f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvmf_connect_continue() no longer needs the RDMA-specific tx_desc.
Change-Id: I95f6938063e9853aa7dcd419f488b91422ff9b60
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Let the calling function handle the tx_desc if nvmf_connect_continue()
fails.
Change-Id: I25a8cbc4c3be0608bcec8db2fb8c50e55fbe3e8c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Everything necessary for processing an admin command is now stored in
nvmf_request.
Change-Id: I74e75a5b7bb3b406ad167c2b31cab1af7a1f270a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Everything necessary for processing an I/O is now stored in
nvmf_request.
Change-Id: I3f390707ebe83ea66a116dcfda4d0388a6823629
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Keep a pointer to the local bounce buffer in the transport-agnostic
struct nvmf_request rather than groveling in tx_desc/rx_desc to get it.
Change-Id: Ic328d8e2b3a15759ccb149a89fb3562e928ca500
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Now that we have xfer to track data direction, the length field can be
populated correctly for all transfers (including in-capsule data).
Change-Id: I7b2228f3fac80aab983a4103ba095c7bc38e0b21
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This field is used to decide whether data needs to be transferred back
to the host after a command is completed.
Previously, this was determined using the length field, and length was
cleared to 0 after a transfer was completed. However, length will be
used in future patches after a host to controller transfer completes, so
we need some other way to tell what kind of data transfer is required.
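
A direction field of this kind might be modeled as below. The enum values, struct, and helper are hypothetical; the point is only that the decision no longer depends on length having been zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical data direction values. */
enum example_xfer {
	EXAMPLE_XFER_NONE,
	EXAMPLE_XFER_HOST_TO_CONTROLLER,
	EXAMPLE_XFER_CONTROLLER_TO_HOST,
};

/* Hypothetical request: length stays valid for the life of the request. */
struct example_request {
	enum example_xfer xfer;
	uint32_t length;
};

/* Decide from the direction alone whether data must go back to the host. */
bool
example_needs_data_return(const struct example_request *req)
{
	assert(req != NULL);
	return req->xfer == EXAMPLE_XFER_CONTROLLER_TO_HOST && req->length > 0;
}
```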
Change-Id: I6b27cf7816908394735fc95c15bd5eb40a7c0157
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Since we used a u64 to mask CPU cores, only 64 cores can be represented,
but the default value of RTE_MAX_LCORE in DPDK is 128. In some cases
(e.g. when nr_io_queues > 4), we could get the wrong lcore ID.
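
The underlying limitation can be shown with a small sketch. The helpers are hypothetical and do not reproduce the fix itself; they only show that a 64-bit mask cannot describe lcore IDs of 64 or above, and that shifting by such an ID would be undefined behavior in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MASK_BITS 64u

/* Set the bit for lcore; only IDs 0..63 fit in a uint64_t mask. */
uint64_t
example_mask_set(uint64_t mask, unsigned lcore)
{
	assert(lcore < EXAMPLE_MASK_BITS);
	return mask | (UINT64_C(1) << lcore);
}

/*
 * Test whether lcore is present in mask. IDs at or beyond the mask width
 * are reported as absent rather than shifted out of range.
 */
bool
example_lcore_in_mask(uint64_t mask, unsigned lcore)
{
	if (lcore >= EXAMPLE_MASK_BITS) {
		return false;
	}
	return ((mask >> lcore) & 1) != 0;
}
```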
Change-Id: Icc334b1bf5b068a310839118be341e61071cff65
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
A transfer using less than the total bounce buffer size is a normal
occurrence and not worthy of a tracelog. Also drop the pointless
conditional.
Change-Id: Ibcdcf693fea439d5034fa51b08b3fbd8fd7df8f2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Avoid using the RDMA-specific tx_desc when the transport-agnostic
nvmf_request will suffice.
Change-Id: Id35bbdfb353cb72e0feb4f5af19e5bd5c86d3ff4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
tx_desc should be set to NULL here. Otherwise, when
nvmf_post_rdma_recv(conn, rx_desc) fails,
we would deactivate tx_desc a second time,
which is wrong.
Change-Id: Ieabc7e3864b7f124b003d052f66ab8799a1d632e
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
If we just pass NULL to rdma_create_qp, it will do
the right thing.
Change-Id: I9621a5110ace6237a1e47c6e5defb4cac3afc4ae
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The wrappers are much simpler to use than the low
level ib verbs calls.
Change-Id: I4b09a96a60020bc27df9396d40d955733f618837
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
They are only ever called in sequence and do related
operations.
Change-Id: I825abe08deba1dafb405757bb4f2d52062a801ca
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This enables SPDK_NVMF_BUILD_ETC to be moved out of the library as well,
since only authfile was using it before.
Change-Id: I10d1145881f9a0358d7effe2d2d9851899413e1b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
SPDK_NVMF_BUILD_ETC will be cleaned up in another commit; it is
currently used both in the lib and in nvmf_tgt.
Change-Id: Ibc5f15cc4341f9d52b29c84defcd332bec4a4d09
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Most of the #include statements in nvmf.h aren't part of the public API.
Change-Id: I0d43dd542a28744a91a4fd0c4c806a991d1e194e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is not part of the NVMf library's public API.
Change-Id: I665d5713343c9185cbdadaef4fedfdc83b8232d6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There is only a single global g_nvmf_tgt that can be passed to this
function, so remove the parameter and use the global directly.
Change-Id: Ia1a2a1e6cd3801101ddeb4de5526dd115fa7ef8f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The section is really defining a subsystem as defined
by the NVMf specification. There does not appear to be
any need for a group of subsystems.
This change only updates the configuration file. It does
not remove all references to a subsystem group from
the code.
Change-Id: I38e62735a5ac924dcafacb3c9a332a103d751d4a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The specification refers to this concept as a Host,
so use that term. This only changes the configuration
file usage. Initiator groups are still referenced in
the code and will be removed later.
Change-Id: I897f4dbdfb65d94da1e5a77434fc07a2c18bcdc2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The index should be 0 for fabricintf.
Moreover, when no fabricintf is found, an error should
be returned.
Change-Id: I3aa04566a5a318b8c921dd37c8573ed075254266
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
There is no logical split between nvmf.c and framework.c, so combine
them and drop nvmf.c.
Change-Id: I91230c01ed7f171bfed04456b0bfcf0e7ddbc263
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The mutex is initialized, but otherwise is unused.
Change-Id: Ia68adbd430fad391cc465c07dd6e937e90dd2c5c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The code that actually removed items from the list was removed in
addition to the free() call, which caused a hang on shutdown.
Change-Id: If0e843d0d0ebfa28638b12104da880e70b3e548a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>