Previously, we directly assigned the pointer of pool_name
and rbd_name, and this is not safe. After the rpc test,
we found the string value is not correct, so use strdup.
Change-Id: Ibadc57d3cb5b9869b7db5a22c2459769e92edebd
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This requires a couple of related changes:
- I/O queue IDs are now allocated by using a bit array of free queue IDs
instead of keeping an array of pre-initialized qpair structures.
- The "create I/O qpair" function has been split into two: one to create
the queue pair at startup, and one to reinitialize an existing qpair
structure after a reset.
Change-Id: I4ff3bf79b40130044428516f233b07c839d1b548
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make the transport ctrlr_construct callback responsible for allocating
its own controller.
Change-Id: I5102ee233df23e27349410ed063cde8bfdce4c67
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Four read/write functions share the same code for checking
IO len and offset. Extract this code into separate function.
Change-Id: I40f0021e70a60c591b048ad3a70b22eaa07af3b4
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
These are specific to local NVMe PCIe devices, so move them out of the
generic NVMe code into the PCIe transport.
Change-Id: Iea2056a4c438b7d3a303b4b5e977ce7aa9e58c05
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows the entire transport structure definition
to become private.
Change-Id: I9ca19edbfc3cfb75b9b113a89bb2b90bc499ab16
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This changes as little code as possible while still creating
a single public API header. This enables future clean up
of the public API and clarification of the exposed
concepts.
Change-Id: I780e7a5a9afd27acf0276516bd71b896ad301c50
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Only call spdk_bdev_io_complete() where IO error is seen.
Change-Id: I829e4c589dbcb47017e810035837a4c61c3428f9
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This will allow factoring out PCIe-specific code into a swappable
transport so that NVMe over Fabrics host support can be added.
Change-Id: I4df74dd268d655e3b36e8d6114ebe7d79a24844d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
For large writes that require multiple SCSI tasks (one for immediate
data, then one or more for R2T-solicited data), we bump the refcount
for the task associated with the initial immediate data PDU, to
ensure it does not get freed until all of the child tasks are
completed. But in some cases this initial immediate data PDU could
complete after all of the R2T-soliciated data PDUs. The
completion code was not handling this case correctly which would
result in the iSCSI connection thinking it still had outstanding
SCSI tasks when the connection closed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f9c5322755462d1918fde0075c87c84295cb10c
This patch enables vector operation for bdev drivers aio, malloc and
nvme.
The rbd driver still handle only one vector.
Change-Id: Ie2c1f6853bfd54ebd8039df9a0305854ca3297b9
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Make sure the function reentrant, prepare for rpc method.
Change-Id: Ie5230e4ac6c9a750e8e779c5e0b67134729c07e3
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
This prepares for future scatter-gather support.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie21c4d86c1e932dcaf63cf13d7a7198890595d79
Return void in main I/O path, and have functions
explicitly complete the I/O back to the bdev layer
if any failures are encountered.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia729b0af555f87c2fb36b92e79a47d19a325de7a
Add public function which could be used by rpc method.
Change-Id: Id9d2938801e0acdf0f9827ef2990a54c75aec22a
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
This patch removes the lock in RBD module. And it requires
the librbd library supports rbd_poll_io_events function.
Change-Id: I040a7d8369ab4f69f41d1d0233115f885168f019
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This patch makes lun_name:lun_id pair as one object,
the same for the pg_tag:ig_tag.
Change-Id: Ib08450d12bde9b8388d4ae41e214cc0ba64c8b1e
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
For the nvme readv/writev APIs, the PRP checking logic was
incorrectly failing single SGE payloads that were larger
than 4KB. This patch adds a test case for this scenario,
and fixes the PRP checking logic.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6357d620599666046d2cb74d7923dac1f75418c5
Use standard GCC style atomic operations instead of
the DPDK calls. The DPDK calls end up translating
to the gcc standard inline calls in the generic
case anyway.
Change-Id: I0ea760c4e23c3660b082a803bbc174de7250f365
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Remove #includes for all DPDK headers that weren't
necessary.
Change-Id: Ib02522e0f04e64a1c98afceb7508cc0e8d931a9d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This was only used for debugging. Everywhere else
used the spdk_memzone abstraction.
Change-Id: I8a828ea3c7abccb66c8a027cb13de43c560ff7a1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This converts some, but not all, usage of rte_mempool
to spdk_mempool. The remaining rte_mempools use features
we elected not to expose through spdk_mempool such as
constructors, so that will need to be revisited.
Change-Id: I6528809a864ab466b8d19431789bf0f976b648b6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change the type from int to bool and change the name
from data_ref to data_from_mempool.
Change-Id: If1fc11761e63561443ed44d6a0860e416e424df8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This is required since pollers are now directly removed
(rather than scheduling an event) when the unregister call
is made on the poller's lcore.
Without this change, if a poller is registered then
immediately unregistered, the unregistration will seg
fault since the event adding the poller has not executed
yet.
Also add a test case that exhibits the sequence of events
described in this commit message.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c6ba0ee224ac1f8f3ebb8e7571714e718bd42db
Use the env library to perform all memory allocations
that previously called DPDK directly.
Change-Id: I6d33e85bde99796e0c85277d6d4880521c34f10d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Enforce exactly one trailing \n, and fix all of the existing cases.
Change-Id: I6218e4700e90aeb647eaee78089530c79993c8c8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch enables vector operation for bdev drivers aio, malloc and
nvme.
The rbd driver still handle only one vector.
Change-Id: I5f401527c2717011ecc21116363bbb722e804112
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
'virtual' is a keyword in C++, so avoid using it in variable
and structure names in case any files are eventually
included from a C++ project.
Change-Id: I2122750445def63038af68a3000758e33b937f9d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
All completion queues for the same listen address
now share a common completion queue channel.
Change-Id: I42c149fe7e221951e8a3826b1713482c37a265b8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These 4 callbacks can be condensed into two callbacks, which
simplifies the API.
Change-Id: I069da00de34b252753cdc8961439e13a75d1cc68
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Validate the number of unmap descriptors in the generic bdev layer
before calling the blockdev-specific unmap function.
Change-Id: Ib24e7ec63f782f23f2ee3e63393aa8463123fdb4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
All timers have been converted to SPDK pollers.
If an app requires rte_timer support, it should register its own poller
that calls rte_timer_manage().
Change-Id: I8a827a357b344deac76d42357a5a84ac2daabbf8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows users to swap their PCI library from
libpciaccess/dpdk to another mechanism using the standard
method for swapping out the env library.
Change-Id: Ib2248f8b43754a540de2ec01897e571f0302b667
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This allows users to swap out SPDK's third party
libraries for an implementation based on their own
framework.
Change-Id: Ia0b7384ce5e31acba5ad0d7002dec9e95b759c52
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The new env library will wrap all third-party library
calls and be easily swappable with alternate implementations
at build time. For now, it's just the memory library
renamed.
Change-Id: I26a70933289f8137107208ba75f7520fd7a33da0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Increased printed data width, added data offset indices.
Change-Id: I44f81396e33870109c2bece5e152657f8a24a56a
Signed-off-by: Krzysztof Jakimiak <krzysztof.jakimiak@intel.com>
Preparation for SGL support (readv/writev).
Change-Id: I14a116d764ebc582ea0a0077cc5a0d0bac638cb0
Signed-off-by: Krzysztof Jakimiak <krzysztof.jakimiak@intel.com>
This patch also drops support for automatically unbinding
devices from the kernel - run scripts/setup.sh first.
Our generic pci interface is now hidden behind include/spdk/pci.h
and implemented in lib/util/pci.c. We no longer wrap the calls
in nvme_impl.h or ioat_impl.h. The implementation now only uses
DPDK and the libpciaccess dependency has been removed. If using
a version of DPDK earlier than 16.07, enumerating devices
by class code isn't available and only Intel SSDs will be
discovered. DPDK 16.07 adds enumeration by class code and all
NVMe devices will be correctly discovered.
Change-Id: I0e8bac36b5ca57df604a2b310c47342c67dc9f3c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Provide a convenience wrapper for general purpose dataset
management commands. The previous wrapper for deallocate
was difficult to use correctly and only for deallocate.
Note that the name is "dataset_management" as opposed to
"data_set_management" to match the NVMe specification.
It's questionable whether "dataset" is valid English, but
it is best to match the specification.
Change-Id: Ifc03d66dbabeabe8146968cf8a09f7ac3446ad68
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Namespaces can be allocated but inactive, which causes
the identify namespace command to fail. Handle this
case so that attaching to the controller does not fail.
Change-Id: I9d692f8e7841a9315a737b0a5e44d9b4e4484a13
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Modify the spdk_poller_unregister() function so that it works correctly
when unregistering a poller from its own callback function.
Change-Id: I57fa5ebd8a8bad522e34f597b406a4726f1b76ad
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch will add a new bdev module, rbd.
It can make ceph rbd as the backend of iSCSI
target.
Change-Id: Id5eb3b159ee607052e3c33a2e59d721739fd9977
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
These offsets are passed to the bdev I/O functions, which take uint64_t
offsets.
Change-Id: I1d597d066dfb64b6c7658906e7ee8e6fb2f8e4db
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The offset variable is used to store the result of a uint64_t * uint32_t
multiplication; a signed integer is not the correct type for the result.
Change-Id: If1fb22314ba7e3cec91808cc051678f809c9e58b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This code calculates the difference between two pointers and then stores
it in an off_t, which is intended for file offsets.
In this particular case, the offset will never be large enough to
overflow off_t, but use the correct ptrdiff_t type anyway.
Change-Id: I6b159bf0286a7f5962d08b9894538f4d99c8647b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
off_t is problematic for use as a file/block offset: it is signed, and
on 32-bit platforms, it can be 32 bits (depending on the settings of
_FILE_OFFSET_BITS and _LARGEFILE_SOURCE).
The blockdev layer already uses uint64_t to represent offsets; replace
the blockdev module uses of off_t in internal functions with uint64_t
to match.
Change-Id: I77a2e594572c56f1cd8a7a080f985ea5b27c35f3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The free function is missed for dev in spdk_io_device_
unregister function.
Change-Id: Id212344dfcde2ae4780c631e3443f530ef25cfd1
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
For process_read-task_completion, add a
new paramter and remove the duplicated code
since this function is the critical path
Change-Id: I6a56327def717ee965c701383f01d6745a8c6988
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This feature should only be used if clients are coordinating
with one another.
Change-Id: I89a437441a7e3fbcc1e5f6efa1c8e970ade7c2ec
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
We already require the assert header from the C standard library,
so use that instead of RTE_VERIFY to further isolate DPDK
dependencies.
Change-Id: I4a718af858c88aff6080e33e6c3dd533c077b8f4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Some subsystems may wish to create unique I/O channels
which are not shared across all users of the same I/O
device on the same thread.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ade3675d57338cf85b6a301285e6f392bd6cd2e
We need to return -1, when there is still tasks. From the
usage, return 1 is wrong.
Change-Id: Ibf1b53e0be92818c73590c0b4211d34332073c74
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Fix the existing cases (all missing void in parameter lists) and enable
the warning to prevent new ones from being introduced.
Change-Id: Ieaf00b3dfd5daf1e21fcbefb124514882e8996c9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch aimed at avoid run out of large rbuf for read commands
Change-Id: Ibc42b2216e929f8dfa59cba1b32ae8d52a1a345e
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
bdev and copy modules no longer have check_io functions
now - all polling is done via pollers registered when
I/O channels are created.
Other default resources are also removed - for example,
a qpair is no longer allocated and assigned per bdev
exposed by the nvme driver - the qpairs are only allocated
via I/O channels. Similar principle also applies to the
aio driver.
ioat channels are no longer allocated and assigned to
lcores - they are dynamically allocated and assigned
to I/O channels when needed. If no ioat channel is
available for an I/O channel, the copy engine framework
will revert to using memcpy/memset instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I99435a75fe792a2b91ab08f25962dfd407d6402f
I/O channels are not actually used for I/O yet however.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaa3774ecacc7ec206c7c0c66e6b2f2d10c8fa785
This will start testing the I/O channel allocation paths. I/O channels
are not actually used for submitting I/O yet however.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I901402633248170324db1e2fc8fb813f7629c2b0
This will help catch any cases where I/O channels are not
released during shutdown.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I96cf93218026b9ef319abcf0662fe258bf75174d
Also implement these functions for all of the bdev drivers in
the tree.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idea97743d601150044b1fe2d9d76e922d46d3ee1
This patch adds a basic framework for creating I/O channels
for I/O devices. An spdk_io_channel represents a one-to-one
mapping between a calling thread (represented by spdk_thread)
and an I/O device that the thread will perform I/O operations
on.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I658ab7f995cc962f4e2a204e058cdd3ad3fd735d
Change the return type from void to int so that the result of
spdk_subsystem_fini() can be reported.
Change-Id: I811c25513e41573ca0c9cb111512d7705d107f66
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Instead of polling for only 1 completion at at time,
poll for batches of 32.
Change-Id: I5ef99a270489e7b3d2a58cb765915f187775a93e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Purpose: To make the function definition style consistent
Change-Id: I7ade943881aa5076fdd419958e386ae3c3661da6
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
SPDK_NOTICELOG() can already do printf()-style formatting, so there is
no need to use snprintf() on a temporary buffer first.
Change-Id: Iffb5369b74f27fb2c4b3ac07ea0cdeab52258ba1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch aimed at avoid run out of large rbuf for read commands.
Change-Id: If10f45292da5d5a26c2e338f1ddeafccedb88a4c
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
There are some error paths that can get to spdk_iscsi_conn_stop_poller()
before the conn's session is set. So check whether the session is NULL
before trying to check its session type.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I352a2aa513541ba630ace368137433e509700e32
1 In our nvmf tgt implemention, we use the async
mode to delete the nvmf subsystem. However, when
we parse nvmf subsystem, we need to use the sync
function to delete the nvmf subsystem. Since if
there is error, we will call spdk_app_stop, thus
async functions will not be executed. It is
approved in my local test.
2 Add debug info in spdk_nvmf_delete_subsystem
Change-Id: Ia8ecd6eee1bbd25cb3e1ceeb0e2146f3f03be228
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This ensures against races, when an existing session to a target node
stalls, causing the initiator to create a new session. These new
session's connection may get migrated to a different core than the
core of the stalled session.
In practice, this does not happen, but is a common occurrence when
debugging the iSCSI target using gdb.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1864c2ca0c330dc4faeeb1312adac7a02c8281dc
This enables some future changes which will use per-thread
nvme_qpairs.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1efcacfa6aedc970656633c9ce1393dc9b4fdbcc
This breaks out the resources needed to perform
aio-based I/O into a separate data structure, as a steps
towards some future patches that will enable per-thread
resources to enable parallel I/O without synchronization.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I84b95713133f9c411863ff0aeef8f886a08e0857
While here, also break out a new ioat_poll() function which
takes the ioat_channel as a parameter. This will be reused
for some future refactoring.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9c03577e8d90d9bbd4d7adb9c186f21f54b85e82
This moves towards a single pair of functions where code can be placed
that must execute on the polling thread before the poller starts execution
and after the poller stops execution.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2df7bacaa7b173f495c41c7cc79bafae53a57729
This list was originally intended to ensure blockdev I/O operations
with a malloc backend would not be completed until after the blockdev
I/O submission routine completed. This is no longer necessary, since
blockdev I/O completion operations are now handled by events. Removing
this simplifies the memcpy copy engine implementation significantly.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4d318bed996694e49946d67baa3c2403d4bbef7a
ibv_poll_cq is actually an expensive call to make, so take
steps to begin to minimize the number of times it is called.
Change-Id: I6fc64979604220eb8cacd612b46e3a3b1bca0924
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The out-of-bounds case in the bit array accessors should not happen
normally, so help the compiler order the basic blocks correctly so that
the in-bounds case is the fallthrough path.
Change-Id: Id778e724b3a58c17c728b8544c2653c60d90a6ba
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
My previous pdu leak fixing patch breaks the
large logic for large read, and this patch
fixes this.
Change-Id: Ic3f654527f7addd4ee45aad53a752de72a84edfd
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This matches the general order (LBA start then LBA count) for
the NVMe API.
While here, fix a copy/paste error in a debug message (write
instead of writev).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ice326af5d6025867dffed4d1f6c7b81fb9eba5eb
Set status code to invalid opcode when opcode is not supported
in nvmf_process_discovery_cmd.
Change-Id: Ibab8097e536f26f16c322d5f539277688906cfc3
Signed-off-by: Liang Yan <liang.z.yan@intel.com>
Rather than forcing the NVMe library user to pass a specially-allocated
block of memory (e.g. rte_malloc() in the case of the default
nvme_impl.h), just make the NVMe library allocate a suitable buffer
itself and copy to/from the user buffer as needed.
The fast path I/O functions still require special rte_malloc()
allocations, since we don't want to add an allocation and copy to the
I/O critical path.
Change-Id: I7fe88c0ba60c859a33bbe95b7713f423c6bf1ea8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The spec does not define NQNs as case-insensitive, so replace the
strcasecmp() matching of NQNs with strcmp().
Change-Id: I5946d9ee8e1d0aa5966e9b1b3c6f14f3f5119aec
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is pdu memory leak issue. The reason is that
we did not correctly handle the read pdu task.
Change-Id: I719c87fe7825537b9c77f5ee7e0816671de4c051
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
1 Rename this function and make it more meaninful, since
we have spdk_nvmf_session_connect which is used to link a
connection to the session
2 split spdk_nvmf_session_destruct.
Change-Id: I150df7ccdf4de3428d8cecbb286d5f7944510a8c
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Fix copy-and-paste errors - when polling the recv CQ, we should print
"Recv" instead of "Send" in log messages.
Signed-off-by: Roland Dreier <roland@purestorage.com>
This can just directly assign the completion instead
of calling memcpy.
Change-Id: I07819c824eba45245b00fa3538a99bc81bcb9fcc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This function always shows up as one of the hottest functions when
profiling. I believe it is the memset that is expensive, so instead
use default initialization when the wr is declared on the stack
and just set the members that need to be updated in the function.
Also make the function inline for good measure.
Change-Id: I29e24cdd375311fa033b5a6df772ff4f73e35302
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
We need to free the session resource, if there is error
for creating a new session
Change-Id: I7c4f3e779e0b30e213e02b8676d93bd2fe9bf851
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The application is now entirely responsible for scheduling subsystem
pollers and sending events between threads.
Change-Id: I88da1f53b5e8852c7c4acd6f0a7a1e2219fbed41
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reason: In acceptor_poller_unregistered_event, we
directly call spdk_nvmf_check_pools and spdk_app_stop,
it will fail the memory check.
And function nvmf_delete_subsystem_poller_unreg will
not be called since we already call spdk_app_stop.
Change-Id: I3ffa30c87b149a66cee1d87d1bb81d4dc8cc96b9
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
flush the data in pdu to client if the pdu are ready and sequential.
Change-Id: Idf0ec0c7f6058790a85407dff324900fd36c9527
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Our SCSI translation layer only fills 4 version descriptors
meaning the last 30 bytes of the 96 byte standard inquiry
data format are not used. Some compliance tests expect
the full 96 bytes to be returned, even if they are unused.
So zero the remaining bytes (up to 96) if those bytes were
allocated.
This fixes a regression introduced by recent commit d3b58c006.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id61614b904b5dff39f034b7ba4da624be1b25bae
The translation code currently cheats a bit - it allocates a full 4KB
buffer for any DATA_IN command that is not a READ, and then the
different SCSI commands that fall into this category (INQUIRY,
READ_CAPACITY, MODE_SENSE, etc.) can write as much data as they
want without having to worry about a buffer overrun. Code higher
up the stack makes sure we only send the correct amount of data back
to the iSCSI initiator.
This patch fixes this behavior for standard INQUIRY (EVPD = 0).
Future patches will fix the behavior for other non-READ DATA_IN
commands, at which point we can remove the 4KB allocation and
only allocate the amount of data specified in the CDB.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If5e4a10eeba9851e2d91cab71228d2fc2d5baad0
The "+" is not correct, should be "-". Currently,
the issue doest not happen since the offset is 0,
then both + and - is OK. But if we adjust the location
of spdk_nvmf_conn or spdk_nvmf_request, we can find
this bug.
Change-Id: Ib358dc729da901a69442d0402a6089989f49b05c
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The table of bdev function pointers should not need to be modified at
runtime.
Change-Id: I3e8876fc83df9296ce528231269b1a905c96072c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If a bdev doesn't need to be polled, allow it to specify NULL for the
check_io function pointer to indicate that no poller needs to be
registered.
This will be useful for virtual blockdevs that don't have any associated
hardware to poll.
Change-Id: I0ef8f848587b0c200296805ccc710340dde683b5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
When an I/O with children is being freed, also free its child I/O
requests that were allocated via spdk_bdev_get_child_io().
Change-Id: I2d44aed845c1035ae8f8cb07c5992da855f1dc99
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This callback was only used for freeing buffers, but the buffers are now
managed by the bdev core, so none of the free_request callbacks actually
do anything.
Change-Id: Icfe2e6169e829159dda5e3d75a27d8f040de07c6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add unmap support to the ramdisk block device for testing purposes.
Change-Id: Ibeb5530b2b5a31603d09d2d1de07760f32dea0f8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The bdev layer can be used independently of iSCSI, so fix the
misleading names.
Change-Id: I3fd5b113403acdd7578ce93234dde0fd4f148e96
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Check that the number of blocks/ranges in the command fits within the
length specified by the SGL.
Change-Id: I21aded797dc1f1e752fe0bc9cec27310a4fb106a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The Dataset Management command allows several operations to be specified
at once; the virtual controller only supports deallocate for now, but it
should just ignore the other bits in order to be spec compliant: "If the
Dataset Management command is supported, all combinations of attributes
[...] may be set".
The spec also explicitly states that it is acceptable for controllers to
choose to take no action based on information provided, so not
implementing the other attributes is fine.
Change-Id: Ia989dc1faa9c852660bf1299ea18fa8e7bdf4053
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Also add a diagnostic message if the requested log page ID is not
supported.
Change-Id: I7551b5905d5ebc29356839f0f9153dc86f237106
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rather than comparing the bdev name against "NVMe", use the new I/O type
supported API to query whether the unmap operation is supported.
Change-Id: I62c7a1ea5529366ff2ae4723b62f24ea78aa8193
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Bdev modules need a separate interface than public
consumers of the blockdevs.
Change-Id: I581ee493570c114f7e96b31a425bc077a791c71e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This compilation unit depends on bdev.h definitions, but
was only getting them due to #include ordering elsewhere.
Change-Id: I4fcbdb2582a40836bcabc3539cc558614fbfacfd
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Some block devices do not support the unmap operation, and we may add
other optional I/O types in the future. Add a method to check which I/O
types a specific block device supports.
Change-Id: I6e6414bf6b6482ea0224022d8326b252bd363c7f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Switch from the non-portable <sys/endian.h> functions (htobeXX/beXXtoh)
to the SPDK endian conversion functions.
Change-Id: Id49b87f2e536c68f0d5d567e78e1990c0a37ef14
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Intel DC P3*** NVMe devices specify a desired stripe size, which was
used for splitting I/O. Not all devices, however, specify a desired
stripe size (such as the Intel DC D3*** line), and for only these
devices there was a logic mistake that overwrote the maximum I/O
size with a 2MB default. This patch corrects that error.
Change-Id: I94b72a3a3dd1dfa18bd638daf7e01a592eb6ed17
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Move the NQN validation into the subsytem creation function, and fix the
allowed size to match the spec.
The spec is not clear about the allowed NQN size; for now, interpret it
as 223 bytes, including the null terminator (222 bytes of actual NQN
plus one terminator byte).
Change-Id: If9743ab2fe009d9d852e8b03317d9b38d8af18dc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
DPDK 16.07 introduced a new PCI ID field for matching by class code
instead of vendor/device ID. Use it to match all NVMe devices instead of
explicitly listing vendor and device ID pairs.
Change-Id: Ib2a5cc6833bf2b793d37d77caab97207f365df8f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
SUBNQN is a UTF-8 null terminated string according to the NVMe base
spec, so pad it with zeroes using strncpy().
Change-Id: I486161b26d91f3ea1fd17428e220b9f20a874732
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These are specified as "ASCII string", which means they should be
left-aligned and padded with spaces, according to the NVMe base
specification.
Change-Id: I25babe0ca417c2e16137b0bfc41fc7834277114e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This will be useful outside of the SCSI code, so put it in the common
string utility file.
Also reorder the parameters so they match the order used in strncpy().
Change-Id: I9e25a59b64e4bedf04e5a96de463b1d8aa0ddac3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Clean up the poller and only then free the associated subsystem's
memory. This prepares for future dynamic subsystem creation/deletion.
Change-Id: I9e56cbf8822814930fdbb662095c51b6ad40fbc4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently the NVMf target listens for new connections on any address.
Instead, listen only on the addresses specified by the user.
Change-Id: Idb6d37c422e442fc70a8673bd3fcfb9c27b57828
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
assert is part of the C standard library and is available
on any platform we'd consider porting to. Don't put a
wrapper around it.
Change-Id: I0acfdd6a8a269d6c37df38fb7ddf4f1227630223
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
pthreads are widely supported and are available on any
platform we currently foresee porting to. Use that API
instead of attempting to abstract it away to simplify
the code.
Change-Id: I822f9c10910020719e94cce6fca4e1600a2d9f2a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
pthreads are widely supported and are available on any
platform we currently foresee porting to. Use that API
instead of attempting to abstract it away to simplify
the code.
Change-Id: I28123d427ea8da07c6329b0233f0702f2d85c2a0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Comments are not allowed in the JSON RFC, but some JSON libraries accept
JavaScript-style comments.
Add a flag that enables non-spec-compliant comment parsing.
Change-Id: I9dfb66bb46ecff1a22d8af5a9c50620686a4707c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the event framework's new delay parameter to allow
for idle cores to sleep for up to 1ms at a time.
Change-Id: I665f38e590c07338418892afe0e75b0b2c79706e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
It is no longer needed, since the nvmf_tgt app handles initialization
and shutdown.
Change-Id: I051afe2b4fcbd09b32998386c63f591a0ab343c2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The user can now specify a maximum delay, in microseconds, that
defines the maximum amount of time a reactor will sleep for
between polling for new events. By default, the time is 0
which means the reactor will never sleep.
Change-Id: I94cddb69c832524878cad97b66673daa4bd5c721
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This will be used in future patches outside the library.
Change-Id: I1fcf5709944a884e161e5a6a9eaec033a995a812
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The NVMe over Fabrics target library now exposes a simple function call
that polls the acceptor once, and the application handles registration
of the poller.
Also rename the transport function pointers related to the acceptor so
they better reflect their purpose.
Change-Id: I5fa0d516586bf17e73afeb88ff3c2d5b0d46794d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This will become more important when other transports are added.
For now, it is also useful to be able to start nvmf_tgt on systems
without RDMA hardware.
Change-Id: I6b9002cc7711f928c4e6b73adcd9b677349ebdd6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
spdk_shutdown_nvmf_subsystems() was removing the subsystem from the
list, but nvmf_delete_subsystem() also wants to remove it, so drop the
extra removal.
Also rewrite the shutdown loop as a TAILQ_FOREACH_SAFE() to make the
static analyzer happy (and make it more obvious that the loop will
terminate).
Change-Id: Iccadafa77d9cd3e26be21c0f11e62cfc1ef0197c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Verify that the record format is the one we support (only 0 is defined
by the spec for now).
Change-Id: Iddf038b381e540134abf572e0545c97a0ef71d5f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The spec requires that NQNs are null terminated and maximum of 223 bytes
long, despite the Connect command fields being larger (256 bytes), so
add checks for both subsystem NQN and host NQN before using them as null
terminated strings.
Change-Id: I343d9e44a09ab4d0f6654feba460b31e976c4e56
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Since we bind the NVMe device to UIO driver to protect against native
NVMe driver, but for Admin queue, there are still INTx interrupts
exist, as all the completion for Admin queue will be processed in
user space, so we don't need INTx anymore.
Change-Id: Ife5b3e410ae95690ed0f3f9a2f2dfaf55a7797b5
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Users can specify the core for each subsystem and the acceptor listen routine
to run on different cores for performance consideration.
Change-Id: I4bd1a96f39194c870863b4b778e6ea7cf8fc1a2d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
This is causing issues during shutdown because the poller removal is not
synchronized with the rest of the cleanup path.
This reverts commit 7dfc5e922d.
Change-Id: If95c4b72c5d120f18bdc3db6d7d532ad1aada642
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The lcore_id field in the get_iscsi_connections RPC was removed in
commit 5d8c94536a7d1d4c1f0ee3349188bf0e7e8c9e74; add a field to
spdk_iscsi_conn to track the lcore so this can be re-added.
Change-Id: I6c9574829466b168880728f4620401987fc7dd3c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This should enhance performance, since the hardware admin queue poll
function takes a mutex and should not be in the performance path.
Change-Id: I7e4acde0337aaf7079811612cba5348acf0a467d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This leaves more flexibility for future changes to the poller
representation without requiring API changes (after this one).
It also prevents the user from accidentally using poller fields in a
non-thread-safe way, since they can't be accessed directly anymore.
Change-Id: I7677d5b93668665d29ae39c5e0ba74333ad3f878
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Replace other critical rte_zmalloc() sites that actually depend on the
memory being zeroed.
Change-Id: If6856ad44a4c50869811d3ce9411c993ce88018d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Linux block layer driver will use the maximum transfer length field to
split IOs larger than this value. We should set the field according to
iSCSI target limitation.
Change-Id: I03ee35bb96f0949418bb976a6c8013f88622a324
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Allow the tables to be in the read-only data section.
Change-Id: I58199a86d4d44dbad7baed397b2e148c45b3a3de
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
rte_zmalloc() is broken and does not actually return zeroed memory on at
least DPDK 16.07 on FreeBSD, so do it ourselves.
Change-Id: If8da93ead0b3911c8bca24aa27ed90dc00b8a9a4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
For VPD page 0xB1 and 0xB2, the scsi target did not return correct
value to the initiator, so return the length with correct value.
Change-Id: Ic17d804ca00d490fd6a2f833db5c9b73ce8dc160
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
This value was incremented and decremented, but it was never used
otherwise.
Change-Id: I6e83a504cf2ef4043363ca04b77556c612068658
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
In files that don't otherwise use DPDK, switch to the standard C library
assert().
Change-Id: I79756908ecf9a2e141b036321e42309db30b5e0f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The NVMe submission queue head wraparound point can be determined in the
generic NVMe over Fabrics layer; it should not be using the RDMA
connection queue depth.
Change-Id: I9da8f09e4f057f8fdc1ff4c6cc5f48cea7123e11
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Report the maximum admin queue size correctly.
Change-Id: I52cad654bf59806e0abb8d869c22973647056617
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the max_queue_depth parameter rather than rdma_conn->max_queue_depth
so that we can start to eliminate rdma_conn->max_queue_depth.
Change-Id: I1670c634e6d12aa004fb5a10338b7624850fbc4a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There were two unchecked allocations in the nvmf library. Check
for allocation failures.
Change-Id: Ic6b3104d825dba1ee6bd1748fa99e132702f300c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This fixes a static analysis warning for unsigned/signed
mismatch.
Change-Id: I49bd8d6d195f13b402e14a85503a5de6114f5b7f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is the size of a logical block in bytes; 4 GB is more than plenty.
Also allows cleaning up casts to uint32_t in the SCSI translation layer.
Change-Id: I3ec2e2f41fd378f1a83f31aac25c46ef780f63e9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The large buffer pool allocation was using the per-connection queue
depth, whereas the RDMA memory region registration was using the global
RDMA max queue depth. These sizes need to match, so use the global RDMA
max queue depth for both calls.
Change-Id: Iae161b719e09e19ca3e81df6593b68a4a2e86614
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is a step towards enabling sharing SPDK NVMe
device access from multiple processes using DPDK's
multi-process framework.
Change-Id: I57d5eec158b42addc1036bd2583596471a467a95
Signed-off-by: GangCao <gang.cao@intel.com>
Similar to our NVMf target, this is an iSCSI target that
can interoperate with the Linux and Windows standard iSCSI
initiators.
Change-Id: I6961c5ef99f7b161c396330ed5b543ea29b0ca7b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is a useful abstraction when you want to plug in
a userspace networking layer instead of using the kernel.
Change-Id: I7039d2987e6abad9dcd1987fa105282b1598e2f5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The public header file was missing some required definitions.
Change-Id: Ic4f8028367b1e21ea00c02660ca36be28da54e37
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Use the new timer-based poller functionality to replace rte_timer.
Change-Id: Ic40653306cc73b40139fe18e06bab29b35721a43
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Allow pollers to be scheduled to be run periodically every N
microseconds instead of every iteration of the reactor loop.
Change-Id: Iaea3e98965d81044e6dc5ce5f406bcb7a455289e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Just getting a reference to a bdev should not claim it.
Change-Id: I21e07160662490ec95b52fa31ea1d2ae93a21f09
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Combine the necessary functionality with the main bdev file.
Change-Id: I96d796bc87ac2a8688cdf1fd3c16d2a7c8aef730
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The rte_ring used for pollers is already single-producer and
single-consumer, so it is not providing any thread safety guarantees.
ALl modifications to the active_pollers ring are done from the core that
is running the reactor (via events). This means the rte_ring can be
replaced with a simpler intrusive linked list.
This simplifies the removal of pollers in the middle of the list and
avoids extra allocations for the ring.
Change-Id: Ica149b7a1668a8af1e6ca8f741c48f2217f6f9bf
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
We reported virtualized NVMe devices through NVMe over Fabric specification,
with 1.2.1 NVMe version. For direct mode, the NVMe device maybe has lower
version, such as 1.0, the identify namespace list can not support in those
devices, so we need to add helper function here to simulate such commands
from initiator.
Change-Id: I226f4f34bf61017f538d2dd80332f1d054a501f1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Allow higher queue depths by allowing many more send/recv
operations than read/write.
Change-Id: I66c424a6463e5e09be6d5463667241ce9271404b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The target can only provide updates to sq_head inside
of completions. Therefore, we must update sq_head prior
to sending the completion or we'll incorrectly get into
queue full scenarios.
Change-Id: If2925d39570bbc247801219f352e690d33132a2d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This allows the target to poll for internal completions
at higher priority.
Change-Id: I895c33a594a7d7c0545aa3a8405a296be3c106fb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This ensures that the data buffers are not in use
when we go to send the completion.
Change-Id: I30467b3e3964001150f81b21e5b695dcd0974b0c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is useful for holding session-wide buffer pools.
Change-Id: I7024da24b210a2205bf1e159d5935e0093b81120
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
For small SGLs, even if they are keyed and not inline, use the
buffer we allocated for inline data.
Change-Id: I5051c43aabacb20a4247b2feaf2af801dba5f5a9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Read/Write depth is much lower than Send/Recv depth.
Calculate them separately to prepare for supporting
a larger number of receives than read/writes.
Currently, the target still only exposes a queue depth
equal to the read/write depth.
Change-Id: I08a7434d4ace8d696ae7e1eee241047004de7cc5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These don't actually work quite yet, but pipe the
configuration file data through to where it will
be needed.
Change-Id: I95512d718d45b936fa85c03c0b80689ce3c866bc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
For each connection, allocate a single buffer each
of requests, inline data buffers, commands, and
completions.
Change-Id: Ie235a3c0c37a3242831311fa595c8135813ae49e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This can be used to release requests that don't
require a completion to be sent.
Change-Id: I8fb932ea8569bf3c45342d9fa4e270af5510c60c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
PORT IDs indicate hardware failure domains according
to the NVMf specification, which means they should
indicate which transport addresses are on the same
NIC. Unfortunately, that doesn't really make sense for
IP-based fabrics because IP addresses can move. The
safest way to present this is to show all IP addresses
as part of different subsystem ports.
Change-Id: I056a50c69be70b4fbf1f896e684ce65bd792241e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The NVMe over Fabrics 1.0 spec corresponds to the NVMe base spec version
1.2.1, so we should pretend to be at least that new.
Change-Id: I36fc44c780de01d6c666e87b803cd47dba0e74c5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These belong in nvme_spec.h anyway and are not used.
Change-Id: I889dfebee523dc5ae503fd0370bb800f1d17fb5d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is a leftover from a previous controller numbering scheme that is
no longer used.
Change-Id: I3058802f0324b0e38708111634ee993c6e884087
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move the ctrlr and io_qpair out of spdk_nvmf_subsystem, package them
as a new data structure. Union the direct and virtual mode namespaces.
Change-Id: I839aee3372c6c57aa03a0be76f8aaeb5045ecdaf
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
CAP.CQR indicates whether contiguous queues are required; this is
meaningless in NVMe over Fabrics, since queue creation is handled
implicitly for each connection, but the spec requires it to be set to 1.
Change-Id: I6b05954eefa6928beecd7a640bbbdbd835c6b69a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the size of the applicable structs directly.
Change-Id: I4a65de548d409c9962b11a75d3fde2bfe434a3ec
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvmf_create_subsystem() already copies the name, so the strdup() in the
caller is unnecessary.
Change-Id: I225f0f077fee30051b197a4b1d7276b113ec6b01
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It isn't actually necessary to drain the cq before
destroying it.
Change-Id: I6f77ae578176a14b5de935274a14cfd165229ec5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This logically belongs inside the session handling code, not
in the transport-specific layer.
Change-Id: I93b2271f38dbfc742162c98c40acb153c7e9022a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Track and print out the currently outstanding I/O in debug
mode with rdma tracing enabled.
Change-Id: I0a1f0cd6e22dbf21e18ca0ec7d0c2c6d194509e3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Instead of reimplementing handling for checking the
completion queue, nvmf_rdma_accept can now call
the general purpose poller.
Change-Id: Id2c899d1e500a8cb8491e51cc101a1bf0e167764
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
AER breaks our current model of requests/completion pairs.
Temporarily handle it by immediately re-posting the
capsule while we work on a real solution.
Change-Id: Ie7a4d88030b6fff5a11c4697eec0f024f9737f27
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Inline this code into the places that called it. These two
spots will be combined into a single path in a later patch.
Change-Id: Ice2f009ad56b783dc28ebbf1abbb877ce6000293
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is an RDMA-specific operation, so hide it inside
the transport-specific layer.
Change-Id: Iaa097e8dde78d820547b3a39e9717c992581340b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These can be done at the same time now that the queue depth
is known ahead of time.
Change-Id: I7ecef30ebb4311e0a1c88f37461d34534f8600bf
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Calculate queue depth into a local variable without
touching the rdma_conn.
Change-Id: Ie804ed39ddecbf59015a4e4f7aa127f1381d9080
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Make sure the trace history that is exported via shared memory is always
the same size, regardless of DPDK configuration.
Also removes the necessity of including DPDK headers from spdk/trace.h
(so we have to fix up other files to include what they use).
Change-Id: I32f88921fd95c64a9d1f4ba768ae75e2ca5d91da
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is not currently configurable, but this will allow us to make the
discovery subsystem have config options (e.g. which lcore to run on).
Change-Id: I788a64ba4462b023453191e509ce8de59fd90ae4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is a much simpler approach and is only slightly
less efficient.
Change-Id: I909de376d576a74156c1be447e90e7dbc240f025
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Drop the redundant controller ready check.
nvmf_process_io_cmd() was checking CSTS.RDY, but this is not necessary,
since its only caller, spdk_nvmf_request_exec(), is already checking
CC.EN, which always matches RDY in our virtual controller
implementation.
The initialization of status is a dead store -
nvmf_complete_cmd() always writes the full response, and the only other
branch is the return immediately below the call, which also sets status.
Change-Id: I1ec2b8a225a91c4b2997d8ab4f45d050cc216de3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
No reason to use DPDK in this file just for an equivalent to assert().
Change-Id: Ic6932a16d0a36cd1a3cb25c8cc5e295c59f3e2db
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Temporarily set the in-capsule data size to the maximum data transfer
length. This should actually be updated by the transport layer, but for
now, the only transport (RDMA) supports the full bounce buffer size.
Also drop the check that prevents admin connections from using
in-capsule data; the host may send in-capsule data for the Connect on an
I/O queue, and we don't know the type of connection until after Connect
is processed.
Fixes: 828dca7 ("nvmf: Move some stray session init code to the right place")
Change-Id: I369ee5497247d7e875ad0b6f0aaf6c47c1d3887c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make sure no response fields are left over from the previous command in
the spdk_nvmf_request.
Change-Id: I42937e991d9dd6550fd4bc9b6d0dd66b44c6b83e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The kernel driver unloading/loading code is Linux specific; replace it
with stubs on FreeBSD for now.
Change-Id: Ic67c1d89b2fb9a65e9ce5b88d27b6cd6af5554a7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
spdk_nvmf_request_complete() always sets CID to the value in the
command, so there is no need to set it in the command execution
functions.
Change-Id: Ibbe745b862e27fff7c55e553758ef093e3ef7f6d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the passthrough command for all Identify commands except Identify
Controller.
Also only check the CNS field of CDW10 and use the new enumerated names
instead of magic numbers.
Change-Id: Ia94f820ac85a2d6b2d0ae02659e73c53f1b1a4cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Drop the special-case preprocessor definition for PCI access library now
that config.h is available with an equivalent SPDK_CONFIG_PCIACCESS
define.
Change-Id: I4891d0f2fd7d3eea51b767df9e594555b36265ea
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If we connected a subsystem twice from the initiator, the second
connection will be rejected by the NVMf target, however, the previous
connection will also be impacted because we destroy the connection id
before ack the disconnect event.
Change-Id: Ib597cc68a7823524460693053898f4d6e5499eb4
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
There is no need to handle Read and Write commands separately; the
generic raw I/O command case can handle them just as well.
Change-Id: I8475eed0a20bd809c447ed2ccac0b99f6c2a9b4d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Replace use of the newly-deprecated rte_mempool_count() with the new
name, rte_mempool_avail_count().
Also add a compatibility wrapper so that builds against older DPDK
versions still work.
Change-Id: If3c44bdef4bbcf7a456a1dfa272348ccc6f35261
Reported-by: Jay Sternberg <jay.e.sternberg@intel.com>
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The host is not allowed to send normal admin or I/O commands until the
controller is enabled (via the Fabric Property Set command).
Change-Id: Ib62be3a3792fc0b36bace28b4c9afdf78dad3bcd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Only allow Connect on a new connection (one that has no associated
session yet), and only allow Propert Set/Get on admin queues.
Change-Id: Iae22379ee47b095333372e6d151a7a1509acf654
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The NVMe spec requires that the I/O queue entry size values in CC are
set before any I/O queues may be created.
Change-Id: I4f0c9a9c20411223d281993745c85a8431197961
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Track each individual bit in the Set Property handler for CC, and fail
the request if any unhandled bits are modified.
Also add handlers for IOSQES and IOCQES (I/O submission and completion
queue entry size).
Change-Id: I374dc3c15197e029ba07fd9ee1cff0e38a0a884d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is not implemented yet, but add a message to remind us to write it
later.
Change-Id: Ic1c35a0d35f728bc63b38c334d9c622493bee967
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Property Set of CC.SHN is not supposed to terminate the session - remove
the commented-out code that was attempting to do this.
Change-Id: I1db230df9be549764287a8fd45ccdebea1d22a8b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Set CSTS.SHST = 10b to indicate that shutdown is complete, and
CSTS.RDY = 0 to match the state of CC.EN.
Change-Id: Ia651c34427526a38f22cba3910df2cf7d4bedd92
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Explicitly include spdk.common.mk at the top of all lib Makefiles so
that CONFIG options and other predefined variables are set.
Change-Id: I1e560c294fe8242602e45191a280f4295533ae44
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There is no need to allocate ibv_sge structures within the RDMA request;
we can just fill them out on the stack right before submitting each
request.
Change-Id: I438ff0be2f6d07ffa933255c92c4ec964aa1b235
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Just return success or failure - the actual count was not used.
Change-Id: I26e7c4c6319af444d221d9b0f313fb7071733619
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
All of the WC events that we handle map back to a request, so look it up
before checking the opcode.
Change-Id: I1b70a773374f64387df0a21a4f7fd64b26534b14
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make sure all tracelogs in rdma.c use SPDK_TRACE_RDMA.
Change-Id: Idc3d3b6654215b5ab3ee84a106e46ffd3019cc7a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These NVMf spec structure definitions are the same as the equivalent
NVMe structs.
Change-Id: I21c45973b7843e3767c48f97ec42e7b446df296f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There was only one function and a structure declaration
left.
Change-Id: I63277b4182120e7a76a925ed0bf7378ec7c23f20
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These can be simplified and merged into the subsystem.
Remove the concept of mappings from subsystems and replace
it with a list of hosts and ports. The host is optional -
not specifying a host means any host can connect.
Change-Id: Ib3786acb40a34b7e10935af55f4b6756d40cc906
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Make the transport responsible for filling out the fabric-specific
details in the discovery log entry.
Change-Id: I41d871c605becd557dca18f8ef7e80da66950257
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make the core NVMf to transport interface generic and allow for multiple
transport types to be registered.
Change-Id: I0a2767a47d55999c45f788ae1318bb50af60ab4e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change the Port configuration file entries to a new format:
[Port1]
Listen <transport> <address>:<service>
Initially, this still only supports RDMA, but the new format will allow
specifying other transports once they are added.
Change-Id: Iadfd19b91db57b571064379368dbe77204ccecbb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>