It's not the whole transport - it's just an enum for the
type of transport.
Change-Id: Ia435a21792f221ddf50ddf4f0923c6152622eccb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
When we enabled the ERL1 configuation, for the DATAIN task release
process, we will queue the task to the SNACK list firstly, and then
remove the list when got ACK from initiator, but for this part of
logic, the reference count of primary task was not released correctly.
Change-Id: Ic5959cf644c74f676be0b84c5650292dc426b2d8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Change it according to the spec thus we can test
kernel nvmf target
Change-Id: Ica98dd40503a40c0f0de8efaefb1f6f67a89cde8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change the PCI enumeration API to individual functions per device type
so that only the drivers that are actually in use get linked into the
final executable. All of the common code is still shared internally in
the env_dpdk library.
Change-Id: I2ba83afe59202a510f999a0674e23e60b6581221
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is not necessary, and it prevents the linker from removing unused
object files.
Fix the iscsi_tgt Makefile's library order so that env is added at the
end after the libraries that use it.
Change-Id: I241eb46703c12691444037a350be65143259e82e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add the following infromation.
- PCI Address
- Vendor ID
- Model Number
- Serial Number
- Firmware Revision
- NVMe spec version
- Namespace sector size
- Namespace total size
The user's remove_cb should detach the NVMe controller when it can
ensure that it is no longer in use. In the interim (between remove_cb
and spdk_nvme_detach()), the controller will remain in a failed state,
so any new I/O submissions will return an error code but not crash.
examples/nvme/hotplug is not yet updated for this change, but that will
be done in a separate patch.
Change-Id: I8827ba36f9688ccb734e7871f20f11ec11e88f96
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
While we're here, fix up typos and add error logs for all error exits
in nvme_rdma_qpair_connect().
Change-Id: I236fe6571c2012ca047aa8a447638d9227454c2f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This version of multi-process support needs to have DPDK 16.11 builtin.
Change-Id: I3352944516f327800b4bd640347afc6127d82ed4
Signed-off-by: GangCao <gang.cao@intel.com>
The discover and probe 'nqn' fields are subsystem NQNs, so name them
subnqn to be consistent with the spec and the rest of the code and to
distinguish them from host NQNs.
Change-Id: I4a80fbc1f4b037c8a4f91c8f28d2a96e47c66c47
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Allow the host NQN to be overriden when connecting to NVMe over Fabrics
controllers.
Change-Id: I8fcf2e89ae7d9722677e834f76a8fe805c52f91b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This makes the function and file/line info actually useful (instead of
pointing to the helper function itself).
Change-Id: I22bac68827115880a49d456706a7eaecdc12e9b5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Each transport should handle its own qpair cleanup internally.
Change-Id: I7dd737be820ea6bad686f4aad7d74044fad58a47
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Let the transport access the controller options during
ctrlr_construct().
Change-Id: I83590c111e75c843685dd9315f0f08416168356d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvme_rdma_req_get() is an internal function, and its only caller already
checks for a valid rqpair, so the NULL check is unnecessary.
Also clean up the redundant STAILQ_EMPTY/STAILQ_FIRST logic and use
STAILQ_REMOVE_HEAD.
Change-Id: Ic3828e8b5e881879173cb59350e39c5fac90e6ef
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvme_rdma_pre_copy_mem() does not have any failure cases, so remove its
return value and remove the never-taken branch in its only caller,
nvme_rdma_qpair_submit_request().
Change-Id: I91011734ed0c20f8db691d62172fe1a3021dd3a1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvme_rdma_req_put() is an internal nvme_rdma.c function, and all of the
callers already have the rqpair, so pass it directly. We also already
verify that all of the callers have a valid rqpair and req before
calling nvme_rdma_req_put(), so it doesn't need to check for NULL
pointers.
This also means that spdk_nvme_rdma_req doesn't need to hold a pointer
to its rqpair anymore.
Change-Id: I893a46a9074f0a843e379d10c123f9292eb3b1a4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The only place where outstanding_reqs was checked was in
nvme_rdma_req_put(), but the error case there could only happen if some
kind of internal programming error occurred (e.g. calling
nvme_rdma_req_put() on an invalid request).
Change-Id: I71e40ce562a8720dfaf70437ffd4c6493327c091
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvme_rdma_ibv_send_wr_init() was only called in one place, so just move
its contents into nvme_rdma_qpair_submit_request() since it allows
simplification of the code:
- req was always NULL, so remove the code that used req entirely.
- wr and sg_list are never NULL, so remove the checks for those.
Change-Id: I12a4f3502219d3681607686945e343f6808c0d2f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
We currently don't handle discovery service referrals, so skip those, as
well as any other unknown subsystem type.
Change-Id: I64f889e9272fb57b5cf9bb5467b3abca3955baf5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
QEMU's virtual NVMe controller device does not support the AER Set
Feature, so ignore its failure and continue.
Change-Id: I8b5c217a3112edabb6f76ec3e5f4ef774981a1d7
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Catch SIGBUS and handle it by remapping new memory into the
location where the BAR previously was.
Change-Id: Ie8d00a60a0bbe7f7ec57a5c39c0a63c5d9443206
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
These functions will attach or detach from a PCI device. Attaching
typically means mapping the BAR.
Change-Id: Iaaf59010b8a0366d32ec80bb90c1c277ada7cfe7
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
This patch make sure the connection in normal state before any further
operation on this connection.
Change-Id: I776740b5b33b1de6707990c09d9131c385adf556
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
spdk_nvme_probe frees ctrlr when nvme_ctrlr_process_init is failed. But
ctrlr has already been freed while calling nvme_ctrlr_destruct. So
spdk_nvme_probe doen't need to free ctrlr.
The generic NVMe library controller initialization process already
handles enabling the controller; the RDMA transport should not need to
set EN itself.
For now, the discovery controller is cheating and not using the normal
initialization process, so move the EN = 1 hack to the discovery
controller bringup until it is overhauled to use the full
nvme_ctrlr_process_init() path.
The previous code where CC.EN was set to 1 before going through the
controller init process would cause an EN = 1 to EN = 0 transition,
which triggers a controller level reset.
This change stops us from causing a reset during the controller
startup sequence, which is defined by the NVMe over Fabrics spec as
terminating the host/controller association (breaking the connection).
Our NVMe over Fabrics target does not yet implement this correctly, but
we should still do the right thing in preparation for a full reset
implementation.
This patch also reverts the NVMe over Fabrics target reset
handling hack that was added as part of the NVMe over Fabrics host
commit to its previous state of just printing an error message.
Change-Id: I0aedd73dfd2dd1168e7b13b79575cc387737d4f0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Most of the NOTICE level messages should have been TRACE.
Change-Id: Icbc4d398ab2580cf3a2349be11441b7a09603020
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Verify that qpair is not NULL before doing pointer math on it.
The NULL check after calling nvme_rdma_qpair(qpair) would not
trigger if qpair was NULL.
Fixes a crash if the Connect command failed, causing
nvme_rdma_ctrlr_create_qpair() to return NULL.
Change-Id: I158a5b1752892a7d5a72a9ac20c0c5b2cd781a81
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows us to print better error messages when connecting to a
subsystem that exists but does not allow a specific host.
Additionally, we can now return the correct error code for a host that
is not allowed.
Change-Id: I16cd4ac2745cf50bb54601b464b0d23954f86fda
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Official installs of DPDK place headers in a 'dpdk'
subdirectory under include, so detect that.
Change-Id: If64421c84c91cae31688994484c22fce398dc622
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The status.done flag polled by nvme_ctrlr_set_keep_alive_timeout()
was never initialized.
Change-Id: I323fae5f4ce12209a9699965ce07894bc3c6205a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The code in virtual.c and direct.c was identical - move it to session.c
to share it.
Change-Id: Ic6e4e9238e8ffacb212e76293c440109aa839f8c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move the current Virtual mode implementation to session.c and use it for
Direct as well.
Change-Id: I3f0ac93b4247b93d158b0dcb77e257b4b91be129
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Store the host identifier from the Connect command and report it via Get
Features.
Change-Id: I79bc27e05c5944549e7986aadb919c19748e7474
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Also return Invalid Field rather than Invalid Opcode to be more
accurate. The spec doesn't seem to define any more specific error code
for this case.
Change-Id: I992c6cca3020ff80b8495c71170222bc75316800
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
None of the log pages are actually implemented yet, but at the very
least, we don't want to leak random bits of uninitialized data.
Change-Id: Ic889260eb18d49122f2f250b645bdc5be3561dc5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the NVMe over Fabrics spec definitions for TRTYPE rather than the
internal library transport type.
Change-Id: Idead559a8f8d95274fc580d10e82033822e6eda8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These need to be available for the lifetime of the probe_info structure,
so they can't be pointing at e.g. temporary buffers on the stack.
Change-Id: I5aaa898acf9314aab51600dd756f966965d37fd0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
-Wformat-nonliteral needs to be disabled since clang triggers it on the
call to vsnprintf() now that it is nested two calls deep.
Change-Id: I228b9d099cfc2b65181941cbb4798b7f8eae3baa
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is a counterpart to spdk_strcpy_pad() which determines the length
of a string in a fixed-size buffer that may be right-padded with a
specific character.
Change-Id: I2dab8d218ee9d55f7c264daa3956c2752d9fc7f7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It always points to the same internal RDMA request complete function, so
just call that function directly.
Change-Id: Ic1fb6236bf43eaad62413df77d43be9ab855e5c7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
We can't transfer more than the bounce buffer in a single command, so
report that rather than some bogus value.
Change-Id: I39b147916dcc2ee478470917298763a239a6a35a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Record the user-provided asynchronous event configuration set via Set
Features, and return it in Get Features.
This value is not actually used, since AER is not implemented yet in the
virtual controller model, but it at least implements the mandatory
Set/Get Features.
This allows the hack in the NVMe host code that ignored the Set Features
failure to be reverted.
Change-Id: I2ac639eb8b069ef8e87230a21fa77225f32aedde
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Fill in the cached copy of CAP in the generic NVMe controller to match
the PCIe transport.
This is not really early enough, since CAP is used during the reset
process to determine the reset timeout, but that will have to be fixed
separately by rearranging some of the transport callbacks.
Change-Id: Ia8e20dbb8f21c2871afb9e00db56d0730e597331
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make sure the entire NQN field is zero-padded, rather than using
strlen() on the input.
Change-Id: Icee68bd033feed057813beeb30cec102ed90840e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This fixes a compiler warning about unhandled enum cases in a switch.
Change-Id: Icecb56b47a05c13f390f03b877f8eae243b481a6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
- add SPDK_NVME_OPC_KEEP_ALIVE to admin_opcode
- add SPDK_NVME_SC_INVALID_SGL_OFFSET, SPDK_NVME_SC_INVALID_SGL_OFFSET,
SPDK_NVME_SC_HOSTID_INCONSISTENT_FORMAT, SPDK_NVME_SC_KEEP_ALIVE_EXPIRED
and SPDK_NVME_SC_KEEP_ALIVE_INVALID to generic_status
Make it easier to use SPDK libraries by putting them all in a single
directory that can be added with -L rather than scattered around the
source tree.
Change-Id: I5c0f5dd6e7058b5f92fa9bc41548190ffc064761
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Track the maximum copy task size as modules are registered rather than
recalculating it every time spdk_copy_task_size() is called.
Change-Id: I141aca61e7075402dac41915080d1b43faee32ce
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make the public API clearer - if the user wants to allocate a
spdk_copy_task directly, they need to allocate spdk_copy_task_size()
bytes.
Also change the return type to size_t for consistency.
Change-Id: I0f3757056757c510421d680c5b4532edd9bc2561
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The SPDK_TRACELOG macro depends on a CONFIG setting (DEBUG), so it
should not be part of the public API.
Create a new include/spdk_internal directory for headers that should
only be used within SPDK, not exported for public use.
Change-Id: I39b90ce57da3270e735ba32210c4b3a3468c460b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Remove usage of the conf structs so they can be moved out of the public
API header.
Change-Id: I1c7375ec7708b323f50af09aeb7b2b2c9c770df4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Considering the process can be terminated in the cases like ctrl+c,
kill command or memory fault, the ref is tracked in the per process
structure spdk_nvme_controller_process and whenever there is other
process attaches or detaches the controller, a scan will be issued
to cleanup those unexpectedly exited processes.
Change-Id: Ib4f974f567a865748d42da4ead49edd383dfc752
Signed-off-by: GangCao <gang.cao@intel.com>
These APIs can be used to register/unregister regions
of pinned, huge page memory that are separate from
huge page memory allocated by the default DPDK
allocations. These APIs will be used by an upcoming
SPDK vhost-scsi target to enable SPDK to target
NVMe DMA operations directly to VM memory that has
been allocated by QEMU using pinned huge pages.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I649a4adeeb758b29bd29cd42c8872eed3d5d6ce9
Now that the env PCI framework already requires enumerating devices
based on an enum of specific device types, it is not useful to query the
class code of a PCI device handle.
It is currently unused and does not work in its current form on FreeBSD
(it reads a file from /sys). This lets us drop a big chunk of file
reading and parsing code.
Change-Id: I1d720398416ba3d6f91e077b807ec11a6de562cf
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The details of the structure were removed earlier, but
now remove all references even to a pointer to the
structure. The user can refer to transports by their
string name.
Change-Id: I273356f46329ea5372dcd951eda6f14767477d69
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is a step toward abstracting away the definition
of the subsystem.
Change-Id: I88b2aa107b27152620f51a1ca2a153792b4c85e9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Remove 4k allocation size in spdk_scsi_task_alloc_data(). From now on
all commands must obay allocation length.
Change-Id: Ica9384c62d431483ae1d0bd2e6fdee18b570861f
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This refactor MODE SENSE 6 and 10 related functions to respect buffer
size parameter.
Change-Id: I03bad456bac0554a8bf7b56f69d1f9cf5b1991f6
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This patch is preparation for fixing alloc_len overrun in SENSE 6/10 and
READCAP 6/10. To simplify code forbid usage of iov outside of
scsi/task.c.
This also drop SPDK_SCSI_TASK_ALLOC_BUFFER flag that obfuscate code. As
a replacement assume that if field alloc_len is non zero it mean that
iov.buffer is internally allocated. Functions
spdk_scsi_task_free_data(), spdk_scsi_task_set_data() and
spdk_scsi_task_alloc_data() manage this field.
Change-Id: Ife357a5bc36121f93a4c5d259b9a5a01559e7708
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Use the len field from the generic spdk_bdev_io instead of duplicating
it in blockdev_rbd_io.
Change-Id: I3ebfab8dd1303add83bc2206fc87319ba7d605b3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This function needs to check for SGEs that straddle a
2MB page boundary, and ensure it does not return
a length that will cross that boundary.
This cannot happen in practice currently with SPDK
since all buffers are allocated using rte_malloc(),
but an upcoming vhost-scsi target may produce
SGEs from a guest VM's physical memory that span
a 2MB boundary.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8b83c7c39c4cf33815abb22ff2ebc90941b21e28
No functional change, but removes a few assumptions
that will be invalid in a future patch that fixes a
bug in this function. Primarily we no longer assume
that this function will always increment the
iovpos and reset iov_offset to 0.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I770f2f24c37626063e113af850a2af792aed332a
The bdev function table should not be part of the public API.
Change-Id: I5d6f40d1b37c4471041c1c9d6253a3f92e9e9701
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It was written but never read (and the I/O channel is already stored in
the generic spdk_bdev_io).
Change-Id: Id33392e9d3940b2c1439e9fed2553aa091ecedf8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
No need to duplicate the bdev-defined I/O type.
Change-Id: I15cb68c3c68b3f25b286b04500b53081ed5e7881
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The status field in blockdev_rbd_io was only used within
blockdev_rbd_io_poll(), so replace it with a local variable.
Change-Id: I3629225f28b752a3acc7521699c33bc98f1e4b7b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Instead of the next_sge callback returning the physical address
directly, make it return the virtual address and convert to physical
address inside the NVMe library.
This is necessary for NVMe over Fabrics host support, since the RDMA
userspace API requires virtual addresses rather than physical addresses.
It is also more consistent with the normal non-SGL NVMe functions that
already take virtual addresses.
Change-Id: I79a7af64ead987535f6bf3057b2b22aef3171c5b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Remove the complex list management for pool_name and just strdup() it
directly. It is not worth the trouble to save a few bytes.
Change-Id: I8a4f7eeea619bd824ea593854423e317041c540e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Remove a DPDK dependency from generic code.
Change-Id: I8e3e2c0a36d980b426a1967ed1f88fb8b855c382
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Custom bdev modules can return any SCSI status and SCSI sense
information to a host by this patch. This is usefull when a custome bdev
module detect an error in the module and need to return meaningful
information to a host.
Function pointers will not work for the DPDK multi-process model (they
can have different addresses in different processes), so define a
transport enum and dispatch functions that switch on the transport type
instead.
Change-Id: Ic16866786eba5e523ce533e56e7a5c92672eb2a5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make a wrapper that spdk can call a function without thread affinity, and
call this wrapper to open rbd image.
Change-Id: Iadc87a948f43632abf497f88165483a0e269ba54
This enables using SPDK within a larger process that
is SPDK-centric. In this case the process may start
SPDK and then wish to stop it explicitly (without a
signal).
While here, remove an incorrect comment - DPDK mempools
can be used from non-DPDK threads. Also set the
g_shutdown_event to NULL after it is called. After the
event executes, the event is freed and is no longer valid.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie4f07bee7d05fae683c72f6680cb3bcce2d4a119
The initialization of dev_addr was replaced with probe_info.pci_addr,
but its use in spdk_pci_addr_compare() wasn't replaced to match.
Fixes commit fcb00f3780 (nvme: expand
probe information to a struct).
Change-Id: Ic4c273d2aa0bf1f9e3e1527f3ab09d3c019158cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Since we are usually going to be removing multiple events from the queue
at once, use the DPDK burst dequeue interface to improve efficiency.
Also rework the event queue runner to always process a fixed maximum
number of events per timeslice for simplicity. This removes the
rte_ring_count() call from the hot path and improves fairness between
events and pollers.
Now that events are dequeued in bulk, we can also put the event objects
back into the mempool in bulk. Add an env wrapper around
rte_mempool_put_bulk() and use it to free all of the events at once.
Basic performance benchmark using test/lib/event/event/event -t 10
is improved: previously ~40 million events per second, now ~46 million
events per second.
Change-Id: I432e8a48774a087eec2be3a64c38c339608af42a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It always returns NULL in the current DPDK env implementation and was
not used outside of a few ioat examples where it is not particularly
informational.
Change-Id: I14b237c33bc25ddebc6b36bfbd6a4edf6762e3ca
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This removes the 2 bytes of SenseLength from the beginning of the SCSI
sense_data buffer, so now the offsets within sense.data match up to the
expected values from the SCSI spec.
Change-Id: I9188560096a9ec5a8fcf83bec95201521b127494
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
spdk_nvme_probe() will now provide a struct spdk_nvme_probe_info to the
probe and attach callbacks in place of the PCI device pointer.
This struct contains the useful information that could be retrieved from
the PCI device during probe.
The goal of this change is to allow expansion of the probe information
in the future when other transports (specifically, NVMe over Fabrics)
are added that do not necessarily use PCI addressing or device IDs.
Change-Id: I59a2a9e874e248ce5fa1d7f4b57c8056962ff3cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add a helper function that converts a PCI address from a string into a
struct spdk_pci_addr and use it in place of the various sscanf()
invocations throughout SPDK.
Change-Id: Id2749723f76db741567e01b4bcb0fffb0e425fcd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Print the error information when the kernel RNIC driver did not load
properly, and fix the cleanup logic for the exceptional exit.
Change-Id: I97a45e73d830280b994818f3defc491bc2b6b020
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
As we can support multiple sessions now for each Subsystem, the Host
will use cntlid field to create IO queues, if 2 different Hosts
connected to the same Subsystem, for IO queues' creation process, it
will use cntlid field with 0 for current code logic.
Change-Id: I6fd437892e8eb3146f62f4b211c0baadd70b505e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Add an RPC interface to list all blockdevs and their properties.
Change-Id: I50db730d5eff8cffcbe8fe5df6b3461457e8581e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The PCI device claim function does not need the whole spdk_pci_device
structure, just the address.
Change-Id: If59df512043ee062cf9f759bdc104fc522625ba8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The NVMe over Fabrics target was storing the PCI device pointer for each
direct-mode controller, but it only really needs the PCI address, which
is exposed via the get_nvmf_subsystems RPC.
Also update the same code path to use the new spdk_pci_device_get_addr()
function for brevity.
Change-Id: I0708b3331b7c279c1a86f0d7459b5deb40dd7c89
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the new public PCI ID structure in the NVMe library to replace the
previously private struct pci_id.
Change-Id: I267d343917f60bdae949a824bc0fe67457cbbc0d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
- Split the part that gets a PCI device's address into its own function,
spdk_pci_device_get_addr(). This is useful outside of the comparison
function and is orthogonal to comparing addresses.
- Make the comparison function take two addresses instead of a device
and an address. The more general form will be useful with addresses
that are not directly associated with a device. Because of this, also
rename the function from spdk_pci_device_compare_addr() to
spdk_pci_addr_compare().
- Return a signed value similar to strcmp() so that addresses can be
ordered, not just compared for equality.
Change-Id: Idf304454af09ea57f1e1d5dc3a39b077378cecad
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rename the construct_rbd_bdev "size" parameter to block_size so that it
is consistent with other bdev construct RPCs.
Change-Id: I88f8ed35444495ffce9550dc224fbcbd58231787
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
When creating a bdev via the RPC interface, there was no way to know
what name it was assigned (other than predicting it based on the
numbering scheme). Change all of the relevant RPC interfaces to return
an array of bdev names so they can be used to construct LUNs/subsystems
dynamically in scripts.
Change-Id: I8e03349bdc81afd3d69247396a20df5fcf050f40
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add a field to struct spdk_nvme_ctrlr_opts that allows the user to
specify a keep alive timeout, and add automatic submission of Keep Alive
commands to spdk_nvme_ctrlr_process_admin_completions().
Change-Id: Ib282299a571d8edc59c7933418751bc3a6c98b40
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Specify SPDK_JSON_WRITE_FLAG_FORMATTED when creating a write context to
output more human-readable JSON.
Change-Id: Ie1f0451496aae7e36e4cdb1f05edb4bc4963be17
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If the RDMA transport failed to initialize, g_rdma.event_channel may be
NULL.
Change-Id: I4510ee5893389f244f0fbaa1cd4a182868939b25
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
For iWARP devices, buffers that are intended to be the
target of an RDMA read initiated by the target must additionally
have IBV_ACCESS_REMOTE_WRITE permission. This is because iWARP's
RDMA read path essentially requests the remote side to do
an RDMA write.
This is unfortunate because there is no way to differentiate between
memory that the remote side can do an RDMA write to and memory
that will only be the target of RDMA reads initiated by the
target. There is nothing we can do about this serious deficiency in
the specification, however, so we have to live with it.
Change-Id: I3d2f2814ce0cb1df4e5347296ef371db4d16be21
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This patch adds support for spdk_bdev_readv in scsi layer.
It also fixes write so that it uses multiple iov's instead of one.
Currently we should use only task->iov (for single vector operation)
or task->iovs (for multiple vector operations).
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ia3b2f6d18fd212b11d7b63b11dc46ec5bbc74788
Make the quirks mechanism generic in preparation for quirks for devices
from other vendors.
Change-Id: Ic003b020a38f1b966021db30e3f2bce9cf6a1a0d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch removes reduntant field in spdk_scsi_task and
fixes all logic to use iov.iov_base
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ie2fa1e2357b6383c118d05aec9206d1c60537d40
Previously, if spdk_rpc_setup() returned early due to the RPC service
being disabled in the configuration file, it would leave itself
registered as a poller and continue to run for the life of the app.
Change-Id: I0532fe23a732b87d68f83847b2db7627f87e9a1c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
I believe this is required for NICs to report, but handle
the case where it isn't reported.
Change-Id: I38d10c3590d1df8bb902ab312af0f9e01b9e5032
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This makes it consistent with the way connections and
requests work.
Change-Id: Ifb97499ba72f7dfd02ac54ba1b622726d266262c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The shared memory pool for a session is associated with
a particular RNIC via the protection domain. New connections
attempting to join a session that came in on a different RNIC
can't use that memory, so must be rejected.
Change-Id: Ibd79fe90566a231f76b7472e5e9b484c3e528454
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Rearrange the functions in rdma.c to match the order
of the function pointers in the transport. No other
code changes.
Change-Id: I9dbc68912ecd5dfdf53f20b4807d4116933a3c3a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Use the lower level registration functions. The RDMA-CM
examples use the ibv_* versions, so who knows if the
rdma_reg_* wrappers are even well tested.
Change-Id: I8e8250ab09a1401e636aebe2fc04a60806f7a827
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Add a transport function to get the max data transfer size to break the
dependency on NVME_MAX_XFER_SIZE.
Change-Id: I846d12878bdd8b80903ca1b1b49b3bb8e2be98bb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move the PCIe-specific admin queue setup to nvme_pcie_ctrlr_enable.
Change-Id: Ic3f5625fa804f719040ba86b7fc3bf82fcc057c0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Previously, we mixed use free and spdk_nvmf_rdma_conn_destroy to
free allocated spdk_nvmf_rdma_conn structure, which sounds not
exactly free all the resources.
Change-Id: I2917b442c34d63ba5c014add58f429ae4b831595
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The RDMA API doesn't say whether the wr is copied, so be
safe and allocate it on the heap.
Change-Id: I091af50aa031e1861333f19d864eb52335d6b756
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
A status member of spdk_bdev_io structure is set after the if block.
Therefore a status parameter should be checked instead of a status
member.
Change-Id: I4030a7fcdb36d9c589802ec5b4e424591dc2a3b6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The value of CAP should not change during the lifetime of a controller,
so read it once during ctrlr_construct and store it in the ctrlr.
Change-Id: I089d4141b4e0c9aae6c53abf9bb0ef6577dabe0b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rather than embedding adminq directly in the spdk_nvme_ctrlr structure,
change it to a pointer to a spdk_nvme_qpair. This is necessary to allow
the transport to extend the qpair structure.
Change-Id: I041685d5037088cf56d046fe99bf204edcfc57b1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Previously, we directly assigned the pointer of pool_name
and rbd_name, and this is not safe. After the rpc test,
we found the string value is not correct, so use strdup.
Change-Id: Ibadc57d3cb5b9869b7db5a22c2459769e92edebd
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This requires a couple of related changes:
- I/O queue IDs are now allocated by using a bit array of free queue IDs
instead of keeping an array of pre-initialized qpair structures.
- The "create I/O qpair" function has been split into two: one to create
the queue pair at startup, and one to reinitialize an existing qpair
structure after a reset.
Change-Id: I4ff3bf79b40130044428516f233b07c839d1b548
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make the transport ctrlr_construct callback responsible for allocating
its own controller.
Change-Id: I5102ee233df23e27349410ed063cde8bfdce4c67
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Four read/write functions share the same code for checking
IO len and offset. Extract this code into separate function.
Change-Id: I40f0021e70a60c591b048ad3a70b22eaa07af3b4
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
These are specific to local NVMe PCIe devices, so move them out of the
generic NVMe code into the PCIe transport.
Change-Id: Iea2056a4c438b7d3a303b4b5e977ce7aa9e58c05
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows the entire transport structure definition
to become private.
Change-Id: I9ca19edbfc3cfb75b9b113a89bb2b90bc499ab16
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This changes as little code as possible while still creating
a single public API header. This enables future clean up
of the public API and clarification of the exposed
concepts.
Change-Id: I780e7a5a9afd27acf0276516bd71b896ad301c50
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Only call spdk_bdev_io_complete() where IO error is seen.
Change-Id: I829e4c589dbcb47017e810035837a4c61c3428f9
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This will allow factoring out PCIe-specific code into a swappable
transport so that NVMe over Fabrics host support can be added.
Change-Id: I4df74dd268d655e3b36e8d6114ebe7d79a24844d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
For large writes that require multiple SCSI tasks (one for immediate
data, then one or more for R2T-solicited data), we bump the refcount
for the task associated with the initial immediate data PDU, to
ensure it does not get freed until all of the child tasks are
completed. But in some cases this initial immediate data PDU could
complete after all of the R2T-soliciated data PDUs. The
completion code was not handling this case correctly which would
result in the iSCSI connection thinking it still had outstanding
SCSI tasks when the connection closed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f9c5322755462d1918fde0075c87c84295cb10c
This patch enables vector operation for bdev drivers aio, malloc and
nvme.
The rbd driver still handle only one vector.
Change-Id: Ie2c1f6853bfd54ebd8039df9a0305854ca3297b9
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Make sure the function reentrant, prepare for rpc method.
Change-Id: Ie5230e4ac6c9a750e8e779c5e0b67134729c07e3
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
This prepares for future scatter-gather support.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie21c4d86c1e932dcaf63cf13d7a7198890595d79
Return void in main I/O path, and have functions
explicitly complete the I/O back to the bdev layer
if any failures are encountered.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia729b0af555f87c2fb36b92e79a47d19a325de7a
Add public function which could be used by rpc method.
Change-Id: Id9d2938801e0acdf0f9827ef2990a54c75aec22a
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
This patch removes the lock in RBD module. And it requires
the librbd library supports rbd_poll_io_events function.
Change-Id: I040a7d8369ab4f69f41d1d0233115f885168f019
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This patch makes lun_name:lun_id pair as one object,
the same for the pg_tag:ig_tag.
Change-Id: Ib08450d12bde9b8388d4ae41e214cc0ba64c8b1e
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
For the nvme readv/writev APIs, the PRP checking logic was
incorrectly failing single SGE payloads that were larger
than 4KB. This patch adds a test case for this scenario,
and fixes the PRP checking logic.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6357d620599666046d2cb74d7923dac1f75418c5
Use standard GCC style atomic operations instead of
the DPDK calls. The DPDK calls end up translating
to the gcc standard inline calls in the generic
case anyway.
Change-Id: I0ea760c4e23c3660b082a803bbc174de7250f365
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Remove #includes for all DPDK headers that weren't
necessary.
Change-Id: Ib02522e0f04e64a1c98afceb7508cc0e8d931a9d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This was only used for debugging. Everywhere else
used the spdk_memzone abstraction.
Change-Id: I8a828ea3c7abccb66c8a027cb13de43c560ff7a1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This converts some, but not all, usage of rte_mempool
to spdk_mempool. The remaining rte_mempools use features
we elected not to expose through spdk_mempool such as
constructors, so that will need to be revisited.
Change-Id: I6528809a864ab466b8d19431789bf0f976b648b6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change the type from int to bool and change the name
from data_ref to data_from_mempool.
Change-Id: If1fc11761e63561443ed44d6a0860e416e424df8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This is required since pollers are now directly removed
(rather than scheduling an event) when the unregister call
is made on the poller's lcore.
Without this change, if a poller is registered then
immediately unregistered, the unregistration will seg
fault since the event adding the poller has not executed
yet.
Also add a test case that exhibits the sequence of events
described in this commit message.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5c6ba0ee224ac1f8f3ebb8e7571714e718bd42db
Use the env library to perform all memory allocations
that previously called DPDK directly.
Change-Id: I6d33e85bde99796e0c85277d6d4880521c34f10d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Enforce exactly one trailing \n, and fix all of the existing cases.
Change-Id: I6218e4700e90aeb647eaee78089530c79993c8c8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch enables vector operation for bdev drivers aio, malloc and
nvme.
The rbd driver still handle only one vector.
Change-Id: I5f401527c2717011ecc21116363bbb722e804112
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
'virtual' is a keyword in C++, so avoid using it in variable
and structure names in case any files are eventually
included from a C++ project.
Change-Id: I2122750445def63038af68a3000758e33b937f9d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
All completion queues for the same listen address
now share a common completion queue channel.
Change-Id: I42c149fe7e221951e8a3826b1713482c37a265b8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
These 4 callbacks can be condensed into two callbacks, which
simplifies the API.
Change-Id: I069da00de34b252753cdc8961439e13a75d1cc68
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Validate the number of unmap descriptors in the generic bdev layer
before calling the blockdev-specific unmap function.
Change-Id: Ib24e7ec63f782f23f2ee3e63393aa8463123fdb4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
All timers have been converted to SPDK pollers.
If an app requires rte_timer support, it should register its own poller
that calls rte_timer_manage().
Change-Id: I8a827a357b344deac76d42357a5a84ac2daabbf8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows users to swap their PCI library from
libpciaccess/dpdk to another mechanism using the standard
method for swapping out the env library.
Change-Id: Ib2248f8b43754a540de2ec01897e571f0302b667
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This allows users to swap out SPDK's third party
libraries for an implementation based on their own
framework.
Change-Id: Ia0b7384ce5e31acba5ad0d7002dec9e95b759c52
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The new env library will wrap all third-party library
calls and be easily swappable with alternate implementations
at build time. For now, it's just the memory library
renamed.
Change-Id: I26a70933289f8137107208ba75f7520fd7a33da0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Increased printed data width, added data offset indices.
Change-Id: I44f81396e33870109c2bece5e152657f8a24a56a
Signed-off-by: Krzysztof Jakimiak <krzysztof.jakimiak@intel.com>
Preparation for SGL support (readv/writev).
Change-Id: I14a116d764ebc582ea0a0077cc5a0d0bac638cb0
Signed-off-by: Krzysztof Jakimiak <krzysztof.jakimiak@intel.com>
This patch also drops support for automatically unbinding
devices from the kernel - run scripts/setup.sh first.
Our generic pci interface is now hidden behind include/spdk/pci.h
and implemented in lib/util/pci.c. We no longer wrap the calls
in nvme_impl.h or ioat_impl.h. The implementation now only uses
DPDK and the libpciaccess dependency has been removed. If using
a version of DPDK earlier than 16.07, enumerating devices
by class code isn't available and only Intel SSDs will be
discovered. DPDK 16.07 adds enumeration by class code and all
NVMe devices will be correctly discovered.
Change-Id: I0e8bac36b5ca57df604a2b310c47342c67dc9f3c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>