Considering the process can be terminated in the cases like ctrl+c,
kill command or memory fault, the ref is tracked in the per process
structure spdk_nvme_controller_process and whenever there is other
process attaches or detaches the controller, a scan will be issued
to cleanup those unexpectedly exited processes.
Change-Id: Ib4f974f567a865748d42da4ead49edd383dfc752
Signed-off-by: GangCao <gang.cao@intel.com>
Instead of the next_sge callback returning the physical address
directly, make it return the virtual address and convert to physical
address inside the NVMe library.
This is necessary for NVMe over Fabrics host support, since the RDMA
userspace API requires virtual addresses rather than physical addresses.
It is also more consistent with the normal non-SGL NVMe functions that
already take virtual addresses.
Change-Id: I79a7af64ead987535f6bf3057b2b22aef3171c5b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Function pointers will not work for the DPDK multi-process model (they
can have different addresses in different processes), so define a
transport enum and dispatch functions that switch on the transport type
instead.
Change-Id: Ic16866786eba5e523ce533e56e7a5c92672eb2a5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
spdk_nvme_probe() will now provide a struct spdk_nvme_probe_info to the
probe and attach callbacks in place of the PCI device pointer.
This struct contains the useful information that could be retrieved from
the PCI device during probe.
The goal of this change is to allow expansion of the probe information
in the future when other transports (specifically, NVMe over Fabrics)
are added that do not necessarily use PCI addressing or device IDs.
Change-Id: I59a2a9e874e248ce5fa1d7f4b57c8056962ff3cd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the new public PCI ID structure in the NVMe library to replace the
previously private struct pci_id.
Change-Id: I267d343917f60bdae949a824bc0fe67457cbbc0d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
- Split the part that gets a PCI device's address into its own function,
spdk_pci_device_get_addr(). This is useful outside of the comparison
function and is orthogonal to comparing addresses.
- Make the comparison function take two addresses instead of a device
and an address. The more general form will be useful with addresses
that are not directly associated with a device. Because of this, also
rename the function from spdk_pci_device_compare_addr() to
spdk_pci_addr_compare().
- Return a signed value similar to strcmp() so that addresses can be
ordered, not just compared for equality.
Change-Id: Idf304454af09ea57f1e1d5dc3a39b077378cecad
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add a field to struct spdk_nvme_ctrlr_opts that allows the user to
specify a keep alive timeout, and add automatic submission of Keep Alive
commands to spdk_nvme_ctrlr_process_admin_completions().
Change-Id: Ib282299a571d8edc59c7933418751bc3a6c98b40
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make the quirks mechanism generic in preparation for quirks for devices
from other vendors.
Change-Id: Ic003b020a38f1b966021db30e3f2bce9cf6a1a0d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add a transport function to get the max data transfer size to break the
dependency on NVME_MAX_XFER_SIZE.
Change-Id: I846d12878bdd8b80903ca1b1b49b3bb8e2be98bb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Move the PCIe-specific admin queue setup to nvme_pcie_ctrlr_enable.
Change-Id: Ic3f5625fa804f719040ba86b7fc3bf82fcc057c0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The value of CAP should not change during the lifetime of a controller,
so read it once during ctrlr_construct and store it in the ctrlr.
Change-Id: I089d4141b4e0c9aae6c53abf9bb0ef6577dabe0b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rather than embedding adminq directly in the spdk_nvme_ctrlr structure,
change it to a pointer to a spdk_nvme_qpair. This is necessary to allow
the transport to extend the qpair structure.
Change-Id: I041685d5037088cf56d046fe99bf204edcfc57b1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This requires a couple of related changes:
- I/O queue IDs are now allocated by using a bit array of free queue IDs
instead of keeping an array of pre-initialized qpair structures.
- The "create I/O qpair" function has been split into two: one to create
the queue pair at startup, and one to reinitialize an existing qpair
structure after a reset.
Change-Id: I4ff3bf79b40130044428516f233b07c839d1b548
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Make the transport ctrlr_construct callback responsible for allocating
its own controller.
Change-Id: I5102ee233df23e27349410ed063cde8bfdce4c67
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These are specific to local NVMe PCIe devices, so move them out of the
generic NVMe code into the PCIe transport.
Change-Id: Iea2056a4c438b7d3a303b4b5e977ce7aa9e58c05
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This will allow factoring out PCIe-specific code into a swappable
transport so that NVMe over Fabrics host support can be added.
Change-Id: I4df74dd268d655e3b36e8d6114ebe7d79a24844d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
For the nvme readv/writev APIs, the PRP checking logic was
incorrectly failing single SGE payloads that were larger
than 4KB. This patch adds a test case for this scenario,
and fixes the PRP checking logic.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6357d620599666046d2cb74d7923dac1f75418c5
This was only used for debugging. Everywhere else
used the spdk_memzone abstraction.
Change-Id: I8a828ea3c7abccb66c8a027cb13de43c560ff7a1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Enforce exactly one trailing \n, and fix all of the existing cases.
Change-Id: I6218e4700e90aeb647eaee78089530c79993c8c8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows users to swap their PCI library from
libpciaccess/dpdk to another mechanism using the standard
method for swapping out the env library.
Change-Id: Ib2248f8b43754a540de2ec01897e571f0302b667
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This patch also drops support for automatically unbinding
devices from the kernel - run scripts/setup.sh first.
Our generic pci interface is now hidden behind include/spdk/pci.h
and implemented in lib/util/pci.c. We no longer wrap the calls
in nvme_impl.h or ioat_impl.h. The implementation now only uses
DPDK and the libpciaccess dependency has been removed. If using
a version of DPDK earlier than 16.07, enumerating devices
by class code isn't available and only Intel SSDs will be
discovered. DPDK 16.07 adds enumeration by class code and all
NVMe devices will be correctly discovered.
Change-Id: I0e8bac36b5ca57df604a2b310c47342c67dc9f3c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Provide a convenience wrapper for general purpose dataset
management commands. The previous wrapper for deallocate
was difficult to use correctly and only for deallocate.
Note that the name is "dataset_management" as opposed to
"data_set_management" to match the NVMe specification.
It's questionable whether "dataset" is valid English, but
it is best to match the specification.
Change-Id: Ifc03d66dbabeabe8146968cf8a09f7ac3446ad68
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
If posix_memalign() fails, it may not have updated buf, so set it to
NULL explicitly.
Change-Id: I756bdc59ec1e31987ad3e6754eec4e2194b95074
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rather than forcing the NVMe library user to pass a specially-allocated
block of memory (e.g. rte_malloc() in the case of the default
nvme_impl.h), just make the NVMe library allocate a suitable buffer
itself and copy to/from the user buffer as needed.
The fast path I/O functions still require special rte_malloc()
allocations, since we don't want to add an allocation and copy to the
I/O critical path.
Change-Id: I7fe88c0ba60c859a33bbe95b7713f423c6bf1ea8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
assert is part of the C standard library and is available
on any platform we'd consider porting to. Don't put a
wrapper around it.
Change-Id: I0acfdd6a8a269d6c37df38fb7ddf4f1227630223
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
pthreads are widely supported and are available on any
platform we currently foresee porting to. Use that API
instead of attempting to abstract it away to simplify
the code.
Change-Id: I822f9c10910020719e94cce6fca4e1600a2d9f2a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is a step towards enabling sharing SPDK NVMe
device access from multiple processes using DPDK's
multi-process framework.
Change-Id: I57d5eec158b42addc1036bd2583596471a467a95
Signed-off-by: GangCao <gang.cao@intel.com>
Previously, we used cap_lo and cap_hi to represent the 32-bit halves of
the full CAP register. However, it is simpler to keep them in a single
64-bit structure, and is no less efficient on 64-bit platforms.
Also name the NSSRS field from NVMe 1.2, which was previously reserved.
Change-Id: I1d5d9b0dccbb12373b4aed3db29c883881d43223
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use the knowledge that both the source and destination of
nvme_copy_command() are aligned to emit the aligned variants of the
SSE2/AVX mov instructions.
Change-Id: I0a7e32a3bb10b9a1920cd85691b79fa7172eecb3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Match the expected alignment for nvme_alloc_request() and nvme_malloc()
to allow optimized memory copy code to work even in the unit tests.
Change-Id: I546692a6df9615a12a8209618fb6159a9c9e426b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This moves some definitions from nvmf_spec.h to
nvme_spec.h based on the latest publication.
Change-Id: I51b0abd16f7d034696239894aea5089f8ac70c40
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
NVMe opcodes contain a two-bit field that encodes the expected data
direction for each command. Add an enum and a function to extract these
bits.
Change-Id: Ie214319f121cf0899c6aa5663866f2988b128dd2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use SPDK_CU_ASSERT_FATAL instead of CU_ASSERT_FATAL so static analyzers
recognize that g_request cannot be NULL in the following lines.
Change-Id: Ie7ab3bd34a177bea0d565441014e8db12be8bb01
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The D3700/D3600 series support Controller Memory Buffer(CMB) feature,
CMB is available for holding submission queues, for those controllers
which can support submission queues in CMB, user can set the option
whether to enable it or not.
Change-Id: I8b0dc9e28dd6f5bb01bee99a532087212c04e492
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Resolve relative paths before using them to clean up command lines.
This should also help shorten the overall command line length that gets
embedded in the binary and used when locating the executable from a
coredump.
Change-Id: Ibff9849ede198bb04313496c8b7131485ffaf14f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Fix copy of uninitialized data in the unit tests.
Change-Id: Ib71a6fc90edb07f7ddac3067ff0efbda7938bf17
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch add support for Intel specific log pages :
marketing description page.
Change-Id: I87bccb2af286279598c9dd3c870094b384a0d2f7
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Rename nvme_remove_child_request() to nvme_request_remove_child() and
move it next to nvme_request_add_child() to make the symmetry clear.
Change-Id: I78747c44ab3db1a656b33555a45f634dc5a55b31
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This can happen if the controller is still resetting as the SPDK NVMe
driver takes control.
Change-Id: I263ae8f2e7b271e0448450557452a115c90c4fb6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch is used to add a nvme_request remove child
helpler function
Change-Id: I1e5bb228d53333ca3601f4ae30fcd801ea39e532
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
If the controller is failed, attempting to submit additional I/O is
futile - it will be immediately failed using the completion callback,
which can result in infinite recursion if the application code resubmits
I/Os on failure.
Instead, provide a way for request submission to indicate failure, and
use it to exit early if the controller is failed; this can only happen
when a reset failed (timed out).
If a request is submitted directly by the user when the controller has
failed, we can return an error code directly. For the case where I/O
was queued and is being resubmitted after a reset, we still need to call
the completion handler via _nvme_fail_request_ctrlr_failed().
Change-Id: I9e144328d524b25db2acf48e923b584746e8d0b6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Provide a new structure, spdk_nvme_ctrlr_opts, to let the user modify
the default controller initialization options during probe/attach.
Currently, only the number of queue pairs can be modified in this way;
other options will be added later.
Change-Id: Ie27b9429291d93a9353c0d820f0ad467d3b0e7cb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Replace the previous code that allocated each tracker individually with
one large allocation per queue pair.
struct nvme_tracker is now explicitly padded to reach exactly 4096 bytes
to allow normal array indexing to work correctly while maintaining the
alignment requirement that ensures each tracker's PRP list does not
cross a page boundary.
This also allows removal of the act_tr array, since the tr array can be
indexed directly now, and each tracker can store its own active state.
Change-Id: Ia7c51735b96594d12f7f478cefcc4aedc84207ad
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The previous method for registering I/O queues did not allow the user
to specify queue priority for weighted round robin arbitration, and it
limited the application to one queue per controller per thread.
Change the API to require explicit allocation of each queue for each
controller using the new function spdk_nvme_ctrlr_alloc_io_qpair().
Each function that submits a command on an I/O queue now takes an
explicit qpair parameter rather than implicitly using the thread-local
queue.
This also allows the application to allocate different numbers of
threads per controller; previously, the number of queues was capped at
the smallest value supported by any attached controller.
Weighted round robin arbitration is not supported yet; additional
changes to the controller startup process are required to enable
alternate arbitration methods.
Change-Id: Ia33be1050a6953bc5a3cca9284aefcd95b01116e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This field is write-only in the current code; the NVMe library does
not track timeouts on requests.
Change-Id: I50e53bb3c299bf16912c48be8aad3eec829154af
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Add CUnit test case to verify payload_offset value in split_test4
Change-Id: I4a9a33854295ed802709bbe4f11746d284ed8cbd
Signed-off-by: Liang Yan <liangx.yan@intel.com>
Add test case which verify invalid second phys_addr would cause
_nvme_fail_request_bad_vtophys be called.
Change-Id: Id1b62e249b7192e29334bd7cab33815722f5662d
Signed-off-by: Liang Yan <liangx.yan@intel.com>
For those NVMe controllers which can support SGL feature in
firmware, we will use SGL for scattered payloads.
Change-Id: If688e6494ed62e8cba1d55fc6372c6e162cc09c3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
This will be exposed in the public API. This rename is in a separate
commit to ease review.
Change-Id: I1b7fef36f85265db27935ac4d22ceef3c7282502
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Many of the internal controller initialization functions did not check
for allocation failure; add return codes and check them where
applicable.
Change-Id: Id1b33bb06fca84035369d8b7ecd4c36b8ba7134c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Prepare for qpair to be exposed as part of the public API.
Change-Id: Ia63e863e95554adceeade20c829f12fe346375d5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
When multiple NVMe controllers are being initialized during
spdk_nvme_probe(), we can overlap the hardware resets of all controllers
to improve startup time.
Rewrite the initialization sequence as a polling function,
nvme_ctrlr_process_init(), that maintains a per-controller state machine
to determine which initialization step is underway. Each step also has
a timeout to ensure the process will terminate if the hardware is hung.
Currently, only the hardware reset (toggling of CC.EN and waiting for
CSTS.RDY) is done in parallel; the rest of initialization is done
sequentially in nvme_ctrlr_start() as before. These steps could also be
parallelized in a similar framework if measurements indicate that they
take a significant amount of time.
Change-Id: I02ce5863f1b5c13ad65ccd8be571085528d98bd5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Two test cases should be added.
1. ns cmd don't have child requests;
2. Assert that the correct number of child requests are created
and verify that each one has the correct payload_offset.
Change-Id: I1182a1a6673ceaf2ba35be268f80d8668af82848
Signed-off-by: Liang Yan <liangx.yan@intel.com>
This fixes a memory leak in the unit test.
Change-Id: Ie94459e8e46e966c437ad43702a04f1a8f9becdb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
PCI_VENDOR_ID_INTEL -> SPDK_PCI_VID_INTEL
Also change the inclusion guard macro to be consistent with the other
SPDK headers.
Change-Id: I29346267172cb8c07cc4289eed4eca2d55e942d6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Rename all functions with a spdk_ prefix, and provide enough of an API
to avoid apps needing to #include <pciaccess.h>.
The opaque type used in the public API for a PCI device is now
struct spdk_pci_device *.
Change-Id: I1e7a09bbc5328c624bec8cf5c8a69ab0ea8e8254
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The new probing API will find all NVMe devices on the system and ask the
caller whether to attach to each one. The caller will then receive a
callback once each controller has finished initializing and has been
attached to the driver.
This will enable cleanup of the PCI abstraction layer (enabling us to
use DPDK PCI functionality) as well as allowing future work on parallel
NVMe controller startup and PCIe hotplug support.
Change-Id: I3cdde7bfab0bc0bea1993dd549b9b0e8d36db9be
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch adds Intel NVMe device list and overrides the
supported log pages according to the quirk list.
In particular, the READ_CMD_LATENCY and WRITE_CMD_LATENCY pages are
supported on Intel DC P3x00 devices despite not being listed in the
Intel vendor-specific log page directory.
Change-Id: I3a2b6a5fa142c6e9c93567df65e85980bd3c7cc0
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Also add a space between Copyright and (c).
The copyright year can be determined using git metadata.
Also remove the duplicated "All rights reserved." - every instance of
this line already has a corresponding "All rights reserved" immediately
below it, except for examples/ioat/kperf/kmod/dma_perf.c, where I have
added it manually.
Performed using this command:
git ls-files | xargs sed -i -e 's/Copyright(c) \(.*\) Intel Corporation. All rights reserved./Copyright (c) Intel Corporation./'
Change-Id: I3779f404966800709024eb1eb66a50068af2716c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>