Each fio thread can have multiple files that it writes to, which is
why the per-thread spdk_fio_setup() fio callback does
for_each_file() {...}.
One of these files can be e.g. a zoned namespace with append support,
another file could be a zoned namespace on another controller without
append support, and a third file could be a conventional namespace
(which never supports the zone append command).
Right now, we will return a fatal error if a thread has e.g. a zoned
namespace (with append support) together with a conventional namespace.
Instead of returning a fatal error, enable zone append only on the
namespaces that support zone append, and allow namespaces that do
not support zone append to continue as usual (using regular writes).
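The per-file decision described above can be sketched roughly as follows.
This is an illustration only, not the plugin code: struct example_file, its
fields and example_enable_zone_append() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file state; the field names are illustrative only. */
struct example_file {
	bool is_zoned;        /* namespace uses the Zoned Command Set */
	bool supports_append; /* controller supports the zone append command */
	bool use_append;      /* decided below: writes are issued as zone appends */
};

/*
 * Decide per file whether writes are issued as zone appends. Files that
 * cannot use zone append keep using regular writes instead of causing a
 * fatal error. Returns the number of files with zone append enabled.
 */
static size_t
example_enable_zone_append(struct example_file *files, size_t nfiles,
			   bool append_requested)
{
	size_t i, enabled = 0;

	assert(files != NULL || nfiles == 0);
	for (i = 0; i < nfiles; i++) {
		files[i].use_append = append_requested && files[i].is_zoned &&
				      files[i].supports_append;
		if (files[i].use_append) {
			enabled++;
		}
	}
	return enabled;
}
```

With one append-capable zoned file, one zoned file without append support
and one conventional file, only the first ends up with use_append set.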
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ic6456d408cbe91563acd337a4b70c6e871fe34c6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7611
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Commit f69367c788 ("fio_nvme: defer qpair allocation to file_open
callback") moved the qpair allocation from spdk_fio_setup() to
spdk_fio_open(). This broke spdk_fio_report_zones(), which needs a
qpair in order to get the initial state of the zones.
setup_files() in FIO calls td->io_ops->setup() (spdk_fio_setup()),
followed by zbd_init_files(), which calls zbd_init_zone_info(),
which calls zbd_create_zone_info(), which calls parse_zone_info(),
which calls zbd_report_zones(), which calls td->io_ops->report_zones()
(spdk_fio_report_zones()).
i.e. spdk_fio_report_zones() will always be called directly after
spdk_fio_setup(). .report_zones() is even called before the per
thread ioengine .init() callback.
Therefore, spdk_fio_report_zones() is called before the ioengine
.open_file() callback.
This is done in order to ensure that all threads will share the same
zbd_info struct, which contains the per zone locks.
Since SPDK nvme ioengine no longer initializes the qpairs in .setup(),
create a temporary qpair in .report_zones().
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ic376ac7844e40fceff092900ae7e4714bccf38e6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7590
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Commit f69367c788 ("fio_nvme: defer qpair allocation to file_open
callback") moved the qpair allocation from spdk_fio_setup() to
spdk_fio_open(). This broke --initial_zone_reset, which needs a qpair
in order to perform the initial zone reset.
While at it, move the initial zone reset from spdk_fio_setup() to
attach_cb(), as this is where all the other fio options are verified.
By placing it in attach_cb(), after the duplicated file check, we
avoid the need to loop through the whole fio_thread->fio_qpair list.
Since SPDK nvme ioengine no longer initializes the qpairs in .setup(),
create a temporary qpair, if the --initial_zone_reset option was used.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I7950304c58aef3ec783f7cd99cfb1e7d7817a197
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7589
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The status code and status code type are inspected and reported.
Fix issue #1893
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Change-Id: I6f181d8c9464182b23c658f4c268b900398fd751
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7567
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
All jobs are created at boot, meaning the setup callback
is invoked for all jobs before any are executed.
But it may be useful to put 'stonewall' parameters in
the job file to execute a bunch of workloads in succession,
starting one workload when the previous one completes.
But since qpairs are created currently during setup, the
total number of workloads that can be expressed is limited
since qpairs for all workloads are allocated up front.
So instead defer allocation of the io qpairs until the
file_open callback. These don't get called until the
job associated with the 'file' (in this case, the
nvme namespace) is ready to execute.
Note that we cannot free the qpairs in the file_close
callback, since fio may 'close' the file before all
I/O have been completed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3c60cf27c3660a3c94042c0de719f5bebdb9b417
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7481
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_nvme_zns_report_zones() is implemented using
nvme_allocate_request_user_copy(), which under the hood will do
a spdk_zmalloc() with the SPDK_MALLOC_DMA flag set, and will copy
over the result to our buffer.
Therefore, it is redundant for us to use spdk_dma_zmalloc(),
because it causes us to allocate twice as much memory from the
precious DMA pool as needed.
Changing this zone report buffer allocation to a calloc also
has the benefit of making the code uniform with all other
spdk_nvme_zns_report_zones() call sites in the SPDK codebase.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia354fa51c66ae07a38a9a57b07c15d145dd609f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7005
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If PRACT is enabled and the metadata size is 8 bytes with an extended
LBA format, the controller will insert/strip the metadata. We then do
not need to pass the metadata buffer, so it should be excluded from
the host buffer.
Add a function to calculate the host buffer size.
Change-Id: I42d8d9cbfbf7ba2bc4bf64d65260c6cfe9bd4cb1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6789
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
print_qid_mappings=1 will now add logging messages
showing the {filename,qid} tuples associated with
each job.
Note that for the nvme plugin, the filename is
essentially the transport ID. We just print that
filename for simplicity rather than reconstructing
a transport ID string from the ctrlr object.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9b714ac009fd16b96ed87c2c056be251009815b8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6396
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Now that we have support for spdk_nvme_zns_zone_append() and
spdk_nvme_zns_zone_appendv(), hook them up in the nvme fio plugin.
Note that fio itself does not have support for zone append,
since unlike SPDK, there is no user facing zone append API in
Linux. Therefore, this new option simply replaces writes with
zone appends in the SPDK fio backend.
This is however still useful for the following reasons:
- Provides a way to test zone append in SPDK.
- By using zone append, we can test with iodepth > 1.
With regular writes, the user can only specify iodepth=1.
This is because for zoned namespaces, writes have to target
the write pointer. Having more than one write in flight, per
zone, will lead to I/O errors.
In Linux, it is possible to use fio with iodepth > 1
on zoned namespaces, simply because of the mq-deadline
scheduler, which throttles writes such that there is only
one write in flight, per zone, even if user space has
queued up more.
Since a user might not want to use zone append unconditionally,
even on a namespace that supports it, make this an option
rather than enabling it unconditionally.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I028b79f6445bc63b68c97d1370c6f8139779666d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6330
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We know bs should be extended_sector_size(ns) * num_blocks.
In other words, bs should be an integral multiple of extended_sector_size.
num_blocks is not available here, so we check for an integral multiple instead.
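A minimal sketch of such a check, assuming the extended sector size is the
data sector size plus the metadata size. example_bs_is_valid() is a
hypothetical helper, not a function from the plugin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the fio block size is a non-zero integral multiple of
 * the extended sector size (data bytes plus interleaved metadata bytes).
 */
static bool
example_bs_is_valid(uint32_t bs, uint32_t sector_size, uint32_t md_size)
{
	uint32_t extended_sector_size = sector_size + md_size;

	assert(extended_sector_size != 0);
	return bs != 0 && bs % extended_sector_size == 0;
}
```

For a 512 + 8 byte format, bs=520 and bs=4160 pass, while bs=4096 does not.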
Change-Id: Ie521db194cdad6f2d2247fd2704cab92c36ddb82
Signed-off-by: wanghailiangx <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5881
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
According to the SPDK nvme fio plugin documentation:
"Blocksize should be set as the sum of data and metadata.
For example, if data blocksize is 512 Byte, host generated
PI metadata is 8 Byte, then blocksize in fio configure file
should be 520 Byte."
Error out if this requirement is not satisfied.
This requirement does not apply for the separate metadata case.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I730a83beb6a85695c8a4b80995340b4064375d5a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5557
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There is no need for an additional function to calculate the max transfer
size based on mdts.
nvme_ctrlr_identify_done() already initializes ctrlr->max_xfer_size
based on mdts, and spdk_nvme_ns_get_max_io_xfer_size() simply returns
ns->ctrlr->max_xfer_size.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I747ff8ac9767eababffc3c7e0b6846029a98b826
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4985
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Simon A. F. Lund <simon.lund@samsung.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Added plugin-option 'initial_zone_reset', providing the option to reset
all zones on all namespaces with the Zoned Command Set enabled upon
initialization.
The default is not to reset. The option is exposed even when the ZBD
plumbing is not available. However, it will then inform the user that
ZBD/ZNS is not supported instead of resetting.
The plugin-option provides a short-term solution to an observed issue
with consecutive invocations of fio exhausting maximum-active-resources.
A longer-term solution would be to add a 'max_active_zones' limit in fio
and ensure that fio does not exceed that limit.
Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Change-Id: I65341c028a97657370b315fb298bf97651b9bffd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4949
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Preparation patch for the addition of the 'initial_zone_reset' plugin-option.
Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Change-Id: I768fc207b74cfa2a516009e10fc2a4646d06ba72
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4948
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
All zone management receive helper functions (including
spdk_nvme_zns_report_zones()) are implemented to match the parameters of
the zone management receive function in the ZNS specification.
The documentation for spdk_nvme_zns_report_zones() states:
"param partial_report If true, nr_zones field in the zone report indicates
the number of zone descriptors that were successfully written to the zone
report. If false, nr_zones field in the zone report indicates the number
of zone descriptors that match the report_opts criteria."
This matches the description of the "Partial Report" bit in the ZNS spec.
Since the FIO function parse_zone_info() calls the io_ops->report_zones()
function multiple times, until all zones have been reported, it expects
the return from this function to represent the number of zones that were
successfully reported.
By setting the partial_report bit to false, the controller will return
the total number of zones, and since spdk_fio_report_zones() loops until
idx < report->nr_zones, and writes to zbdz[idx], the current code will
overwrite heap memory, since idx will take on index values that are out
of bounds for the memory allocated by the FIO function parse_zone_info().
Therefore, spdk_fio_report_zones() has to set the partial_report bit to
true when calling the NVMe level function spdk_nvme_zns_report_zones().
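The difference between the two settings can be modelled with a small
sketch. The helper names are hypothetical, and the model only captures the
meaning of the nr_zones field as quoted above, not the real report layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the nr_zones field in a zone report: with the Partial Report
 * bit set it counts the descriptors actually written to the buffer,
 * otherwise it counts all zones that match the report criteria.
 */
static uint64_t
example_report_nr_zones(uint64_t matching_zones, uint64_t buf_capacity,
			bool partial_report)
{
	if (partial_report && matching_zones > buf_capacity) {
		return buf_capacity;
	}
	return matching_zones;
}

/*
 * True if a loop of the form
 *     for (idx = 0; idx < nr_zones; idx++) { out[idx] = ...; }
 * stays within an out[] array of out_capacity entries.
 */
static bool
example_copy_loop_in_bounds(uint64_t nr_zones, uint64_t out_capacity)
{
	return nr_zones <= out_capacity;
}
```

With 1000 matching zones and room for 64 descriptors, the loop stays in
bounds only when partial_report is true.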
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I8846711bfed4faadac0315b450158293cefa36f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4871
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Simon A. F. Lund <simon.lund@samsung.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In spdk_fio_report_zones(), log_err did not prefix messages with
"spdk/nvme", making it hard to determine who dumped the error-message.
In spdk_fio_reset_wp() log_err described the wrong function.
This change fixes the above.
Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Change-Id: I41df6d451e88942806c8b5a3cf9a0902d98cb186
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4916
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
When _reset_wp() received a range to reset, the loop kept resetting
the first zone in the range.
Also, the processing of command completions was reusing the same
'completion' state, so a previous completion would short-circuit
command completion such that the current command's completion would
never be processed.
This change fixes that.
Also, the reset loop assumes that the given offset is a valid zone-start
LBA. A check is added to verify that and return -EINVAL if it is not.
Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Change-Id: I1a1e4be2e1f67c2d8fecb5fc36a211b2dbb5a921
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4915
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
When a device has resource-limitations such as the
maximum-open-resources (mor) and this threshold is exceeded, then IO
will fail upon completion. Such behavior is not the most user-friendly
way to tell the user that they should provide a value for the
fio-parameter 'max_open_zones'.
This change provides an arguably more user-friendly approach by checking
whether the device is limited and in case it is:
* Provide a default value for 'max_open_zones', inform the user, and
continue
* Verify 'max_open_zones' and in case of error inform the user and
return error
Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Change-Id: I76cb045d560b9ec5701d97b82a62947af11960b6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4914
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This adds initial support for ZNS by aligning the NVMe spec. defined ZNS
structures and commands with the fio Zone representation and
implementation of the following io-engine functions:
get_zoned_model() / spdk_fio_get_zoned_model(), when namespace is ZNS
and the Zoned-Command-Set is enabled, then this function informs fio
that the device is ZBD_HOST_MANAGED.
report_zones() / spdk_fio_report_zones(), submits a single
zone-mgmt-recv and waits for its completion, converts the spec-defined
zone-descriptors to the fio ZBD_ZONE representation and returns the
number of zones in the converted report.
reset_wp() / spdk_fio_reset_wp(), submits multiple zone-mgmt-send,
covering the range [offset, offset+length] and waits for their
completion.
Four helper-functions are added to assist in the above implementations.
get_fio_qpair(), this helper is added to retrieve the namespace matching
the given fio-file, ensuring that management commands reach the correct
namespace.
spdk_fio_qpair_mdts_nbytes(), this helper is added to assist
report_zones() retrieve the zone-report within the bounds of the
maximum-data-transfer of the device.
The functions pcu() and pcu_cb() provide a means to submit
management commands and wait for their completion. They are needed
since, although mgmt-send/recv are IO-commands in the context of NVMe,
for fio they are not part of the regular queue/event/getevents flow but
are used in a synchronous/blocking manner.
Note that in the fio zone representation, the start/len/capacity/wp
fields are in units of bytes, whereas the corresponding values in NVMe
are in lbas/sectors. This is worth noting as the offset <-> lba
conversions do not take NVMe configurations with extended-lba format
into account. Thus, the implementation is initial support for ZNS, as
more work is needed to support pi/extended-lba configurations.
Note that a guard, FIO_HAS_ZBD, checks for the required io-engine ops
version and indirectly tests for availability of the fio Zone
representation by testing for a macro introduced in the same fio release
as the required fio Zone representation.
Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Change-Id: Id3d1d61a52db2e55019032c724197df4d559271a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4836
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Use spdk_nvme_detach_async() and spdk_nvme_detach_poll_async() with
a local variable detach_ctx to detach multiple controllers.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I05c504428df56f4ab5d1ffdd19ac81e6c062c89d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4439
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will make the object relationship cleaner and the asynchronous
detach operation easier to implement.
Change FIO plugin in an independent patch to make review easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If89d189e79506726f2d20cebc100f8a8294b9111
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4431
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
This will open up the way for probing and connecting to
all of the namespaces on a pci controller.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I8fa3dde9f249ce826659882e66f630b8c25e2701
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2779
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We only enable this feature for READ commands in the NVMe driver, so
also ignore the WRITE commands in fio plugin tool.
Change-Id: Iecf43326e1a2a3b3540a1391e09a33d2443bd546
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2730
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
To enable stress testing with Bit Bucket SGL, add one new
parameter: "bit_bucket_data_len".
To test it, the user should set "enable_sgl=1" and
"bit_bucket_data_len=4096". This means a total of 4096 bytes of
data will be described by the Bit Bucket SGL. Note that the value
should be less than the block size specified on the fio command
line.
For simplicity, the Bit Bucket data is counted from the beginning
of each I/O.
Currently it is only valid for READ tests. Users can see a
performance improvement when the Bit Bucket is enabled.
Change-Id: Ia481a324c25942d6ca051c71cb90f87d21955259
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1623
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
In spdk_fio_setup, when there are many connections, for example
16 subsystems, it takes too long to complete the probe. The probe
holds the mutex, which prevents the poll_ctrlr function from sending
the keep alive command and causes the target to time out. Split the
mutex so that poll_ctrlr has the chance to send keep alives.
Fixes issue: #1286
Change-Id: I300513b5e8761d9eaadb4c5cbc8ed97fe84d02df
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1407
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Added spdk_vmd_fini(), which detaches all PCI devices acquired by the
VMD subsystem.
Fixes #1148
Change-Id: I43218ef5f9a764546b655c28688897fb91b779cb
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482852
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously we only initialized the DIF context when PRACT is 0, because
the DIF library only supports that case. However, when the end-to-end
data protection feature is enabled and PRACT is set to 1, the controller
checks the metadata, but we still need to pass appmask/apptag to the
controller. This patch fixes this case.
Change-Id: Ia62d4f8a7adf822b75541f69ce57aeff8f9eb505
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482047
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The 'delay_pcie_doorbell' parameter in the 'spdk_nvme_io_qpair_opts'
structure was renamed to 'delay_cmd_submit' to make it suitable for every
transport. The old name is also kept for backward compatibility.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I09ef8028133c4a3d4a5bbc5329ced1f065bcaa46
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
When initializing the io_u structure from FIO, always
set engine_data to NULL first, as the io_u structure
is just malloced by FIO without being explicitly set
to zero.
In the io_u free path, the engine_data field is checked
for NULL. To avoid a wrong check result, explicitly
set engine_data to NULL at the beginning of the io_u
init path.
Change-Id: I52c8c251f36925650a44d14e35781bd8494ff358
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472916
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: yidong0635 <dongx.yi@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When PRACT is set, if metadata size is 8 bytes, PI is stripped
(read) or inserted (write). Hence block size must not include
metadata size for extended LBA payload. This patch fixes the issue
by subtracting the metadata size from the block size for this case.
On the other hand, when PRACT is set and the metadata size is larger
than 8 bytes, PI is passed (read) or replaced (write), so the block
size does not need to change for this case.
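The rule can be sketched as below. example_host_block_size() is a
hypothetical helper, and returning only the data size for separate metadata
is an assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the per-block size of the host data buffer.
 * - separate metadata: metadata is not interleaved (assumption of this sketch)
 * - extended LBA, PRACT set, 8-byte metadata: PI is inserted/stripped by
 *   the controller, so the host buffer holds data only
 * - extended LBA otherwise: data and metadata are interleaved
 */
static uint32_t
example_host_block_size(uint32_t data_size, uint32_t md_size,
			bool extended_lba, bool pract)
{
	assert(data_size != 0);
	if (!extended_lba) {
		return data_size;
	}
	if (pract && md_size == 8) {
		return data_size;
	}
	return data_size + md_size;
}
```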
Signed-off-by: James Bergsten <jamesx.bergsten@intel.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I930c8a07519a4742c44240801b068fac2c4802a7
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465708
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If PRACT is enabled, DIF context was not initialized. However it was
expected that PRACT is passed through DIF flags of the DIF context.
Hence PRACT was not set in NVMe command even if user set PRACT.
This patch fixes the issue by passing fio_qpair->io_flags instead
of dif_ctx->dif_flags.
Signed-off-by: James Bergsten <jamesx.bergsten@intel.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibcb74fc8f74f863d8b53d53484fdea66f4b5db8e
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468016
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: James Bergsten <jrb@thebergstens.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
DIF context has to be initialized both for read and write I/O. However,
it had been initialized only for write I/O unintentionally after
refining error processing.
This patch fixes the issue.
Signed-off-by: James Bergsten <jamesx.bergsten@intel.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I363da40ddba186e52fd0dfce37cfb0dea325040d
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468015
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: James Bergsten <jrb@thebergstens.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Fio will allocate metadata buffers for each request, even if the NVMe
namespace wasn't formatted with separate metadata. It's not an error to
set the metadata pointer in the NVMe command, but it's still better to
set it only in cases where it is actually used.
Change-Id: I1d29b6be65cfa6ba1c20d31906bcee5e8e2decf8
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461349
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In order for fio_plugin to compile a replacement is needed
for CLOCK_MONOTONIC_RAW, since FreeBSD does not support it.
In that case CLOCK_MONOTONIC is used.
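One way to express such a fallback is sketched below. It uses feature
detection on the macro, which may differ from how the plugin selects the
clock; EXAMPLE_CLOCK and example_now_ns() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Fall back to CLOCK_MONOTONIC where CLOCK_MONOTONIC_RAW is not defined. */
#ifdef CLOCK_MONOTONIC_RAW
#define EXAMPLE_CLOCK CLOCK_MONOTONIC_RAW
#else
#define EXAMPLE_CLOCK CLOCK_MONOTONIC
#endif

/* Return the current value of the selected clock in nanoseconds. */
static uint64_t
example_now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(EXAMPLE_CLOCK, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```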
Change-Id: I234ce4d932baf9c5399a46f9f4676315351e720c
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458072
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Try to enumerate VMD devices in fio_plugin.
New flag enable_vmd was added to fio config.
Change-Id: I5546665719e4ef2b169d403db8bf0398e834dbc4
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456992
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Data offset is intended to correspond to DATAO in NVMe/TCP and
Buffer Offset in iSCSI.
Previously for iSCSI, buffer offset had been merged to start block
address, but passing buffer offset separately from start block address
clarifies the logic more.
On the other hand, for NVMe/TCP, passing DATAO separately from start
block address will be critically important because DATAO can have any
alignment and will be needed not only for the reference tag but also
for guard computation.
This patch adds data_offset to struct spdk_dif_ctx and adds it to the
parameters of spdk_dif_ctx_init(). ref_tag_offset is also added to struct
spdk_dif_ctx and it is computed by dividing data_offset by data_block_size
and is used to compute reference tag.
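A minimal sketch of this computation. The function names are hypothetical,
and the form of the per-block reference tag (initial tag plus offset plus
block index) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole data blocks that precede data_offset. */
static uint32_t
example_ref_tag_offset(uint32_t data_offset, uint32_t data_block_size)
{
	assert(data_block_size != 0);
	return data_offset / data_block_size;
}

/*
 * Expected reference tag of the block at block_index within the buffer
 * (assumed form, for illustration only).
 */
static uint32_t
example_ref_tag(uint32_t init_ref_tag, uint32_t ref_tag_offset,
		uint32_t block_index)
{
	return init_ref_tag + ref_tag_offset + block_index;
}
```

For example, a data_offset of 8192 with 512-byte data blocks gives a
ref_tag_offset of 16.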
The next patch will use this change when getting DIF context in SCSI.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id0e12ca9b1dc75d0589787520feb0c2ee0f844a5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/457540
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
fio_thread->iocq allocated on line 403 was leaked when
fio finished its run.
Change-Id: I740dcaa1e0037283d099ddf4bc125cec57cfdbcc
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456623
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
FIO is going to always present a contiguous buffer to us. But we can
fake out the nvme driver with a couple of global variables.
Change-Id: I038e70582043e1d7c1800ed065fe126aa091c290
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/439608
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Since fio is continually polling for completions, this
option can be safely enabled.
Change-Id: I02ee3d2507d3b37f79e14d69fe90ee19c4b4eea2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447711
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The existing fio plugin tool uses a hardcoded application
tag and application tag mask for end-to-end data
protection. Export these two values as options so that
users can set them.
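
For illustration, a sketch of how an application tag and its mask are
typically combined during a check, assuming the usual NVMe semantics
where only the bits set in the mask are compared. The function name is
hypothetical and is not part of the plugin:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the application tag read back from the protection
 * information agrees with the expected tag on every bit selected by mask.
 * Bits cleared in mask are ignored (assumed NVMe-style semantics).
 */
static bool
app_tag_matches(uint16_t expected, uint16_t actual, uint16_t mask)
{
	return ((expected ^ actual) & mask) == 0;
}
```

A mask of 0xFFFF compares the whole tag, while a mask of 0 disables the
comparison.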
Change-Id: I64d89c29e99030ce8daa2947e73d941b73ac4a8e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446384
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The existing APIs used in the fio plugin tool already
take a separate metadata parameter, so we just
need to allocate a separate metadata buffer for each
request. By default, each request gets a 4096-byte
metadata buffer when PI is enabled with separate
metadata. An option is also provided so that
users can specify a larger value in case a request
needs a larger metadata buffer.
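
A rough sketch of the sizing logic, assuming the user-supplied value is
only honored when it exceeds the 4096-byte default. The helper and macro
names are hypothetical, not the plugin's own:

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_MD_BUF_SIZE 4096u /* default stated in the commit message */

/* user_md_size is the hypothetical option value; 0 means "not set". */
static uint32_t
md_buf_size(uint32_t user_md_size)
{
	return user_md_size > DEFAULT_MD_BUF_SIZE ? user_md_size : DEFAULT_MD_BUF_SIZE;
}

/* Metadata bytes that one request of io_size bytes actually needs. */
static uint32_t
md_bytes_needed(uint32_t io_size, uint32_t block_size, uint32_t md_size)
{
	assert(block_size != 0);
	return (io_size / block_size) * md_size;
}
```

For example, a 1 MiB request on 512-byte blocks with 8 bytes of metadata
per block needs 16384 bytes, which is more than the default and would
require the larger option value.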
Change-Id: I51679c5cb7f7b1599b81287b1fbb8d9be7959191
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446375
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Also set the errno in the submission and verification paths.
Change-Id: I97e94eb3c63167eed2f0b14fa7b79c42add834a1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447558
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If multiple numjobs and filenames are used in a fio test, one thread may
have a list of queue pairs, so we should store the queue pair in the
request when submitting a new request.
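
The idea can be sketched as follows. All types and names here are
hypothetical simplifications; the real plugin keeps more state per
thread, file and request:

```c
#include <stddef.h>

/* Hypothetical per-file queue pair entry in a thread's list. */
struct qpair_sketch {
	int file_id;               /* which file/namespace this qpair serves */
	struct qpair_sketch *next; /* next qpair owned by the same thread */
};

/* Hypothetical per-I/O request state. */
struct request_sketch {
	struct qpair_sketch *qpair; /* qpair the request was submitted on */
};

/*
 * On submission, look up the qpair for the target file and remember it
 * in the request so later handling uses the same qpair.
 * Returns 0 on success, -1 if the thread has no qpair for that file.
 */
static int
submit_sketch(struct qpair_sketch *thread_qpairs, int file_id,
	      struct request_sketch *req)
{
	struct qpair_sketch *q;

	for (q = thread_qpairs; q != NULL; q = q->next) {
		if (q->file_id == file_id) {
			req->qpair = q;
			return 0;
		}
	}
	req->qpair = NULL;
	return -1;
}
```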
Change-Id: I585cd40ea4295b94c8766f9adfa5a7344cb0bc3c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447272
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Allow the user to add a seed value for guard computation to the DIF
context. This will avoid the guard being zero in case of all zero data.
The NVMe controller doesn't support a seed value for guard computation
explicitly, and hence if we want to use such a seed value with an
NVMe controller, we have to format metadata larger than 8 bytes
and add the seed value into the reserved metadata field.
But some popular iSCSI/FC HBAs and SAS controllers support a
seed value for guard computation, and so supporting a seed value
in the SPDK DIF library is very helpful for some use cases.
Hence this patch makes it possible to specify a seed value in the
DIF library for those use cases.
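
To illustrate why a seed avoids an all-zero guard, here is a bitwise
sketch of the T10 DIF CRC-16 (polynomial 0x8BB7) that takes the seed as
its initial value. It is a standalone illustration under that
assumption, not the library's implementation, and the function name is
made up:

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 with the T10 DIF polynomial, starting from seed. */
static uint16_t
crc16_t10dif_seeded(uint16_t seed, const uint8_t *buf, size_t len)
{
	uint16_t crc = seed;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		/* Fold the next byte into the top of the register. */
		crc ^= (uint16_t)((uint16_t)buf[i] << 8);
		for (bit = 0; bit < 8; bit++) {
			if (crc & 0x8000) {
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			} else {
				crc = (uint16_t)(crc << 1);
			}
		}
	}
	return crc;
}
```

With a seed of 0, a block of all zeros yields a guard of 0. With any
nonzero seed, the guard of the same block is nonzero.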
Change-Id: I7e9e87cb441bf263e64605c7820409fdc22dd977
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/444334
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Apply the new DIF library to DIF generation and verification.
The conditions for using DIF generation and verification are changed
only with regard to DIF type. DIF type 3 is supported in this patch.
DIX is not supported in this patch.
The case where PI is located in the first 8 bytes of the metadata
is not supported in this patch either, because how to pass the PI
location is not fixed yet. But this limitation will not be critical
because PI is located in the last 8 bytes of the metadata by default.
DIF insertion and stripping will be required so that DIF generation
does not destroy data. But this remains on the TODO list even after
this patch.
Change-Id: If08bcaaaa9f4e0fb4f373ef844b88b38cfffc6b5
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/441283
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>