diff --git a/doc/blob.md b/doc/blob.md
index 88293e159..8c56771c1 100644
--- a/doc/blob.md
+++ b/doc/blob.md
@@ -295,19 +295,23 @@ contribute to the Blobstore effort itself.
 The Blobstore owns the entire storage device. The device is divided into clusters starting from the
 beginning, such that cluster 0 begins at the first logical block.
 
-    LBA 0                                 LBA N
-    +-----------+-----------+-----+-----------+
-    | Cluster 0 | Cluster 1 | ... | Cluster N |
-    +-----------+-----------+-----+-----------+
+```text
+LBA 0                                 LBA N
++-----------+-----------+-----+-----------+
+| Cluster 0 | Cluster 1 | ... | Cluster N |
++-----------+-----------+-----+-----------+
+```
 
 Cluster 0 is special and has the following format, where page 0 is the first page of the cluster:
 
-    +--------+-------------------+
-    | Page 0 | Page 1 ... Page N |
-    +--------+-------------------+
-    | Super  |  Metadata Region  |
-    | Block  |                   |
-    +--------+-------------------+
+```text
++--------+-------------------+
+| Page 0 | Page 1 ... Page N |
++--------+-------------------+
+| Super  |  Metadata Region  |
+| Block  |                   |
++--------+-------------------+
+```
 
 The super block is a single page located at the beginning of the partition. It contains basic information
 about the Blobstore. The metadata region is the remainder of cluster 0 and may extend to additional clusters. Refer
diff --git a/doc/concurrency.md b/doc/concurrency.md
index 3f734ae83..508b78132 100644
--- a/doc/concurrency.md
+++ b/doc/concurrency.md
@@ -140,6 +140,7 @@ function `foo` performs some asynchronous operation and when that completes
 function `bar` is called, then function `bar` performs some operation that
 calls function `baz` on completion, a good way to write it is as such:
 
+```c
     void baz(void *ctx) {
             ...
     }
@@ -151,6 +152,7 @@ calls function `baz` on completion, a good way to write it is as such:
     void foo(void *ctx) {
             async_op(bar, ctx);
     }
+```
 
 Don't split these functions up - keep them as a nice unit that can be read
 from bottom to top.
@@ -162,6 +164,7 @@ them in C we can still write them out by hand. As an example, here's a
 callback chain that performs `foo` 5 times and then calls `bar` - effectively
 an asynchronous for loop.
 
+```c
     enum states {
             FOO_START = 0,
             FOO_END,
@@ -244,6 +247,7 @@ an asynchronous for loop.
 
             run_state_machine(sm);
     }
+```
 
 This is complex, of course, but the `run_state_machine` function can be read
 from top to bottom to get a clear overview of what's happening in the code
diff --git a/doc/ftl.md b/doc/ftl.md
index 92df7a595..b34ed8279 100644
--- a/doc/ftl.md
+++ b/doc/ftl.md
@@ -27,6 +27,7 @@ well as their validity, as some of the data will be invalidated by subsequent wr
 logical address. The L2P mapping can be restored from the SSD by reading this information in order
 from the oldest band to the youngest.
 
+```text
 +--------------+ +--------------+ +--------------+
 band 1 | zone 1 +--------+ zone 1 +---- --- --- --- --- ---+ zone 1 |
 +--------------+ +--------------+ +--------------+
@@ -42,16 +43,19 @@ from the oldest band to the youngest.
 +--------------+ +--------------+ +--------------+
 
 parallel unit 1 pu 2 pu n
+```
 
 The address map and valid map are, along with a several other things (e.g. UUID of the device it's
 part of, number of surfaced LBAs, band's sequence number, etc.), parts of the band's metadata. The
 metadata is split in two parts:
 
+```text
 head metadata band's data tail metadata
 +-------------------+-------------------------------+------------------------+
 |zone 1 |...|zone n |...|...|zone 1 |...| | ... |zone m-1 |zone m|
 |block 1| |block 1| | |block x| | | |block y |block y|
 +-------------------+-------------+-----------------+------------------------+
+```
 
 - the head part, containing information already known when opening the band (device's UUID, band's
 sequence number, etc.), located at the beginning blocks of the band,
@@ -73,6 +77,7 @@ support writes to a single block, the data needs to be buffered. The write buffe
 this problem. It consists of a number of pre-allocated buffers called batches, each of size allowing
 for a single transfer to the SSD. A single batch is divided into block-sized buffer entries.
 
+```text
 write buffer
 +-----------------------------------+
 |batch 1 |
@@ -89,6 +94,7 @@ for a single transfer to the SSD. A single batch is divided into block-sized buf
 | |entry 1|entry 2| |entry n| |
 | +-----------------------------+ |
 +-----------------------------------+
+```
 
 When a write is scheduled, it needs to acquire an entry for each of its blocks and copy the data
 onto this buffer. Once all blocks are copied, the write can be signalled as completed to the user.
@@ -108,12 +114,14 @@ situation in which all of the bands contain some valid data and no band can be e
 can be executed anymore. Therefore a mechanism is needed to move valid data and invalidate whole
 bands, so that they can be reused.
 
+```text
 band band
 +-----------------------------------+ +-----------------------------------+
 | ** * * *** * *** * * | | |
 |** * * * * * * *| +----> | |
 |* *** * * * | | |
 +-----------------------------------+ +-----------------------------------+
+```
 
 Valid blocks are marked with an asterisk '\*'.
 
diff --git a/doc/porting.md b/doc/porting.md
index b6872bef1..0f8adda74 100644
--- a/doc/porting.md
+++ b/doc/porting.md
@@ -18,4 +18,6 @@ a new version of the *env* library.
 
 The new implementation can be integrated into the SPDK build by updating the following line in CONFIG:
 
-    CONFIG_ENV?=$(SPDK_ROOT_DIR)/lib/env_dpdk
+```bash
+CONFIG_ENV?=$(SPDK_ROOT_DIR)/lib/env_dpdk
+```
diff --git a/examples/bdev/fio_plugin/README.md b/examples/bdev/fio_plugin/README.md
index 84ef213bb..c91f68f87 100644
--- a/examples/bdev/fio_plugin/README.md
+++ b/examples/bdev/fio_plugin/README.md
@@ -8,36 +8,48 @@ the GPL license.
 
 Clone the fio source repository from https://github.com/axboe/fio
 
+```bash
     git clone https://github.com/axboe/fio
     cd fio
+```
 
 Compile the fio code and install:
 
+```bash
     make
     make install
+```
 
 ## Compiling SPDK
 
 Clone the SPDK source repository from https://github.com/spdk/spdk
 
+```bash
     git clone https://github.com/spdk/spdk
     cd spdk
     git submodule update --init
+```
 
 Then, run the SPDK configure script to enable fio (point it to the root of the fio repository):
 
+```bash
     cd spdk
     ./configure --with-fio=/path/to/fio/repo
+```
 
 Finally, build SPDK:
 
+```bash
     make
+```
 
 **Note to advanced users**: These steps assume you're using the DPDK submodule. If you are using your
 own version of DPDK, the fio plugin requires that DPDK be compiled with -fPIC. You can compile DPDK
 with -fPIC by modifying your DPDK configuration file and adding the line:
 
-    EXTRA_CFLAGS=-fPIC
+```bash
+EXTRA_CFLAGS=-fPIC
+```
 
 ## Usage
 
@@ -45,20 +57,28 @@ To use the SPDK fio plugin with fio, specify the plugin binary using LD_PRELOAD
 fio and set ioengine=spdk_bdev in the fio configuration file (see example_config.fio in the same
 directory as this README).
 
-    LD_PRELOAD=/build/fio/spdk_bdev fio
+```bash
+LD_PRELOAD=/build/fio/spdk_bdev fio
+```
 
 The fio configuration file must contain one new parameter:
 
-    spdk_json_conf=./examples/bdev/fio_plugin/bdev.json
+```bash
+spdk_json_conf=./examples/bdev/fio_plugin/bdev.json
+```
 
 You can specify which block device to run against by setting the filename parameter
 to the block device name:
 
-    filename=Malloc0
+```bash
+filename=Malloc0
+```
 
 Or for NVMe devices:
 
-    filename=Nvme0n1
+```bash
+filename=Nvme0n1
+```
 
 fio by default forks a separate process for every job. It also supports just spawning a separate
 thread in the same process for every job. The SPDK fio plugin is limited to this latter thread
@@ -79,7 +99,9 @@ NVMe Zoned Namespaces (ZNS), and the virtual zoned block device SPDK module.
 
 If you wish to run fio against a SPDK zoned block device, you can use the fio option:
 
-    zonemode=zbd
+```bash
+zonemode=zbd
+```
 
 It is recommended to use a fio version newer than version 3.26, if using --numjobs > 1.
 If using --numjobs=1, fio version >= 3.23 should suffice.
@@ -108,7 +130,9 @@ zones limit, the easiest way to work around that fio does not manage this constr
 with a clean state each run (except for read-only workloads), by resetting all zones before fio
 starts running its jobs by using the engine option:
 
-    --initial_zone_reset=1
+```bash
+--initial_zone_reset=1
+```
 
 ### Zone Append
 
@@ -116,7 +140,9 @@ When running fio against a zoned block device you need to specify --iodepth=1 to
 "Zone Invalid Write: The write to a zone was not at the write pointer." I/O errors.
 However, if your zoned block device supports Zone Append, you can use the engine option:
 
-    --zone_append=1
+```bash
+--zone_append=1
+```
 
 To send zone append commands instead of write commands to the zoned block device.
 When using zone append, you will be able to specify a --iodepth greater than 1.
diff --git a/examples/nvme/fio_plugin/README.md b/examples/nvme/fio_plugin/README.md
index 91e725846..528dc1686 100644
--- a/examples/nvme/fio_plugin/README.md
+++ b/examples/nvme/fio_plugin/README.md
@@ -4,33 +4,45 @@
 
 First, clone the fio source repository from https://github.com/axboe/fio
 
+```bash
     git clone https://github.com/axboe/fio
+```
 
 Then check out the latest fio version and compile the code:
 
+```bash
     make
+```
 
 ## Compiling SPDK
 
 First, clone the SPDK source repository from https://github.com/spdk/spdk
 
+```bash
     git clone https://github.com/spdk/spdk
     git submodule update --init
+```
 
 Then, run the SPDK configure script to enable fio (point it to the root of the fio repository):
 
+```bash
     cd spdk
     ./configure --with-fio=/path/to/fio/repo
+```
 
 Finally, build SPDK:
 
+```bash
     make
+```
 
 **Note to advanced users**: These steps assume you're using the DPDK submodule. If you are using your
 own version of DPDK, the fio plugin requires that DPDK be compiled with -fPIC. You can compile DPDK
 with -fPIC by modifying your DPDK configuration file and adding the line:
 
-    EXTRA_CFLAGS=-fPIC
+```bash
+EXTRA_CFLAGS=-fPIC
+```
 
 ## Usage
 
@@ -38,20 +50,28 @@ To use the SPDK fio plugin with fio, specify the plugin binary using LD_PRELOAD
 fio and set ioengine=spdk in the fio configuration file (see example_config.fio in the same
 directory as this README).
 
-    LD_PRELOAD=/build/fio/spdk_nvme fio
+```bash
+LD_PRELOAD=/build/fio/spdk_nvme fio
+```
 
 To select NVMe devices, you pass an SPDK Transport Identifier string as the filename. These are in the
 form:
 
-    filename=key=value [key=value] ... ns=value
+```bash
+filename=key=value [key=value] ... ns=value
+```
 
 Specifically, for local PCIe NVMe devices it will look like this:
 
-    filename=trtype=PCIe traddr=0000.04.00.0 ns=1
+```bash
+filename=trtype=PCIe traddr=0000.04.00.0 ns=1
+```
 
 And remote devices accessed via NVMe over Fabrics will look like this:
 
-    filename=trtype=RDMA adrfam=IPv4 traddr=192.168.100.8 trsvcid=4420 ns=1
+```bash
+filename=trtype=RDMA adrfam=IPv4 traddr=192.168.100.8 trsvcid=4420 ns=1
+```
 
 **Note**: The specification of the PCIe address should not use the normal ':'
 and instead only use '.'. This is a limitation in fio - it splits filenames on
@@ -83,17 +103,23 @@ but it is not good to use one thread against many I/O devices.
 Running with PI setting, following settings steps are required.
 First, format device namespace with proper PI setting. For example:
 
+```bash
     nvme format /dev/nvme0n1 -l 1 -i 1 -p 0 -m 1
+```
 
 In fio configure file, add PRACT and set PRCHK by flags(GUARD|REFTAG|APPTAG) properly. For example:
 
-    pi_act=0
-    pi_chk=GUARD
+```bash
+pi_act=0
+pi_chk=GUARD
+```
 
 Blocksize should be set as the sum of data and metadata. For example, if data blocksize is 512 Byte, host
 generated PI metadata is 8 Byte, then blocksize in fio configure file should be 520 Byte:
 
-    bs=520
+```bash
+bs=520
+```
 
 The storage device may use a block format that requires separate metadata (DIX). In this scenario, the fio_plugin
 will automatically allocate an extra 4KiB buffer per I/O to hold this metadata. For some cases, such as 512 byte
@@ -108,18 +134,24 @@ tag mask are set to 0x1234 and 0xFFFF by default.
 
 To enable VMD enumeration add enable_vmd flag in fio configuration file:
 
-    enable_vmd=1
+```bash
+enable_vmd=1
+```
 
 ## ZNS
 
 To use Zoned Namespaces then build the io-engine against, and run using, a fio version >= 3.23 and add:
 
-    zonemode=zbd
+```bash
+zonemode=zbd
+```
 
 To your fio-script, also have a look at script-examples provided with fio:
 
-    fio/examples/zbd-seq-read.fio
-    fio/examples/zbd-rand-write.fio
+```bash
+fio/examples/zbd-seq-read.fio
+fio/examples/zbd-rand-write.fio
+```
 
 ### Maximum Open Zones
 
@@ -140,7 +172,9 @@ When running with the SPDK/NVMe fio io-engine you can be exposed to error messag
 completion errors, with the NVMe status code of 0xbd ("Too Many Active Zones"). To work around this,
 then you can reset all zones before fio start running its jobs by using the engine option:
 
-    --initial_zone_reset=1
+```bash
+--initial_zone_reset=1
+```
 
 ### Zone Append
 
@@ -148,7 +182,9 @@ When running FIO against a Zoned Namespace you need to specify --iodepth=1 to av
 "Zone Invalid Write: The write to a zone was not at the write pointer." I/O errors.
 However, if your controller supports Zone Append, you can use the engine option:
 
-    --zone_append=1
+```bash
+--zone_append=1
+```
 
 To send zone append commands instead of write commands to the controller.
 When using zone append, you will be able to specify a --iodepth greater than 1.
@@ -157,7 +193,9 @@ When using zone append, you will be able to specify a --iodepth greater than 1.
 
 If your device has a lot of zones, fio can give you errors such as:
 
-    smalloc: OOM. Consider using --alloc-size to increase the shared memory available.
+```bash
+smalloc: OOM. Consider using --alloc-size to increase the shared memory available.
+```
 
 This is because fio needs to allocate memory for the zone-report, that is, retrieve the state of
 zones on the device including auxiliary accounting information. To solve this, then you can follow
diff --git a/mdl_rules.rb b/mdl_rules.rb
index d6aa3e4e8..98d2369e4 100644
--- a/mdl_rules.rb
+++ b/mdl_rules.rb
@@ -9,4 +9,3 @@ exclude_rule 'MD031'
 exclude_rule 'MD033'
 exclude_rule 'MD034'
 exclude_rule 'MD041'
-exclude_rule 'MD046'