doc: Fix Markdown MD032 linter warnings
"MD032 Lists should be surrounded by blank lines" Fix this markdown linter error by inserting newlines or adjusting text to list points using spaces. Signed-off-by: Karol Latecki <karol.latecki@intel.com> Change-Id: I09e1f021b8e95e0c6c58c393d7ecc11ce61c3132 Signed-off-by: Karol Latecki <karol.latecki@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/434 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Maciej Wawryk <maciejx.wawryk@intel.com>
parent acac4b3813
commit 3d8a0b19b0
CHANGELOG.md (12 lines changed)
@@ -227,11 +227,13 @@ Added `spdk_bdev_get_write_unit_size()` function for retrieving required number
of logical blocks for write operation.

New zone-related fields were added to the result of the `get_bdevs` RPC call:

- `zoned`: indicates whether the device is zoned or a regular
  block device
- `zone_size`: number of blocks in a single zone
- `max_open_zones`: maximum number of open zones
- `optimal_open_zones`: optimal number of open zones

The `zoned` field is a boolean and is always present, while the rest is only available for zoned
bdevs.
@@ -949,6 +951,7 @@ parameter. The function will now update that parameter with the largest possible
for which the memory is contiguous in the physical memory address space.

The following functions were removed:

- spdk_pci_nvme_device_attach()
- spdk_pci_nvme_enumerate()
- spdk_pci_ioat_device_attach()
@@ -958,6 +961,7 @@ The following functions were removed:

They were replaced with generic spdk_pci_device_attach() and spdk_pci_enumerate() which
require a new spdk_pci_driver object to be provided. It can be one of the following:

- spdk_pci_nvme_get_driver()
- spdk_pci_ioat_get_driver()
- spdk_pci_virtio_get_driver()
@@ -1138,6 +1142,7 @@ Dropped support for DPDK 16.07 and earlier, which SPDK won't even compile with r
### RPC

The following RPC commands deprecated in the previous release are now removed:

- construct_virtio_user_scsi_bdev
- construct_virtio_pci_scsi_bdev
- construct_virtio_user_blk_bdev
@@ -1326,6 +1331,7 @@ respectively.
### Virtio

The following RPC commands have been deprecated:

- construct_virtio_user_scsi_bdev
- construct_virtio_pci_scsi_bdev
- construct_virtio_user_blk_bdev
@@ -1346,6 +1352,7 @@ spdk_file_get_id() returning unique ID for the file was added.
Added jsonrpc-client C library intended for issuing RPC commands from applications.

Added API enabling iteration over JSON object:

- spdk_json_find()
- spdk_json_find_string()
- spdk_json_find_array()
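
As a rough illustration of how these helpers might be used, the sketch below looks up one
string member of a parsed object; the exact signatures are assumptions to be checked against
`include/spdk/json.h`:

~~~{.c}
#include <stdio.h>
#include <stdlib.h>
#include "spdk/json.h"

/* 'object' is assumed to point at a parsed JSON object value
 * (e.g. the result of spdk_json_parse()). */
static void
print_name_member(struct spdk_json_val *object)
{
	struct spdk_json_val *key, *val;

	if (spdk_json_find_string(object, "name", &key, &val) == 0) {
		char *name = spdk_json_strdup(val); /* assumed helper; returns a malloc'd copy */

		printf("name = %s\n", name);
		free(name);
	}
}
~~~
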
@@ -1785,6 +1792,7 @@ write commands.

New API functions that accept I/O parameters in units of blocks instead of bytes
have been added:

- spdk_bdev_read_blocks(), spdk_bdev_readv_blocks()
- spdk_bdev_write_blocks(), spdk_bdev_writev_blocks()
- spdk_bdev_write_zeroes_blocks()
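
For illustration, a hedged sketch of one of the block-based calls; the signature shown is an
assumption to be verified against `include/spdk/bdev.h`:

~~~{.c}
#include "spdk/bdev.h"

static void
read_done(struct spdk_bdev_io *bdev_io, bool success, void *cb_arg)
{
	/* inspect 'success', consume the data, then release the I/O */
	spdk_bdev_free_io(bdev_io);
}

/* Read one logical block at block offset 0; offsets and lengths are given
 * in blocks, not bytes. */
static int
read_first_block(struct spdk_bdev_desc *desc, struct spdk_io_channel *ch, void *buf)
{
	return spdk_bdev_read_blocks(desc, ch, buf, 0, 1, read_done, NULL);
}
~~~
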
@@ -1965,6 +1973,7 @@ current set of functions.
Support for SPDK performance analysis has been added to Intel® VTune™ Amplifier 2018.

This analysis provides:

- I/O performance monitoring (calculating standard I/O metrics like IOPS, throughput, etc.)
- Tuning insights on the interplay of I/O and compute devices by estimating how many cores
  would be reasonable to provide for SPDK to keep up with a current storage workload.
@@ -2115,6 +2124,7 @@ NVMe devices over a network using the iSCSI protocol. The application is located
in app/iscsi_tgt and a documented configuration file can be found at etc/spdk/spdk.conf.in.

This release also significantly improves the existing NVMe over Fabrics target.

- The configuration file format was changed, which will require updates to
  any existing nvmf.conf files (see `etc/spdk/nvmf.conf.in`):
  - `SubsystemGroup` was renamed to `Subsystem`.
@@ -2135,6 +2145,7 @@ This release also significantly improves the existing NVMe over Fabrics target.

This release also adds one new feature and provides some better examples and tools
for the NVMe driver.

- The Weighted Round Robin arbitration method is now supported. This allows
  the user to specify different priorities on a per-I/O-queue basis. To
  enable WRR, set the `arb_mechanism` field during `spdk_nvme_probe()`.
@@ -2215,6 +2226,7 @@ This release adds a user-space driver with support for the Intel I/O Acceleratio
This is the initial open source release of the Storage Performance Development Kit (SPDK).

Features:

- NVMe user-space driver
- NVMe example programs
  - `examples/nvme/perf` tests performance (IOPS) using the NVMe user-space driver
@@ -10,6 +10,7 @@ interrupts, which avoids kernel context switches and eliminates interrupt
handling overhead.

The development kit currently includes:

* [NVMe driver](http://www.spdk.io/doc/nvme.html)
* [I/OAT (DMA engine) driver](http://www.spdk.io/doc/ioat.html)
* [NVMe over Fabrics target](http://www.spdk.io/doc/nvmf.html)
@@ -172,6 +173,7 @@ of the SPDK static ones.

In order to start a SPDK app linked with SPDK shared libraries, make sure
to do the following steps:

- run ldconfig specifying the directory containing SPDK shared libraries
- provide proper `LD_LIBRARY_PATH`

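For example, assuming the libraries were installed under `/usr/local/lib` (the path and the
application name below are assumptions for illustration, not part of this change):

~~~{.sh}
sudo ldconfig /usr/local/lib                      # register the SPDK shared libraries
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
./app/spdk_tgt/spdk_tgt                           # start an app linked against them
~~~
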
@@ -189,8 +189,8 @@ time the SPDK virtual bdev module supports cipher only as follows:

- AESN-NI Multi Buffer Crypto Poll Mode Driver: RTE_CRYPTO_CIPHER_AES128_CBC
- Intel(R) QuickAssist (QAT) Crypto Poll Mode Driver: RTE_CRYPTO_CIPHER_AES128_CBC
  (Note: QAT is functional however is marked as experimental until the hardware has
  been fully integrated with the SPDK CI system.)

In order to support using the bdev block offset (LBA) as the initialization vector (IV),
the crypto module breaks up all I/O into crypto operations of a size equal to the block
doc/blob.md (109 lines changed)
@@ -40,22 +40,22 @@ NAND too.
The Blobstore defines a hierarchy of storage abstractions as follows.

* **Logical Block**: Logical blocks are exposed by the disk itself, which are numbered from 0 to N, where N is the
  number of blocks in the disk. A logical block is typically either 512B or 4KiB.
* **Page**: A page is defined to be a fixed number of logical blocks defined at Blobstore creation time. The logical
  blocks that compose a page are always contiguous. Pages are also numbered from the beginning of the disk such
  that the first page worth of blocks is page 0, the second page is page 1, etc. A page is typically 4KiB in size,
  so this is either 8 or 1 logical blocks in practice. The SSD must be able to perform atomic reads and writes of
  at least the page size.
* **Cluster**: A cluster is a fixed number of pages defined at Blobstore creation time. The pages that compose a cluster
  are always contiguous. Clusters are also numbered from the beginning of the disk, where cluster 0 is the first cluster
  worth of pages, cluster 1 is the second grouping of pages, etc. A cluster is typically 1MiB in size, or 256 pages.
* **Blob**: A blob is an ordered list of clusters. Blobs are manipulated (created, sized, deleted, etc.) by the application
  and persist across power failures and reboots. Applications use a Blobstore provided identifier to access a particular blob.
  Blobs are read and written in units of pages by specifying an offset from the start of the blob. Applications can also
  store metadata in the form of key/value pairs with each blob which we'll refer to as xattrs (extended attributes).
* **Blobstore**: An SSD which has been initialized by a Blobstore-based application is referred to as "a Blobstore." A
  Blobstore owns the entire underlying device which is made up of a private Blobstore metadata region and the collection of
  blobs as managed by the application.

@htmlonly
@@ -115,19 +115,19 @@ For all Blobstore operations regarding atomicity, there is a dependency on the u
operations of at least one page size. Atomicity here can refer to multiple operations:

* **Data Writes**: For the case of data writes, the unit of atomicity is one page. Therefore if a write operation of
  greater than one page is underway and the system suffers a power failure, the data on media will be consistent at a page
  size granularity (if a single page were in the middle of being updated when power was lost, the data at that page location
  will be as it was prior to the start of the write operation following power restoration.)
* **Blob Metadata Updates**: Each blob has its own set of metadata (xattrs, size, etc). For performance reasons, a copy of
  this metadata is kept in RAM and only synchronized with the on-disk version when the application makes an explicit call to
  do so, or when the Blobstore is unloaded. Therefore, setting of an xattr, for example is not consistent until the call to
  synchronize it (covered later) which is, however, performed atomically.
* **Blobstore Metadata Updates**: Blobstore itself has its own metadata which, like per blob metadata, has a copy in both
  RAM and on-disk. Unlike the per blob metadata, however, the Blobstore metadata region is not made consistent via a blob
  synchronization call, it is only synchronized when the Blobstore is properly unloaded via API. Therefore, if the Blobstore
  metadata is updated (blob creation, deletion, resize, etc.) and not unloaded properly, it will need to perform some extra
  steps the next time it is loaded which will take a bit more time than it would have if shutdown cleanly, but there will be
  no inconsistencies.

### Callbacks

@@ -183,22 +183,22 @@ When the Blobstore is initialized, there are multiple configuration options to c
options and their defaults are:

* **Cluster Size**: By default, this value is 1MB. The cluster size is required to be a multiple of page size and should be
  selected based on the application’s usage model in terms of allocation. Recall that blobs are made up of clusters so when
  a blob is allocated/deallocated or changes in size, disk LBAs will be manipulated in groups of cluster size. If the
  application is expecting to deal with mainly very large (always multiple GB) blobs then it may make sense to change the
  cluster size to 1GB for example.
* **Number of Metadata Pages**: By default, Blobstore will assume there can be as many clusters as there are metadata pages
  which is the worst case scenario in terms of metadata usage and can be overridden here however the space efficiency is
  not significant.
* **Maximum Simultaneous Metadata Operations**: Determines how many internally pre-allocated memory structures are set
  aside for performing metadata operations. It is unlikely that changes to this value (default 32) would be desirable.
* **Maximum Simultaneous Operations Per Channel**: Determines how many internally pre-allocated memory structures are set
  aside for channel operations. Changes to this value would be application dependent and best determined by both a knowledge
  of the typical usage model, an understanding of the types of SSDs being used and empirical data. The default is 512.
* **Blobstore Type**: This field is a character array to be used by applications that need to identify whether the
  Blobstore found here is appropriate to claim or not. The default is NULL and unless the application is being deployed in
  an environment where multiple applications using the same disks are at risk of inadvertently using the wrong Blobstore, there
  is no need to set this value. It can, however, be set to any valid set of characters.
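
For illustration, a hedged sketch of tuning these options at initialization time; the option
field names are assumptions recalled from `include/spdk/blob.h` and should be verified there:

~~~{.c}
#include <stdio.h>
#include "spdk/blob.h"

static void
bs_init_done(void *cb_arg, struct spdk_blob_store *bs, int bserrno)
{
	/* 'bs' is ready for use when bserrno == 0 */
}

static void
init_blobstore(struct spdk_bs_dev *bs_dev)
{
	struct spdk_bs_opts opts;

	spdk_bs_opts_init(&opts);
	opts.cluster_sz = 1024 * 1024;     /* Cluster Size (default 1MB) */
	opts.max_md_ops = 32;              /* Maximum Simultaneous Metadata Operations */
	opts.max_channel_ops = 512;        /* Maximum Simultaneous Operations Per Channel */
	/* Blobstore Type, only needed when several applications share the same disks */
	snprintf(opts.bstype.bstype, sizeof(opts.bstype.bstype), "MYAPP");

	spdk_bs_init(bs_dev, &opts, bs_init_done, NULL);
}
~~~
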
### Sub-page Sized Operations

@@ -210,10 +210,11 @@ requires finer granularity it will have to accommodate that itself.
As mentioned earlier, Blobstore can share a single thread with an application or the application
can define any number of threads, within resource constraints, that makes sense. The basic considerations that must be
followed are:

* Metadata operations (API with MD in the name) should be isolated from each other as there is no internal locking on the
  memory structures affected by these API.
* Metadata operations should be isolated from conflicting IO operations (an example of a conflicting IO would be one that is
  reading/writing to an area of a blob that a metadata operation is deallocating).
* Asynchronous callbacks will always take place on the calling thread.
* No assumptions about IO ordering can be made regardless of how many or which threads were involved in the issuing.

@@ -267,18 +268,18 @@ relevant in understanding any kind of structure for what is on the Blobstore.
There are multiple examples of Blobstore usage in the [repo](https://github.com/spdk/spdk):

* **Hello World**: Actually named `hello_blob.c` this is a very basic example of a single threaded application that
  does nothing more than demonstrate the very basic API. Although Blobstore is optimized for NVMe, this example uses
  a RAM disk (malloc) back-end so that it can be executed easily in any development environment. The malloc back-end
  is a `bdev` module thus this example uses not only the SPDK Framework but the `bdev` layer as well.

* **CLI**: The `blobcli.c` example is a command line utility intended to not only serve as example code but as a test
  and development tool for Blobstore itself. It is also a simple single threaded application that relies on both the
  SPDK Framework and the `bdev` layer but offers multiple modes of operation to accomplish some real-world tasks. In
  command mode, it accepts single-shot commands which can be a little time consuming if there are many commands to
  get through as each one will take a few seconds waiting for DPDK initialization. It therefore has a shell mode that
  allows the developer to get to a `blob>` prompt and then very quickly interact with Blobstore with simple commands
  that include the ability to import/export blobs from/to regular files. Lastly there is a scripting mode to automate
  a series of tasks, again, handy for development and/or test type activities.

## Configuration {#blob_pg_config}

@@ -326,15 +327,16 @@ to the unallocated cluster - new extent is chosen. This information is stored in

There are two extent representations on-disk, dependent on `use_extent_table` (default:true) opts used
when creating a blob.

* **use_extent_table=true**: EXTENT_PAGE descriptor is not part of linked list of pages. It contains extents
  that are not run-length encoded. Each extent page is referenced by EXTENT_TABLE descriptor, which is serialized
  as part of linked list of pages. Extent table is run-length encoding all unallocated extent pages.
  Every new cluster allocation updates a single extent page, in case when extent page was previously allocated.
  Otherwise additionally incurs serializing whole linked list of pages for the blob.

* **use_extent_table=false**: EXTENT_RLE descriptor is serialized as part of linked list of pages.
  Extents pointing to contiguous LBA are run-length encoded, including unallocated extents represented by 0.
  Every new cluster allocation incurs serializing whole linked list of pages for the blob.

### Sequences and Batches

@@ -393,5 +395,6 @@ example,
~~~

And for the most part the following conventions are followed throughout:

* functions beginning with an underscore are called internally only
* functions or variables with the letters `cpl` are related to set or callback completions
@@ -20,7 +20,7 @@ properties:
because you don't have to change the data model from the single-threaded
version. You add a lock around the data.
* You can write your program as a synchronous, imperative list of statements
  that you read from top to bottom.
* The scheduler can interrupt threads, allowing for efficient time-sharing
  of CPU resources.

@@ -19,7 +19,7 @@ containerize your SPDK based application.
3. Make sure your host has hugepages enabled
4. Make sure your host has bound your nvme device to your userspace driver
5. Write your Dockerfile. The following is a simple Dockerfile to containerize the nvme `hello_world`
   example:

~~~{.sh}
# start with the latest Fedora
@@ -46,6 +46,7 @@ from the oldest band to the youngest.
The address map and valid map are, along with several other things (e.g. UUID of the device it's
part of, number of surfaced LBAs, band's sequence number, etc.), parts of the band's metadata. The
metadata is split in two parts:

* the head part, containing information already known when opening the band (device's UUID, band's
  sequence number, etc.), located at the beginning blocks of the band,
* the tail part, containing the address map and the valid map, located at the end of the band.
@@ -146,6 +147,7 @@ bdev or OCSSD `nvme` bdev.
Similar to other bdevs, the FTL bdevs can be created either based on JSON config files or via RPC.
Both interfaces require the same arguments which are described by the `--help` option of the
`bdev_ftl_create` RPC call, which are:

- bdev's name
- base bdev's name (base bdev must implement bdev_zone API)
- UUID of the FTL device (if the FTL is to be restored from the SSD)
@@ -161,6 +163,7 @@ on [spdk-3.0.0](https://github.com/spdk/qemu/tree/spdk-3.0.0) branch.

To emulate an Open Channel device, QEMU expects parameters describing the characteristics and
geometry of the SSD:

- `serial` - serial number,
- `lver` - version of the OCSSD standard (0 - disabled, 1 - "1.2", 2 - "2.0"), libftl only supports
  2.0,
@@ -240,6 +243,7 @@ Logical blks per chunk: 24576
```

In order to create FTL on top of an Open Channel SSD, the following steps are required:

1) Attach OCSSD NVMe controller
2) Create OCSSD bdev on the controller attached in step 1 (user could specify parallel unit range
   and create multiple OCSSD bdevs on single OCSSD NVMe controller)
doc/iscsi.md (24 lines changed)
@@ -309,20 +309,20 @@ sde
At the iSCSI level, we provide the following support for Hotplug:

1. bdev/nvme:
   At the bdev/nvme level, we start one hotplug monitor which will call
   spdk_nvme_probe() periodically to get the hotplug events. We provide the
   private attach_cb and remove_cb for spdk_nvme_probe(). For the attach_cb,
   we will create the block device based on the NVMe device attached, and for the
   remove_cb, we will unregister the block device, which will also notify the
   upper level stack (for iSCSI target, the upper level stack is scsi/lun) to
   handle the hot-remove event.

2. scsi/lun:
   When the LUN receives the hot-remove notification from the block device layer,
   the LUN will be marked as removed, and all the IOs after this point will
   return with check condition status. Then the LUN starts one poller which will
   wait for all the commands which have already been submitted to block device to
   return back; after all the commands return back, the LUN will be deleted.

## Known bugs and limitations {#iscsi_hotplug_bugs}

@@ -4950,6 +4950,7 @@ Either UUID or name is used to access logical volume store in RPCs.
A logical volume has a UUID and a name for its identification.
The UUID of the logical volume is generated on creation and can be used as a unique identifier.
The alias of the logical volume takes the format _lvs_name/lvol_name_ where:

* _lvs_name_ is the name of the logical volume store.
* _lvol_name_ is specified on creation and can be renamed.

@@ -6,9 +6,10 @@ Now nvme-cli can support both kernel driver and SPDK user mode driver for most o
Intel specific commands.

1. Clone the nvme-cli repository from the SPDK GitHub fork. Make sure you check out the spdk-1.6 branch.

~~~{.sh}
git clone -b spdk-1.6 https://github.com/spdk/nvme-cli.git
~~~

2. Clone the SPDK repository from https://github.com/spdk/spdk under the nvme-cli folder.

@@ -19,47 +20,51 @@ git clone -b spdk-1.6 https://github.com/spdk/nvme-cli.git
5. Execute "<spdk_folder>/scripts/setup.sh" with the "root" account.

6. Update the "spdk.conf" file under the nvme-cli folder to properly configure the SPDK. Notes are as follows:

~~~{.sh}
spdk=1
Indicates whether or not to use spdk. Can be 0 (off) or 1 (on).
Defaults to 1 which assumes that you have run "<spdk_folder>/scripts/setup.sh", unbinding your drives from the kernel.

core_mask=0x1
A bitmask representing which core(s) to use for nvme-cli operations.
Defaults to core 0.

mem_size=512
The amount of reserved hugepage memory to use for nvme-cli (in MB).
Defaults to 512MB.

shm_id=0
Indicates the shared memory ID for the spdk application with which your NVMe drives are associated,
and should be adjusted accordingly.
Defaults to 0.
~~~

7. Run the "./nvme list" command to get the domain:bus:device.function for each found NVMe SSD.

8. Run the other nvme commands with domain:bus:device.function instead of "/dev/nvmeX" for the specified device.

~~~{.sh}
Example: ./nvme smart-log 0000:01:00.0
~~~

9. Run the "./nvme intel" commands for Intel specific commands against Intel NVMe SSD.

~~~{.sh}
Example: ./nvme intel internal-log 0000:08:00.0
~~~

10. Execute "<spdk_folder>/scripts/setup.sh reset" with the "root" account and update "spdk=0" in spdk.conf to
    use the kernel driver if wanted.

## Use scenarios

### Run as the only SPDK application on the system

1. Modify the spdk to 1 in spdk.conf. If the system has fewer cores or less memory, update the spdk.conf accordingly.

### Run together with other running SPDK applications on shared NVMe SSDs

1. For the other running SPDK application, start with the parameter like "-i 1" to have the same "shm_id".

2. Use the default spdk.conf setting where "shm_id=1" to start the nvme-cli.
@@ -67,21 +72,25 @@ use the kernel driver if wanted.
3. If other SPDK applications run with different shm_id parameter, update the "spdk.conf" accordingly.

### Run with other running SPDK applications on non-shared NVMe SSDs

1. Properly configure the other running SPDK applications.

~~~{.sh}
a. Only access the NVMe SSDs it wants.
b. Allocate a fixed number of memory instead of all available memory.
~~~

2. Properly configure the spdk.conf setting for nvme-cli.

~~~{.sh}
a. Not access the NVMe SSDs from other SPDK applications.
b. Change the mem_size to a proper size.
~~~

## Note

1. To run the newly built nvme-cli, either explicitly run it as "./nvme" or add it to the $PATH to avoid
   invoking another already installed version.

2. To run the newly built nvme-cli with SPDK support in an arbitrary directory, copy "spdk.conf" to that
   directory from the nvme-cli folder and update the configuration as suggested.
doc/nvme.md (27 lines changed)
@@ -249,9 +249,10 @@ DPDK EAL allows different types of processes to be spawned, each with different
on the hugepage memory used by the applications.

There are two types of processes:

1. a primary process which initializes the shared memory and has full privileges and
2. a secondary process which can attach to the primary process by mapping its shared memory
   regions and perform NVMe operations including creating queue pairs.

This feature is enabled by default and is controlled by selecting a value for the shared
memory group ID. This ID is a positive integer and two applications with the same shared
@@ -272,10 +273,10 @@ Example: identical shm_id and non-overlapping core masks

1. Two processes sharing memory may not share any cores in their core mask.
2. If a primary process exits while secondary processes are still running, those processes
   will continue to run. However, a new primary process cannot be created.
3. Applications are responsible for coordinating access to logical blocks.
4. If a process exits unexpectedly, the allocated memory will be released when the last
   process exits.

@sa spdk_nvme_probe, spdk_nvme_ctrlr_process_admin_completions

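As an illustration of the shared-memory-group rules above, two processes could be launched with
the same shm_id but non-overlapping core masks; the binary path and option values below are
assumptions, not taken from this diff:

~~~{.sh}
# primary process: core 0, shared memory group 1
./examples/nvme/perf/perf -q 1 -o 4096 -w read -t 60 -c 0x1 -i 1
# secondary process: same shm_id (-i 1), different core mask
./examples/nvme/perf/perf -q 1 -o 4096 -w read -t 60 -c 0x2 -i 1
~~~
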
@@ -285,18 +286,18 @@ process exits.
At the NVMe driver level, we provide the following support for Hotplug:

1. Hotplug events detection:
   The user of the NVMe library can call spdk_nvme_probe() periodically to detect
   hotplug events. The probe_cb, followed by the attach_cb, will be called for each
   new device detected. The user may optionally also provide a remove_cb that will be
   called if a previously attached NVMe device is no longer present on the system.
   All subsequent I/O to the removed device will return an error.

2. Hot remove NVMe with IO loads:
   When a device is hot removed while I/O is occurring, all access to the PCI BAR will
   result in a SIGBUS error. The NVMe driver automatically handles this case by installing
   a SIGBUS handler and remapping the PCI BAR to a new, placeholder memory location.
   This means I/O in flight during a hot remove will complete with an appropriate error
   code and will not crash the application.

@sa spdk_nvme_probe

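A minimal sketch of the probe/attach/remove flow described above, assuming the callback
signatures from `include/spdk/nvme.h` (the polling interval is an arbitrary choice):

~~~{.c}
#include <stdio.h>
#include <unistd.h>
#include "spdk/nvme.h"

static bool
probe_cb(void *ctx, const struct spdk_nvme_transport_id *trid,
	 struct spdk_nvme_ctrlr_opts *opts)
{
	return true; /* attach to every controller that is found */
}

static void
attach_cb(void *ctx, const struct spdk_nvme_transport_id *trid,
	  struct spdk_nvme_ctrlr *ctrlr, const struct spdk_nvme_ctrlr_opts *opts)
{
	printf("attached controller at %s\n", trid->traddr);
}

static void
remove_cb(void *ctx, struct spdk_nvme_ctrlr *ctrlr)
{
	printf("controller hot-removed\n"); /* stop submitting I/O, then detach */
}

static void
hotplug_poll_loop(void)
{
	for (;;) {
		spdk_nvme_probe(NULL, NULL, probe_cb, attach_cb, remove_cb);
		sleep(1);
	}
}
~~~
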
@@ -201,6 +201,7 @@ NVMe Domain NQN = "nqn.", year, '-', month, '.', reverse domain, ':', utf-8 stri
~~~

Please note that the following types from the definition above are defined elsewhere:

1. utf-8 string: Defined in [rfc 3629](https://tools.ietf.org/html/rfc3629).
2. reverse domain: Equivalent to domain name as defined in [rfc 1034](https://tools.ietf.org/html/rfc1034).

@@ -11,6 +11,7 @@ for the next SPDK release.

All dependencies should be handled by scripts/pkgdep.sh script.
Package dependencies at the moment include:

- configshell

### Run SPDK application instance
@@ -31,6 +31,7 @@ copy the vagrant configuration file (a.k.a. `Vagrantfile`) to it,
and run `vagrant up` with some settings defined by the script arguments.

By default, the VM created is configured with:

- 2 vCPUs
- 4G of RAM
- 2 NICs (1 x NAT - host access, 1 x private network)
@@ -347,9 +347,9 @@ To enable it on Linux, it is required to modify kernel options inside the
virtual machine.

Instructions below for Ubuntu OS:

1. `vi /etc/default/grub`
2. Make sure mq is enabled: `GRUB_CMDLINE_LINUX="scsi_mod.use_blk_mq=1"`
3. `sudo update-grub`
4. Reboot virtual machine

@@ -89,6 +89,7 @@ device (SPDK) can access it directly. The memory can be fragmented into multiple
physically-discontiguous regions and Vhost-user specification puts a limit on
their number - currently 8. The driver sends a single message for each region with
the following data:

* file descriptor - for mmap
* user address - for memory translations in Vhost-user messages (e.g.
  translating vring addresses)
@@ -106,6 +107,7 @@ as they use common SCSI I/O to inquiry the underlying disk(s).

Afterwards, the driver requests the number of maximum supported queues and
starts sending virtqueue data, which consists of:

* unique virtqueue id
* index of the last processed vring descriptor
* vring addresses (from user address space)
@@ -6,8 +6,9 @@ SPDK Virtio driver is a C library that allows communicating with Virtio devices.
It allows any SPDK application to become an initiator for (SPDK) vhost targets.

The driver supports two different usage models:

* PCI - This is the standard mode of operation when used in a guest virtual
  machine, where QEMU has presented the virtio controller as a virtual PCI device.
* vhost-user - Can be used to connect to a vhost socket directly on the same host.

The driver, just like the SPDK @ref vhost, is using pollers instead of standard
@@ -72,21 +72,26 @@ VPP can be configured using a VPP startup file and the `vppctl` command; By defa
Some key values from the iSCSI point of view include:

CPU section (`cpu`):

- `main-core <lcore>` -- logical CPU core used for main thread.
- `corelist-workers <lcore list>` -- logical CPU cores where worker threads are running.

DPDK section (`dpdk`):

- `num-rx-queues <num>` -- number of receive queues.
- `num-tx-queues <num>` -- number of transmit queues.
- `dev <PCI address>` -- whitelisted device.

Session section (`session`):

- `evt_qs_memfd_seg` -- uses a memfd segment for event queues. This is required for SPDK.

Socket server session (`socksvr`):

- `socket-name <path>` -- configure API socket filename (currently SPDK uses default path `/run/vpp-api.sock`).

Plugins section (`plugins`):

- `plugin <plugin name> { [enable|disable] }` -- enable or disable VPP plugin.

### Example:
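Pulling those sections together, a hedged sketch of what the relevant fragment of a VPP startup
file might look like (core numbers, queue counts and the PCI address are assumptions):

~~~
cpu {
  main-core 1
  corelist-workers 2-3
}

dpdk {
  num-rx-queues 1
  num-tx-queues 1
  dev 0000:09:00.1
}

session {
  evt_qs_memfd_seg
}

socksvr {
  socket-name /run/vpp-api.sock
}
~~~
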
@ -60,6 +60,7 @@ other than -t, -s, -n and -a.
## fio

Fio job parameters.

- bs: block size
- qd: io depth
- rw: workload mode
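
These three knobs map directly onto a plain fio job file. A minimal illustrative sketch is shown below; the job name, ioengine, and concrete values are assumptions, not defaults used by the perf scripts.

```
[global]
ioengine=libaio
direct=1

[example-job]
; block size
bs=4k
; queue depth (qd)
iodepth=128
; workload mode
rw=randrw
```
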
@ -8,7 +8,7 @@ The following guide explains how to use the scripts in the `spdk/scripts/vagrant
4. Install and configure [Vagrant 1.9.4](https://www.vagrantup.com) or newer

* Note: The extension pack has different licensing than the main VirtualBox package; please
review it carefully, as the evaluation license is for personal use only.

## Mac OSX Setup (High Sierra)

@ -20,7 +20,8 @@ Quick start instructions for OSX:
4. Install Vagrant Cask

* Note: The extension pack has different licensing than the main VirtualBox package; please
review it carefully, as the evaluation license is for personal use only.

```
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew doctor

@ -38,7 +39,7 @@ review them carefully as the evaluation license is for personal use only.
4. Install and configure [Vagrant 1.9.4](https://www.vagrantup.com) or newer

* Note: The extension pack has different licensing than the main VirtualBox package; please
review it carefully, as the evaluation license is for personal use only.

- Note: VirtualBox requires virtualization to be enabled in the BIOS.
- Note: You should disable Hyper-V on a Windows RS3 laptop. Search for `windows features`, un-check Hyper-V, and restart the laptop.

@ -58,7 +59,7 @@ Following the generic instructions should be sufficient for most Linux distribut
7. rpm -ivh vagrant_2.1.2_x86_64.rpm

* Note: The extension pack has different licensing than the main VirtualBox package; please
review it carefully, as the evaluation license is for personal use only.

## Configure Vagrant

@ -7,6 +7,7 @@ Multiple controllers and namespaces can be exposed to the fuzzer at a time. In o
handle multiple namespaces, the fuzzer will assign a thread to each namespace in round-robin fashion and
submit commands to that thread at a set queue depth (currently 128 for I/O, 16 for Admin). The
application will terminate under three conditions:

1. The user-specified run time expires (see the -t flag).
2. One of the target controllers stops completing I/O operations back to the fuzzer, i.e. a controller timeout.
3. The user specified a json file containing operations to run and the fuzzer has received valid completions for all of them.

@ -14,8 +15,10 @@ application will terminate under three conditions:
# Output

By default, the fuzzer will print commands that:

1. Complete successfully back from the target, or
2. Are outstanding at the time of a controller timeout.

Commands are dumped as named objects in json format which can then be supplied back to the
script for targeted debugging on a subsequent run. See `Debugging` below.
By default, no output is generated when a specific command is returned with a failed status.

@ -14,6 +14,7 @@ Like the NVMe fuzzer, there is an example json file showing the types of request
that the application accepts. Since the vhost application accepts both vhost block
and vhost scsi commands, there are three distinct object types that can be passed in
to the application.

1. vhost_blk_cmd
2. vhost_scsi_cmd
3. vhost_scsi_mgmt_cmd

@ -10,6 +10,7 @@ to emulate an RDMA enabled NIC. NVMe controllers can also be virtualized in emul
## VM Environment Requirements (Host):

- 8 GiB of RAM (for DPDK)
- Enable intel_kvm on the host machine from the BIOS.
- Enable nesting for VMs in the kernel command line (for vhost tests).
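
For the nesting requirement above, a common way to enable nested KVM is sketched below; it assumes an Intel CPU (use `kvm_amd`/`kvm-amd` on AMD systems) and that editing the modprobe configuration or kernel command line is acceptable on your host.

```
# Check whether nesting is already enabled
cat /sys/module/kvm_intel/parameters/nested

# Option 1: add kvm-intel.nested=1 to the kernel command line (e.g. GRUB_CMDLINE_LINUX)
# Option 2: set a modprobe option, effective the next time the module is loaded
echo "options kvm_intel nested=1" | sudo tee /etc/modprobe.d/kvm-nested.conf
```
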
@ -28,6 +29,7 @@ configuration file. For a full list of the variable declarations available for a
`test/common/autotest_common.sh` starting at line 13.

## Steps for Configuring the VM

1. Download a fresh Fedora 26 image.
2. Perform the installation of Fedora 26 server.
3. Create an admin user sys_sgsw (enabling passwordless sudo for this account will make life easier during the tests).
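
One possible way to set up the passwordless sudo mentioned in step 3 is sketched below; it assumes a standard sudoers.d layout and should be adjusted to your own sudo policy.

```
# Run as root on the VM after creating the sys_sgsw user
echo "sys_sgsw ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/sys_sgsw
chmod 0440 /etc/sudoers.d/sys_sgsw
```
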
@ -60,6 +62,7 @@ created above and guest or VM refer to the Ubuntu VM created in this section.
- move .qcow2 file and ssh keys to default locations used by vhost test scripts

Alternatively, it is possible to create the VM image manually using the following steps:

1. Create an image file for the VM. It does not have to be large; about 3.5G should suffice.
2. Create an ssh keypair for host-guest communications (performed on the host):
   - Generate an ssh keypair with the name spdk_vhost_id_rsa and save it in `/root/.ssh`.
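
A sketch of steps 1 and 2 above, assuming `qemu-img` is available on the host and that the image file name is an arbitrary example:

```
# Step 1: create a small image file for the VM
qemu-img create -f qcow2 spdk_test_vm.qcow2 3.5G

# Step 2: generate the host-guest ssh keypair expected by the vhost test scripts
ssh-keygen -t rsa -f /root/.ssh/spdk_vhost_id_rsa -N ""
```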