Commit Graph

16 Commits

Author SHA1 Message Date
Jim Harris
2e6aac525c bdev/raid: use split_on_optimal_io_boundary
Set the bdev->optimal_io_boundary to the strip size, and
set split_on_optimal_io_boundary = true.  This will ensure
that all I/O submitted to the raid module do not cross
a strip boundary, meaning it does not need to be split
across multiple member disks.

This is a step towards removing the iovcnt == 1
limitation.  Further improvements and simplifications
will be made in future patches before removing this
restriction.

Unit tests need to be adjusted here to not span
boundaries either.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I08943805def673288f552a1b7662a4fbe16f25eb

Reviewed-on: https://review.gerrithub.io/423323
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-08-31 17:53:42 +00:00
Shuhei Matsumoto
a295285bf5 bdev/raid: Use descriptive error message for failure in JSON-RPC
Change-Id: I238c2ba43c77f47851aa9d28818d42e8d865f2d4
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423619
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
2018-08-30 16:13:35 +00:00
Shuhei Matsumoto
0a46f5c2aa bdev/raid: Extract check claim operation from add base bdev operation
When a raid bdev is constructed by JSON-RPC construct_raid_bdev,
information about config and slot can be passed to
raid_bdev_add_base_device() and raid_bdev_can_claim() doesn't have
to be called.

Hence extract raid_bdev_can_claim_bdev() from raid_bdev_add_base_device()
and put it to raid_bdev_examine().

Change-Id: I92e02bf3661cb97b691246f32198ba946810d96c
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
2018-08-30 16:13:35 +00:00
Shuhei Matsumoto
636b9d597c bdev/raid: Simplify variable name in struct raid_base_bdev_config
raid_base_bdev_config is suffciently descriptive as name and so
"name" can be used as the name of the base bdev.

Other structs have used "name" as the name of bdev.

Change-Id: I090ef541a84fd14d9ec3bfebbdc96ae649709551
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423614
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
2018-08-30 16:13:35 +00:00
Shuhei Matsumoto
6058327b10 bdev/raid: Use more compact name for TAILQ_ENTRY of raid_bdev
According to the purpose of the lists, state_link and global_link
will be enough to understand the purpose well.

Besides, in raid_bdev_remove_base_bdev(), if any entry is not removed
during iteration, TAILQ_FOREACH_SAFE() is not necessary. Hence
change TAILQ_FOREACH_SAFE to TAILQ_FOREACH for this case.

Change-Id: I3022c58faf96721df9241e07dbb5a06d7de89e70
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423056
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-08-23 16:35:31 +00:00
Shuhei Matsumoto
a4b975244d bdev/raid: Simplify variable name in struct raid_bdev
struct raid_bdev has a pointer raid_bdev_config to its configuration,
but raid_bdev is duplicated between raid_bdev and raid_bdev_config
and just config may be enough as a name.

Change-Id: I77f6a69b913cd540c22f05803c486a4caf103f24
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422923
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
2018-08-23 16:35:31 +00:00
Shuhei Matsumoto
ab9661c990 bdev/raid: Simplify variable name in struct raid_bdev_io_channel
base_bdevs_io_channel is good but base_channel may be enough and
fit other bdev modules.

Change-Id: I67a1d224f1ef4ca1fc048b4325333f2552a37150
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422921
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
2018-08-23 16:35:31 +00:00
Shuhei Matsumoto
c7756fe71d bdev/raid: Simplify variable names in struct raid_base_bdev_info
raid_base_bdev_info is sufficiently descriptive as name and so
names of member variables in raid_base_bdev_info can be simplified.

Change-Id: I759a758ec0f3ac21de767ad379a902f1b04cc66d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422919
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
2018-08-23 16:35:31 +00:00
Jim Harris
aec9c3cded test: use spdk_unittest.mk for raid unit tests
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I925bf71414fe74868eb14faf365e09258fea87eb

Reviewed-on: https://review.gerrithub.io/421177
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
2018-08-15 23:32:04 +00:00
Shuhei Matsumoto
57dd0d5085 bdev/raid: Get struct raid_bdev from struct spdk_bdev_io in IO submission
During IO submission, raid_bdev can be get by bdev_io->bdev->ctxt.
Hence holding raid_bdev in raid_bdev_io_channel is duplicated.

Change-Id: I722432718aca8c5846541816b6ecca56821d77f6
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/421182
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2018-08-06 18:14:30 +00:00
Shuhei Matsumoto
056980c10c bdev/raid: Embed struct raid_bdev_ctxt into struct raid_bdev
Locating spdk_bdev structure to the beginning of raid_bdev structure
will simplify the hierarchy and match other bdev modules.

Change-Id: I1bfbf773bc96a4f144e6bff772ade05bb42762e9
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/420818
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-08-06 18:14:30 +00:00
Shuhei Matsumoto
e923c66a07 bdev/raid: Free all strings allocated by JSON decoder on all paths in RPC
Use strdup for all strings allocated by JSON decoder and free them on
all paths.

Change-Id: Ia30d32c2fd6bd79f62588ca19bd3bb093d38a502
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/420325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
2018-07-30 15:44:27 +00:00
Shuhei Matsumoto
90f54670bc bdev/raid: Use TAILQ to manage RAID bdev configuration
Change-Id: I971df38c577284916212e58b77c73499127a980a
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/420217
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-07-30 15:44:27 +00:00
Shuhei Matsumoto
32cc0199d1 bdev/raid: Use single character variables to index and loop iterator
Change-Id: I1023a509f938fcbe6ac0c35dca5192c36990de91
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/420216
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-07-30 15:44:27 +00:00
Jim Harris
c60ae3c356 raid: get buffer on reads before sending child ios
For front-ends like iSCSI (and NVMe-oF in the future)
which want the backend to specify the data buffer, the
RAID module doesn't copy the pointer to the allocated
buffer from the child IO back to the parent IO.  It
really can't copy the pointer - the child IO owns it
and will free it.

So the RAID module needs to allocate the buffer first
and then pass it down.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3b1eceac9b1cdd26130e59e1d400c9869a19f881

Reviewed-on: https://review.gerrithub.io/420677
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-07-30 14:59:45 +00:00
Kunal Sablok
41586b0f1d bdev: add raid bdev module
Raid module:
============
- SPDK raid bdev module is a new bdev module which is
  responsible for striping various NVMe devices and expose the raid bdev
  to bdev layer which would enhance the performance and capacity.
- It can support theoretically 256 base devices (currently it is being
  tested max upto 8 base devices)
- Multiple strip sizes like 32KB, 64KB, 128KB, 256KB, 512KB etc is
  supported. Most of the current testing is focused on 64KB strip size.
- New RPC commands like "create raid bdev", "destroy raid bdev" and "get raid bdevs"
  are introduced to configure raid bdev dynamically in a running
SPDK system.
- Currently raid bdev configuration parameters are persisted in the
  current SPDK configuration file for across reboot support. DDF will be
introduced later.

High level testing done:
=======================
- Raid bdev is created with 8 base NVMe devices via configuration
  file and is exposed to initiator via existing methods. Initiator is
able to see a single NVMe namespace with capacity equal to sum of the
minimum capacities of 8 devices. Initiator was able to run raw
read/write workload, file system workload etc (tested with XFS file
system workload).
- Multiple raid bdevs are also created and exposed to initiator and
  tested with file system and other workloads for read/write IO.
- LVS / LVOL are created over raid bdev and exposed to initiator.
  Testing was done for raw read/write workloads and XFS file system
workloads.
- RPC testing is done where on the running SPDK system raid bdevs
  are created out of NVMe base devices. These raid bdevs (and LVOLs
over raid bdevs) are then exposed to initiator and IO workload was
tested for raw read/write and XFS file system workload.
- RPC testing is done for delete raid bdevs where all raid bdevs
  are deleted in running SPDK system.
- RPC testing is done for get raid bdevs where existing list of
  raid bdev names is printed (it can be all raid bdevs or only
online or only configuring or only offline).
- RPC testing is done where raid bdevs and underlying NVMe devices
  relationship was returned in JSON RPC commands

Change-Id: I10ae1266f8f2cca3c106e4df8c1c0993ddf435d8
Signed-off-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-on: https://review.gerrithub.io/410484
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-07-16 20:50:40 +00:00