test/nvmf: Make sure ctrl is fully released

There may be a timing issue where an attempt to rebind the ctrl from
the vfio_pci driver back to nvme, right after issuing the
bdev_nvme_detach_controller() call, fails because vfio_pci might not
have fully released the device yet.

To mitigate, simply kill the application (as it's not needed anymore
at this point) before starting the kernel_target test - this should
give enough time for the device to be properly released.

As a precaution, make setup.sh retry the probe attempt in case it
fails.
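
The retry idea can be sketched in isolation as below. This is only an
illustration under stated assumptions, not the code added to setup.sh:
the retry() helper and its arguments are hypothetical names, while the
limit of 10 retries and the 0.5s delay mirror the values used in the
diff.

```shell
#!/bin/sh
# Hypothetical helper (not part of setup.sh): run a command and, if it
# fails, retry it up to max_attempts more times, sleeping in between.
retry() {
	local max_attempts="$1" delay="$2" attempts=0
	shift 2
	while ! "$@"; do
		# Give up once the retry budget is exhausted.
		[ "$attempts" -lt "$max_attempts" ] || return 1
		attempts=$((attempts + 1))
		sleep "$delay"
	done
	return 0
}

# Assumed usage, mirroring the probe from the diff ($bdf must be set):
# retry 10 0.5 sh -c 'echo "$1" > /sys/bus/pci/drivers_probe' _ "$bdf"
```

Note that setup.sh itself does not rely on the loop's exit status: after
the loop it checks whether the device shows up under the target driver
in sysfs and aborts if it does not.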

Signed-off-by: Michal Berger <michal.berger@intel.com>
Change-Id: Ifc4f4c18a90605154bf33b078575c8b41129f1f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15767
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kamil Godzwon <kamilx.godzwon@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Michal Berger 2022-12-02 19:25:36 +01:00 committed by Tomasz Zawadzki
parent 784b9d4874
commit eab0c6649d
2 changed files with 19 additions and 7 deletions

@@ -150,20 +150,26 @@ function linux_bind_driver() {
return 0
fi
local probe_attempts=0
echo "$driver_name" > "/sys/bus/pci/devices/$bdf/driver_override"
echo "$bdf" > "/sys/bus/pci/drivers_probe"
while ! echo "$bdf" > "/sys/bus/pci/drivers_probe" && ((probe_attempts++ < 10)); do
pci_dev_echo "$bdf" "failed to bind to $driver_name, retrying ($probe_attempts)"
sleep 0.5
done 2> /dev/null
echo "" > "/sys/bus/pci/devices/$bdf/driver_override"
if [[ $driver_name == uio_pci_generic ]] && ! check_for_driver igb_uio; then
# Check if the uio_pci_generic driver is broken as it might be in
# some 4.18.x kernels (see centos8 for instance) - if our device
# didn't get a proper uio entry, fallback to igb_uio
if [[ ! -e /sys/bus/pci/devices/$bdf/uio ]]; then
if [[ ! -e /sys/bus/pci/drivers/$driver_name/$bdf ]]; then
if [[ $driver_name == uio_pci_generic ]] && ! check_for_driver igb_uio; then
# uio_pci_generic driver might be broken in some 4.18.x kernels (see
# centos8 for instance) so try to fallback to igb_uio.
pci_dev_echo "$bdf" "uio_pci_generic potentially broken, moving to igb_uio"
drivers_d["$bdf"]="no driver"
# This call will override $driver_name for remaining devices as well
linux_bind_driver "$bdf" igb_uio
return
fi
pci_dev_echo "$bdf" "failed to bind to $driver_name, aborting"
return 1
fi
iommu_group=$(basename $(readlink -f /sys/bus/pci/devices/$bdf/iommu_group))

@@ -56,6 +56,12 @@ spdk_target() {
rpc_cmd nvmf_delete_subsystem "$subnqn"
rpc_cmd bdev_nvme_detach_controller "$name"
# Make sure we fully detached from the ctrl as vfio-pci won't be able to release the
# device otherwise - we can either wait a bit or simply kill the app. Since we don't
# really need it at this point, reap it but leave the net setup around. See:
# https://github.com/spdk/spdk/issues/2811
killprocess "$nvmfpid"
}
kernel_target() {