Activate a dr volume as long as there is a ready replica. Ref: 1512 Signed-off-by: James Lu <james.lu@suse.com>
2.5 KiB
Forcibly Activate A Restoring/DR Volume
Summary
When users try to activate a restoring/DR volume with some replicas failed for some reasons, the volume will be stuck in attaching state and users can not do anything except deleting the volume. However the volume still can be rebuilt back to be normal as long as there is a healthy replica.
To improve user experience, Longhorn should forcibly activate a restoring/DR volume if there is a healthy replica and users allow the creation of a degraded volume.
Related Issues
https://github.com/longhorn/longhorn/issues/1512
Motivation
Goals
Allow users to activate a restoring/DR volume as long as there is a healthy replica and the volume works well.
Non-goals [optional]
None
Proposal
Forcibly activate a restoring/DR volume if there is a healthy replica and users enable the global setting allow-volume-creation-with-degraded-availability
.
User Stories
Users can activate a restoring/DR volume by the CLI kubectl
or Longhorn UI and the volume could work well.
User Experience In Detail
Prerequisites
Set up two Kubernetes clusters. These will be called cluster A and cluster B. Install Longhorn on both clusters, and set the same backup target on both clusters.
- In the cluster A, make sure the original volume X has a backup created or has recurring backups scheduled.
- In backup page of cluster B, choose the backup volume X, then create disaster recovery volume Y.
Kubectl
User set volume.spec.Standby
to false
by editing the volume CR or the manifest creating the volume to activate the volume.
Longhorn UI
UI has click Activate Disaster Recovery Volume
button in Volume
or Volume Details
pages to activate the volume.
API changes
None
Design
Implementation Overview
- Check if
volume.Spec.Standby
is set tofalse
- Get the global setting
allow-volume-creation-with-degraded-availability
- Activate the DR volume if
allow-volume-creation-with-degraded-availability
is set totrue
and there are one or more ready replicas.
Test plan
Test Forcibly Activated A Restoring/DR Volume
- Create a DR volume
- Set the global setting
concurrent-replica-rebuild-per-node-limit
to be 0 - Failed some replicas
- Check if there is at least one healthy replica
- Call the API
activate
- The volume could be activated
- Attach the volume to a node and check if data is correct
Upgrade strategy
None
Note [optional]
None