The LEP describes the design of recording and restoring recurring jobs into/from the backup volume on the backup target. Longhorn 2227 Signed-off-by: James Lu <james.lu@suse.com>
Record Recurring Jobs in the Backup Volume
Summary
Provide a way for users to record recurring jobs/groups during backup creation and restore them during backup restoration. The feature backs up all recurring jobs/groups of the volume into the backup volume configuration on the backup target, and restores all of them when users create a volume from a backup whose backup volume has recurring jobs/groups stored.
Related Issues
https://github.com/longhorn/longhorn/issues/2227
Motivation
Goals
- Back up or update all recurring jobs/groups to the backup volume during the backup creation.
- Optionally create recurring jobs/groups and bind them to the volume restored from a backup.
- Remain backward compatible with existing backups that have no recurring job information.
Non-goals [optional]
- Backing up or restoring a specific recurring job/group is not supported.
- A DR volume will not restore recurring jobs/groups during a backup restoration.
Proposal
- Add a global boolean setting restore-volume-recurring-jobs. Default value is false. When users create a volume from a backup and this setting is true, it will automatically restore all recurring jobs/groups stored in the backup volume.
- Add a customized string parameter RestoreVolumeRecurringJob in the Volume CR. Default value is "ignored". "enabled" is to restore recurring jobs/groups. By contrast, "disabled" is not to restore. Users can override the default behavior during the restoration at runtime by this parameter.
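The interaction between the global setting and the per-volume parameter can be sketched in Go as follows. This is a minimal illustration, not Longhorn's implementation: the helper name shouldRestoreRecurringJobs and the constants are hypothetical, and the sketch assumes that "ignored" (or any unrecognized value) means the volume defers to the global setting.

```go
package main

import "fmt"

// Values of the per-volume parameter described in the proposal.
const (
	RestoreIgnored  = "ignored"
	RestoreEnabled  = "enabled"
	RestoreDisabled = "disabled"
)

// shouldRestoreRecurringJobs is a hypothetical helper, not Longhorn code.
// Assumption: "ignored" and unrecognized values defer to the global setting.
func shouldRestoreRecurringJobs(globalSetting bool, volumeOverride string) bool {
	switch volumeOverride {
	case RestoreEnabled:
		return true
	case RestoreDisabled:
		return false
	default:
		return globalSetting
	}
}

func main() {
	// Print the full decision table for both inputs.
	for _, override := range []string{RestoreIgnored, RestoreEnabled, RestoreDisabled} {
		for _, global := range []bool{false, true} {
			fmt.Printf("global=%-5t override=%-8s restore=%t\n",
				global, override, shouldRestoreRecurringJobs(global, override))
		}
	}
}
```

Keeping the override as a three-valued string rather than a boolean lets a volume stay neutral, so changing the global setting later still affects volumes that never set the parameter.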
User Stories
Story 1
Users can create recurring jobs simply by restoring a backup created by another Longhorn system, and then continue to back up the restored volume to the backup target with the same recurring job settings.
Story 2
When users delete recurring jobs of a volume by accident, they can restore the recurring jobs from the backup volume by restoring a backup instead of creating them again manually.
User Experience In Detail
Via Longhorn GUI
- Users can set restore-volume-recurring-jobs to true on the Settings page.
- When users restore a backup to create a volume, they can see the recurring jobs/groups are restored and enabled automatically on the volume details page.
- Users can check the checkbox enabled or disabled to override the global setting of restoring recurring jobs/groups.
Via kubectl
- Users can use the command kubectl -n longhorn-system edit settings to set restore-volume-recurring-jobs to true.
- Users can set Volume.spec.restoreVolumeRecurringJob to enabled or disabled to override the global setting of restoring recurring jobs/groups when creating a volume from a backup.
- When users create a volume by restoring a backup, they can see the recurring jobs/groups are restored as RecurringJob CRs and labeled in the Volume CR.
...
kind: Volume
metadata:
  labels:
    longhornvolume: restore-demo
  name: restore-demo
  namespace: longhorn-system
spec:
  restoreVolumeRecurringJob: "enabled"
  fromBackup: "nfs://nfs-sever.com:/opt/shared-path/?backup=backup-f6d9b9caa9444543&volume=backup1"
...
API changes
Add a string parameter RestoreVolumeRecurringJob to the Volume struct utilized by the HTTP client. This ends up being stored in Volume.spec.restoreVolumeRecurringJob of the volume CR.
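As a rough illustration of how the parameter could travel from the API request into the volume spec, consider the Go sketch below. The two structs are trimmed, hypothetical stand-ins (the real Volume struct in api/model.go has many more fields), the JSON tag names are inferred from the CR path Volume.spec.restoreVolumeRecurringJob, and toSpec is an illustrative helper that applies the "ignored" default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Volume is a trimmed, hypothetical stand-in for the API model struct.
type Volume struct {
	Name                      string `json:"name"`
	FromBackup                string `json:"fromBackup"`
	RestoreVolumeRecurringJob string `json:"restoreVolumeRecurringJob"`
}

// VolumeSpec is a hypothetical subset of the Volume CR spec.
type VolumeSpec struct {
	FromBackup                string `json:"fromBackup"`
	RestoreVolumeRecurringJob string `json:"restoreVolumeRecurringJob"`
}

// toSpec copies the request value into the spec and falls back to
// "ignored" when the client did not send the parameter.
func toSpec(v Volume) VolumeSpec {
	mode := v.RestoreVolumeRecurringJob
	if mode == "" {
		mode = "ignored"
	}
	return VolumeSpec{FromBackup: v.FromBackup, RestoreVolumeRecurringJob: mode}
}

func main() {
	// "<backup-url>" is a placeholder, not a real backup URL.
	body := []byte(`{"name":"restore-demo","fromBackup":"<backup-url>","restoreVolumeRecurringJob":"enabled"}`)
	var req Volume
	if err := json.Unmarshal(body, &req); err != nil {
		panic(err)
	}
	out, err := json.Marshal(toSpec(req))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```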
Design
Implementation Overview
- Add a global boolean setting restore-volume-recurring-jobs. Default value is false. It will restore all recurring jobs/groups of the backup volume during a backup restoration if this setting is true.
- Add the parameter RestoreVolumeRecurringJob into the Volume struct of api/model.go and the volume CR. Default value is "ignored".
- Store all recurring jobs information of the volume into the backup volume configuration on the backup target during the backup creation.
  - Previously, the "RecurringJob":"c-jaim49" information was saved in the spec.labels of the backup CR to show that the backup was created by a recurring job. This information is also stored in the backup volume configuration on the backup target and updated to status.labels of the backup volume CR, but it only contains the recurring job name and it changes whenever any recurring job creates a backup.
  - Now the details of recurring jobs/groups are backed up into the backup volume configuration on the backup target and synchronized to status.labels of the backup volume CR. When users need to restore recurring jobs/groups to the current Longhorn system or another one, the recurring jobs/groups configuration is read from the backup volume CR.
                         Backup Controller
        queue            ┌───────────────┐         ┌───────────────────────┐
        ┌┐ ┌┐ ┌┐         │               │         │                       │
    ... ││ ││ ││ ──────► │      ...      │ ──────► │      reconcile()      │
        └┘ └┘ └┘         │               │         │                       │
                         └───────────────┘         └──────────┬────────────┘
                                                              │                      instance-manager
                                                   ┌──────────▼────────────┐         ┌──────────────────────┐        ┌──────────────────────┐
                                                   │                       │         │                      │        │                      │
                                                   │ enableBackupMonitor() │ ──────► │ NewBackupMonitor()   │ ... ─► │ SnapshotBackup()     │ ...
                                                   │                       │         │                      │        │                      │
                                                   └───────────────────────┘         └──────────────────────┘        └──────────────────────┘
  - The backup_controller will be responsible for collecting recurring jobs information and sending it to the backup monitor when it detects that a new backup CR has been created.
  - The backup_monitor will put recurring jobs information with a new key VolumeRecurringJobInfo into the spec.labels of the backup CR and trigger the backup creation.
  - Recurring jobs information in the labels will be stored into the backup volume configuration by backupstore.
Example of recurring jobs/groups information stored in the backup volume configuration.
{
  ...,
  "Labels": {
    "RecurringJob":"c-jaim49",
    "VolumeRecurringJobInfo": "{
      \"c-jaim49\": {
        \"jobSpec\": {\"name\":\"c-jaim49\",\"task\":\"backup\",\"cron\":\"0/1 * * * *\",\"retain\":3,\"concurrency\":1},
        \"fromGroup\":null,
        \"fromJob\":true
      },
      \"c-qakbzx\": {
        \"jobSpec\":{\"name\":\"c-qakbzx\",\"groups\":[\"default\"],\"task\":\"backup\",\"cron\":\"0 0 * * *\",\"retain\":5,\"concurrency\":3},
        \"fromGroup\":[\"default\"],
        \"fromJob\":false
      },
      \"c-ua7pxz\": {
        \"jobSpec\":{\"name\":\"c-ua7pxz\",\"groups\":[\"testgroup01\"],\"task\":\"backup\",\"cron\":\"0/10 0/2 * * *\",\"retain\":3,\"concurrency\":3},
        \"fromGroup\":[\"testgroup01\"],
        \"fromJob\":true
      }
    }",
    "longhorn.io/volume-access-mode":"rwo"
  },
  ...,
}
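To make the label format concrete, the following Go sketch encodes a map of recurring jobs into the VolumeRecurringJobInfo label and decodes it again. The struct shapes and JSON tags are taken from the example configuration; the type and function names (JobSpec, JobInfo, encodeJobInfo, decodeJobInfo) are hypothetical and may differ from Longhorn's own code. The decoder treats a missing label as "no recurring jobs", which is one way backups created before this feature could stay restorable.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Label key as it appears in the example configuration.
const jobInfoLabel = "VolumeRecurringJobInfo"

// JobSpec mirrors the "jobSpec" object of the example.
type JobSpec struct {
	Name        string   `json:"name"`
	Groups      []string `json:"groups,omitempty"`
	Task        string   `json:"task"`
	Cron        string   `json:"cron"`
	Retain      int      `json:"retain"`
	Concurrency int      `json:"concurrency"`
}

// JobInfo mirrors one entry of the label value.
type JobInfo struct {
	JobSpec   JobSpec  `json:"jobSpec"`
	FromGroup []string `json:"fromGroup"`
	FromJob   bool     `json:"fromJob"`
}

// encodeJobInfo stores the job map as a JSON string under jobInfoLabel.
func encodeJobInfo(labels map[string]string, jobs map[string]JobInfo) error {
	data, err := json.Marshal(jobs)
	if err != nil {
		return err
	}
	labels[jobInfoLabel] = string(data)
	return nil
}

// decodeJobInfo returns nil, nil when the label is absent or empty, so a
// backup without recurring job information restores without any jobs.
func decodeJobInfo(labels map[string]string) (map[string]JobInfo, error) {
	raw, ok := labels[jobInfoLabel]
	if !ok || raw == "" {
		return nil, nil
	}
	jobs := map[string]JobInfo{}
	if err := json.Unmarshal([]byte(raw), &jobs); err != nil {
		return nil, fmt.Errorf("decode %s: %w", jobInfoLabel, err)
	}
	return jobs, nil
}

func main() {
	labels := map[string]string{"RecurringJob": "c-jaim49"}
	jobs := map[string]JobInfo{
		"c-qakbzx": {
			JobSpec: JobSpec{Name: "c-qakbzx", Groups: []string{"default"},
				Task: "backup", Cron: "0 0 * * *", Retain: 5, Concurrency: 3},
			FromGroup: []string{"default"},
		},
	}
	if err := encodeJobInfo(labels, jobs); err != nil {
		panic(err)
	}
	restored, err := decodeJobInfo(labels)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", restored["c-qakbzx"])
}
```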
- Create all recurring jobs if they do not exist when restoring a backup with the setting restore-volume-recurring-jobs being true or Volume.spec.restoreVolumeRecurringJob being "enabled".

                         Volume Controller
        queue            ┌───────────────┐         ┌───────────────────────┐
        ┌┐ ┌┐ ┌┐         │               │         │                       │
    ... ││ ││ ││ ──────► │      ...      │ ──────► │     syncVolume()      │
        └┘ └┘ └┘         │               │         │                       │
                         └───────────────┘         └──────────┬────────────┘
                                                              │
                                                  ┌───────────▼─────────────┐
                                                  │                         │
                                                  │ updateRecurringJobs()   │
                                                  │                         │
                                                  └─────────────────────────┘
  - Before a restoration starts, create all recurring jobs obtained from the backup volume CR if they do not exist or their configuration is different, and set the volume labels of the recurring jobs to "enabled".
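The restore step described above (create jobs that are missing or differ, then label the volume) can be sketched in Go. This is an illustration under stated assumptions rather than the actual updateRecurringJobs() logic: planRestore and the types are hypothetical, the label key prefixes are assumed rather than taken from this document, and the sketch further assumes that fromJob maps to a per-job label while each fromGroup entry maps to a per-group label.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// JobSpec and JobInfo are hypothetical in-memory forms of the stored data.
type JobSpec struct {
	Name        string
	Groups      []string
	Task        string
	Cron        string
	Retain      int
	Concurrency int
}

type JobInfo struct {
	JobSpec   JobSpec
	FromGroup []string
	FromJob   bool
}

// Assumed label key prefixes; not taken from this document.
const (
	jobLabelPrefix   = "recurring-job.longhorn.io/"
	groupLabelPrefix = "recurring-job-group.longhorn.io/"
)

// planRestore returns the sorted names of jobs that must be created or
// updated, plus the volume labels to set to "enabled".
func planRestore(existing map[string]JobSpec, fromBackup map[string]JobInfo) ([]string, map[string]string) {
	toApply := []string{}
	labels := map[string]string{}
	for name, info := range fromBackup {
		have, found := existing[name]
		// A job is applied when it is missing or its configuration differs.
		if !found || !reflect.DeepEqual(have, info.JobSpec) {
			toApply = append(toApply, name)
		}
		if info.FromJob {
			labels[jobLabelPrefix+name] = "enabled"
		}
		for _, group := range info.FromGroup {
			labels[groupLabelPrefix+group] = "enabled"
		}
	}
	sort.Strings(toApply)
	return toApply, labels
}

func main() {
	daily := JobSpec{Name: "c-qakbzx", Groups: []string{"default"}, Task: "backup", Cron: "0 0 * * *", Retain: 5, Concurrency: 3}
	minutely := JobSpec{Name: "c-jaim49", Task: "backup", Cron: "0/1 * * * *", Retain: 3, Concurrency: 1}
	existing := map[string]JobSpec{"c-qakbzx": daily}
	fromBackup := map[string]JobInfo{
		"c-qakbzx": {JobSpec: daily, FromGroup: []string{"default"}},
		"c-jaim49": {JobSpec: minutely, FromJob: true},
	}
	toApply, labels := planRestore(existing, fromBackup)
	fmt.Println(toApply)
	fmt.Println(labels)
}
```

Note that reflect.DeepEqual distinguishes a nil slice from an empty one, so a real comparison would likely normalize the Groups field first.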
Test plan
Prepare
- Create a volume and attach it to a node or a workload.
- Create some recurring jobs (some are in groups)
- Label the volume with created recurring jobs (some are in groups)
- Create a backup or wait for a recurring job to start.
- Wait for the backup creation to complete.
- Check that the recurring jobs/groups information is stored in the backup volume configuration on the backup target.
Recurring Jobs exist
- Create a volume from the backup just created.
- Check that the volume has labels of recurring jobs and groups.
Recurring Jobs do not exist
- Delete recurring jobs that are already stored in the backup volume on the backup target.
- Create a volume from the backup just created.
- Check if recurring jobs have been created.
- Check that the restored volume has labels of recurring jobs and groups.
Upgrade strategy
This enhancement doesn't require an upgrade strategy.