Recurring Snapshot Cleanup
Summary
Currently, Longhorn's recurring job automatically cleans up older snapshots of volumes so that no more than the configured number of snapshots is retained. However, this applies only to snapshots created by the recurring job itself. Non-recurring volume snapshots and snapshots created by backups must be cleaned up manually by the user.
A periodic snapshot cleanup would delete or purge those extra snapshots regardless of how they were created.
Related Issues
https://github.com/longhorn/longhorn/issues/3836
Motivation
Goals
Introduce new recurring job types:
- snapshot-delete: periodically remove and purge all kinds of snapshots that exceed the retention count.
- snapshot-cleanup: periodically purge removable or system snapshots.
Non-goals [optional]
None
Proposal
- Introduce two new RecurringJobType values:
  - snapshot-delete
  - snapshot-cleanup
- A RecurringJob using the snapshot-delete task type periodically deletes and purges snapshots. Longhorn retains snapshots based on the given retain number.
- A RecurringJob using the snapshot-cleanup task type periodically purges snapshots.
User Stories
- The user can create a RecurringJob with spec.task=snapshot-delete to instruct Longhorn to periodically delete and purge snapshots.
- The user can create a RecurringJob with spec.task=snapshot-cleanup to instruct Longhorn to periodically purge removable or system snapshots.
User Experience In Detail
Recurring Snapshot Deletion
- Have some volume backups and snapshots.
- Create RecurringJob with the snapshot-delete task type.

      apiVersion: longhorn.io/v1beta2
      kind: RecurringJob
      metadata:
        name: recurring-snap-delete-per-min
        namespace: longhorn-system
      spec:
        concurrency: 1
        cron: '* * * * *'
        groups: []
        labels: {}
        name: recurring-snap-delete-per-min
        retain: 2
        task: snapshot-delete

- Assign the RecurringJob to volume.
- Longhorn deletes all expired snapshots. As a result of the above example, the user will see two snapshots after the job completes.
Recurring Snapshot Cleanup
- Have some system snapshots.
- Create RecurringJob with the snapshot-cleanup task type.

      apiVersion: longhorn.io/v1beta2
      kind: RecurringJob
      metadata:
        name: recurring-snap-cleanup-per-min
        namespace: longhorn-system
      spec:
        concurrency: 1
        cron: '* * * * *'
        groups: []
        labels: {}
        name: recurring-snap-cleanup-per-min
        task: snapshot-cleanup

- Assign the RecurringJob to volume.
- Longhorn deletes all expired system snapshots. As a result of the above example, the user will see no system snapshots after the job completes.
API changes
None
Design
Implementation Overview
The RecurringJob snapshot-delete Task Type
- List all expired snapshots (similar to the current listSnapshotNamesForCleanup implementation), and use them as the cleanupSnapshotNames in doSnapshotCleanup.
- Continue with the current implementation to purge snapshots.
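The listing step above can be sketched as follows. This is a minimal illustration rather than Longhorn's actual code: the Snapshot type and the expiredSnapshotNames function are hypothetical names, and the sketch assumes that "expired" means every snapshot older than the newest retain snapshots.

```go
package main

import "sort"

// Snapshot is a hypothetical, simplified view of a volume snapshot.
type Snapshot struct {
	Name    string
	Created int64 // creation time as Unix seconds
}

// expiredSnapshotNames returns the names of all snapshots except the
// newest `retain` ones, ordered oldest first. The input slice is not modified.
func expiredSnapshotNames(snapshots []Snapshot, retain int) []string {
	if retain < 0 {
		retain = 0
	}
	if len(snapshots) <= retain {
		return nil
	}
	// Sort a copy by creation time so the caller's slice keeps its order.
	sorted := make([]Snapshot, len(snapshots))
	copy(sorted, snapshots)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Created < sorted[j].Created
	})
	expired := sorted[:len(sorted)-retain]
	names := make([]string, 0, len(expired))
	for _, s := range expired {
		names = append(names, s.Name)
	}
	return names
}
```

With four snapshots and retain set to 2, the two oldest names are returned, which would then play the role of cleanupSnapshotNames.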
The RecurringJob snapshot-cleanup Task Type
- Do snapshot purge only in doSnapshotCleanup.
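The difference between the two task types can be sketched as a single dispatch: snapshot-delete deletes the expired snapshots and then purges, while snapshot-cleanup purges only. The SnapshotClient interface, the recordingClient fake, and runSnapshotTask are hypothetical names introduced for illustration; they are not Longhorn's API.

```go
package main

import "fmt"

// SnapshotClient is a hypothetical abstraction over the volume operations
// that a recurring job would need.
type SnapshotClient interface {
	Delete(name string) error
	Purge() error
}

// recordingClient is a fake client that records the calls it receives.
type recordingClient struct {
	Calls []string
}

func (r *recordingClient) Delete(name string) error {
	r.Calls = append(r.Calls, "delete:"+name)
	return nil
}

func (r *recordingClient) Purge() error {
	r.Calls = append(r.Calls, "purge")
	return nil
}

// runSnapshotTask deletes the expired snapshots for snapshot-delete,
// skips deletion for snapshot-cleanup, and purges in both cases.
func runSnapshotTask(c SnapshotClient, task string, expired []string) error {
	switch task {
	case "snapshot-delete":
		for _, name := range expired {
			if err := c.Delete(name); err != nil {
				return fmt.Errorf("delete snapshot %q: %w", name, err)
			}
		}
	case "snapshot-cleanup":
		// Purge only: no snapshot is explicitly deleted here.
	default:
		return fmt.Errorf("unsupported task type %q", task)
	}
	return c.Purge()
}
```

Keeping the purge as the shared final step mirrors the proposal's intent that both task types reuse the existing purge path.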
RecurringJob Mutate
- Mutate RecurringJob.Spec.Retain to 0 when the task type is snapshot-cleanup, since the retain value has no effect on the purge.
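A minimal sketch of this mutation, assuming a simplified spec with only the two relevant fields. The RecurringJobSpec struct and the mutateRetain function below are hypothetical stand-ins, not the real Longhorn types or webhook code.

```go
package main

// RecurringJobSpec is a hypothetical, reduced version of the RecurringJob
// spec that keeps only the fields relevant to this mutation.
type RecurringJobSpec struct {
	Task   string
	Retain int
}

// mutateRetain returns a copy of the spec with Retain forced to 0 when the
// task is snapshot-cleanup. Other task types are returned unchanged.
func mutateRetain(spec RecurringJobSpec) RecurringJobSpec {
	if spec.Task == "snapshot-cleanup" {
		spec.Retain = 0
	}
	return spec
}
```

Normalizing the value at admission time means a stored snapshot-cleanup job never carries a retain number that could be mistaken for an active setting.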
Test plan
Test Recurring Snapshot Delete
- Create volume.
- Create 2 volume backups.
- Create 2 volume snapshots.
- Create a snapshot RecurringJob with the snapshot-delete task type.
- Assign the RecurringJob to volume.
- Wait until the recurring job is completed.
- The number of snapshots should match the RecurringJob spec.retain.
Test Recurring Snapshot Cleanup
- Create volume.
- Create 2 volume system snapshots, for example by deleting a replica or by online expansion.
- Create a snapshot RecurringJob with the snapshot-cleanup task type.
- Assign the RecurringJob to volume.
- Wait until the recurring job is completed.
- The volume should have 0 system snapshots.
Upgrade strategy
None
Note [optional]
None