Before this feature is implemented, users can only use the CSI Snapshotter to create and restore Longhorn backups.
This means that users must set up a backup target outside of the cluster. Uploading data to and downloading data from
the backup target is slow and costly. Sometimes, users just want to use the CSI Snapshotter to take
an in-cluster Longhorn snapshot and create a new volume from that snapshot. The Longhorn snapshot operation
is cheaper and faster than the backup operation and doesn't require setting up a backup target.
### User Experience In Detail
To use this feature, users need to:
1. Deploy the CSI snapshot CRDs and controller as instructed at https://longhorn.io/docs/1.2.3/snapshots-and-backups/csi-snapshot-support/enable-csi-snapshot-support/
1. Deploy a VolumeSnapshotClass with the parameter `type: longhorn-snapshot`. For example:
```yaml
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1beta1
metadata:
  name: longhorn-snapshot
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: longhorn-snapshot
```
1. To create a new CSI snapshot associated with a Longhorn snapshot of the volume `test-vol`, users deploy the following VolumeSnapshot CR:
```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot
spec:
  volumeSnapshotClassName: longhorn-snapshot
  source:
    persistentVolumeClaimName: test-vol
```
A new Longhorn snapshot is created for the volume `test-vol`.
1. To create a new PVC from the CSI snapshot, users can deploy the following YAML:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-restore-snapshot-pvc
spec:
  storageClassName: longhorn
  dataSource:
    name: test-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi # should be the same as the size of `test-vol`
```
A new PVC is created with the same content as the VolumeSnapshot `test-snapshot`.
1. Deleting the VolumeSnapshot `test-snapshot` deletes the corresponding Longhorn snapshot of the volume `test-vol`.
### API changes
None
## Design
### Implementation Overview
We follow the specification in [the CSI spec](https://github.com/container-storage-interface/spec/blob/master/spec.md#createsnapshot) when supporting the CSI snapshot.
We define a new parameter, `type`, in the VolumeSnapshotClass.
The value of the parameter `type` can be `longhorn-snapshot` or `longhorn-backup`.
When `type` is `longhorn-snapshot` it means that the CSI VolumeSnapshot created with this VolumeSnapshotClass is associated with a Longhorn snapshot.
When `type` is `longhorn-backup` it means that the CSI VolumeSnapshot created with this VolumeSnapshotClass is associated with a Longhorn backup.
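
The handling of the `type` parameter can be illustrated with a minimal Go sketch. The helper name `snapshotTypeFromParams` and the error wording are hypothetical and are not taken from longhorn-manager. Treating an unset or empty `type` as `longhorn-backup` is an assumption based on the test plan in this document, where a VolumeSnapshotClass without parameters produces a backup.

```go
package main

import "fmt"

// Values of the VolumeSnapshotClass `type` parameter described above.
const (
	typeLonghornSnapshot = "longhorn-snapshot"
	typeLonghornBackup   = "longhorn-backup"
)

// snapshotTypeFromParams is a hypothetical helper (not the real
// longhorn-manager function). It maps the VolumeSnapshotClass parameters
// to one of the two supported types. An unset or empty `type` is treated
// as a backup, which is an assumption drawn from the test plan.
func snapshotTypeFromParams(params map[string]string) (string, error) {
	switch t := params["type"]; t {
	case "", typeLonghornBackup:
		return typeLonghornBackup, nil
	case typeLonghornSnapshot:
		return typeLonghornSnapshot, nil
	default:
		// Illustrative error text only.
		return "", fmt.Errorf("invalid snapshot type: %q", t)
	}
}

func main() {
	for _, params := range []map[string]string{
		nil, // class without parameters
		{"type": "longhorn-backup"},
		{"type": "longhorn-snapshot"},
		{"type": "invalid"},
	} {
		t, err := snapshotTypeFromParams(params)
		fmt.Printf("params=%v type=%q err=%v\n", params, t, err)
	}
}
```

Keeping the validation in one place means an unknown value is rejected before any snapshot or backup work starts.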
In the [CreateSnapshot function](https://github.com/longhorn/longhorn-manager/blob/878cfb868c568396d6ebfa4ce096c5d95d9b31e3/csi/controller_server.go#L539), we get the
value of the parameter `type`. If it is `longhorn-backup`, we take a Longhorn backup as before. If it is `longhorn-snapshot`, we do the following:
* Get the name of the Longhorn volume
* Check if the volume is in the attached state.
If it is not, return `codes.FailedPrecondition`.
We cannot take a snapshot of a volume that is not attached.
* Check if a Longhorn snapshot with the same name as the requested CSI snapshot already exists.
If yes, return OK without taking a new Longhorn snapshot.
* Take a new Longhorn snapshot. Encode the snapshotId in the format `snap://volume-name/snapshot-name`.
This snapshotId will be used in the later CSI CreateVolume and DeleteSnapshot calls.
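
The `longhorn-snapshot` steps above can be sketched in Go as follows. This is a simplified in-memory model, not the code in `controller_server.go`: the `volume` struct, `createSnapshot`, `encodeSnapshotID`, and `errVolumeNotAttached` are hypothetical names, and the plain error value stands in for a gRPC status with `codes.FailedPrecondition`.

```go
package main

import (
	"errors"
	"fmt"
)

// errVolumeNotAttached stands in for a gRPC codes.FailedPrecondition error.
var errVolumeNotAttached = errors.New("failed precondition: volume is not attached")

// volume is a hypothetical in-memory model of a Longhorn volume.
type volume struct {
	name      string
	attached  bool
	snapshots map[string]bool // set of existing snapshot names
}

// encodeSnapshotID builds the ID in the format snap://volume-name/snapshot-name.
func encodeSnapshotID(volumeName, snapshotName string) string {
	return fmt.Sprintf("snap://%s/%s", volumeName, snapshotName)
}

// createSnapshot follows the listed steps: require an attached volume,
// reuse a snapshot that already has the requested name, otherwise take
// a new one, then return the encoded snapshot ID.
func createSnapshot(v *volume, snapshotName string) (string, error) {
	if !v.attached {
		return "", errVolumeNotAttached
	}
	if !v.snapshots[snapshotName] {
		v.snapshots[snapshotName] = true // "take" a new snapshot
	}
	return encodeSnapshotID(v.name, snapshotName), nil
}

func main() {
	v := &volume{name: "test-vol", attached: true, snapshots: map[string]bool{}}

	id, err := createSnapshot(v, "snapshot-1")
	fmt.Println(id, err)

	// A repeated request with the same name does not create a second snapshot.
	id, err = createSnapshot(v, "snapshot-1")
	fmt.Println(id, err, len(v.snapshots))

	// A detached volume is rejected.
	v.attached = false
	_, err = createSnapshot(v, "snapshot-2")
	fmt.Println(err)
}
```

Returning the same ID for a repeated request keeps the call idempotent, so a retried CreateSnapshot does not create a duplicate snapshot.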
In the [CreateVolume function](https://github.com/longhorn/longhorn-manager/blob/878cfb868c568396d6ebfa4ce096c5d95d9b31e3/csi/controller_server.go#L63):
* If the VolumeContentSource is of type `VolumeContentSource_Snapshot`, decode the snapshotId in the format from the above step.
* Create a new volume with the `dataSource` set to `snap://volume-name/snapshot-name`. This triggers Longhorn to clone the content of the snapshot to the new volume.
Note that if the source volume is not attached, Longhorn cannot verify the existence of the snapshot inside the Longhorn volume.
This means that [the API will return an error](https://github.com/longhorn/longhorn-manager/blob/878cfb868c568396d6ebfa4ce096c5d95d9b31e3/manager/volume.go#L347-L352) and the new PVC cannot be provisioned.
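
A minimal Go sketch of the decoding step and the attached-source check might look like the following. The names `decodeSnapshotID` and `dataSourceForNewVolume`, the error messages, and the `attachedVolumes` map are hypothetical. In Longhorn itself the attachment check happens inside the volume API rather than in a helper like this.

```go
package main

import (
	"fmt"
	"strings"
)

const snapshotIDPrefix = "snap://"

// decodeSnapshotID splits an ID of the form snap://volume-name/snapshot-name
// into its two parts. Hypothetical helper for illustration.
func decodeSnapshotID(id string) (volumeName, snapshotName string, err error) {
	if !strings.HasPrefix(id, snapshotIDPrefix) {
		return "", "", fmt.Errorf("snapshot ID %q does not start with %q", id, snapshotIDPrefix)
	}
	parts := strings.SplitN(strings.TrimPrefix(id, snapshotIDPrefix), "/", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("snapshot ID %q is not in the form snap://volume-name/snapshot-name", id)
	}
	return parts[0], parts[1], nil
}

// dataSourceForNewVolume returns the dataSource string for the new volume.
// It fails when the source volume is not attached, because in that state
// the existence of the snapshot cannot be verified.
// attachedVolumes is a stand-in for the real volume state lookup.
func dataSourceForNewVolume(snapshotID string, attachedVolumes map[string]bool) (string, error) {
	volumeName, _, err := decodeSnapshotID(snapshotID)
	if err != nil {
		return "", err
	}
	if !attachedVolumes[volumeName] {
		return "", fmt.Errorf("source volume %q is not attached", volumeName)
	}
	return snapshotID, nil
}

func main() {
	attached := map[string]bool{"test-vol": true}

	ds, err := dataSourceForNewVolume("snap://test-vol/snapshot-1", attached)
	fmt.Println(ds, err)

	_, err = dataSourceForNewVolume("snap://other-vol/snapshot-1", attached)
	fmt.Println(err)

	_, err = dataSourceForNewVolume("not-a-snapshot-id", attached)
	fmt.Println(err)
}
```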
In the [DeleteSnapshot function](https://github.com/longhorn/longhorn-manager/blob/878cfb868c568396d6ebfa4ce096c5d95d9b31e3/csi/controller_server.go#L675):
* Decode the snapshotId in the format from the above step.
If the type is `longhorn-backup`, we delete the backup as before.
If the type is `longhorn-snapshot`, we delete the corresponding Longhorn snapshot of the source volume.
If the source volume or the snapshot no longer exists, we return OK as specified in [the CSI spec](https://github.com/container-storage-interface/spec/blob/master/spec.md#deletesnapshot).
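
The idempotent behavior of the `longhorn-snapshot` branch can be sketched in Go with an in-memory model. The `snapshotStore` type and its `deleteSnapshot` method are hypothetical and only illustrate that a missing volume or snapshot is treated as success. The backup branch is omitted.

```go
package main

import "fmt"

// snapshotStore is a hypothetical in-memory model:
// volume name -> set of snapshot names.
type snapshotStore map[string]map[string]bool

// deleteSnapshot removes the snapshot if it exists. A missing volume or a
// missing snapshot is not an error, so a retried delete still succeeds.
func (s snapshotStore) deleteSnapshot(volumeName, snapshotName string) error {
	snapshots, ok := s[volumeName]
	if !ok {
		return nil // source volume is gone: nothing left to delete
	}
	delete(snapshots, snapshotName) // no-op when the snapshot is absent
	return nil
}

func main() {
	store := snapshotStore{"test-vol": {"snapshot-1": true}}

	fmt.Println(store.deleteSnapshot("test-vol", "snapshot-1")) // snapshot exists
	fmt.Println(store.deleteSnapshot("test-vol", "snapshot-1")) // already deleted
	fmt.Println(store.deleteSnapshot("gone-vol", "snapshot-1")) // volume missing
}
```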
### Test plan
Integration test plan:
1. Deploy the CSI snapshot CRDs and controller as instructed at https://longhorn.io/docs/1.2.3/snapshots-and-backups/csi-snapshot-support/enable-csi-snapshot-support/
1. Deploy 4 VolumeSnapshotClasses:
```yaml
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1beta1
metadata:
  name: longhorn-backup-1
driver: driver.longhorn.io
deletionPolicy: Delete
```
```yaml
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1beta1
metadata:
  name: longhorn-backup-2
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: longhorn-backup
```
```yaml
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1beta1
metadata:
  name: longhorn-snapshot
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: longhorn-snapshot
```
```yaml
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1beta1
metadata:
  name: invalid-class
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: invalid
```
1. Create a Longhorn volume `test-vol` of 5GB. Create a PV/PVC for the Longhorn volume.
1. Create a workload that uses the volume. Write some data to the volume.
Make sure the data is persisted to the volume by running `sync`.
1. Set up a backup target for Longhorn.
#### Scenario 1: CreateSnapshot
* `type` is `longhorn-backup` or `""`
  * Create a VolumeSnapshot with the following YAML:
    ```yaml
    apiVersion: snapshot.storage.k8s.io/v1beta1
    kind: VolumeSnapshot
    metadata:
      name: test-snapshot-longhorn-backup
    spec:
      volumeSnapshotClassName: longhorn-backup-1
      source:
        persistentVolumeClaimName: test-vol
    ```
  * Verify that a backup is created.
  * Delete the `test-snapshot-longhorn-backup` VolumeSnapshot.
  * Verify that the backup is deleted.
  * Create the `test-snapshot-longhorn-backup` VolumeSnapshot with `volumeSnapshotClassName: longhorn-backup-2`.
  * Verify that a backup is created.
* `type` is `longhorn-snapshot`
  * Volume is in the detached state.
    * Scale down the workload of `test-vol` to detach the volume.
    * Create the `test-snapshot-longhorn-snapshot` VolumeSnapshot with `volumeSnapshotClassName: longhorn-snapshot`.
    * Verify the error `volume ... invalid state ... for taking snapshot` in the Longhorn CSI plugin.
  * Volume is in the attached state.
    * Scale up the workload to attach `test-vol`.
    * Verify that a Longhorn snapshot is created for `test-vol`.
* Invalid type
  * Create the `test-snapshot-invalid` VolumeSnapshot with `volumeSnapshotClassName: invalid-class`.
  * Verify the error `invalid snapshot type: %v. Must be %v or %v or` in the Longhorn CSI plugin.
  * Delete the `test-snapshot-invalid` VolumeSnapshot.
#### Scenario 2: Create new volume from CSI snapshot
* From `longhorn-backup` type
  * Create a new PVC with the following YAML:
    ```yaml
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-restore-pvc
    spec:
      storageClassName: longhorn
      dataSource:
        name: test-snapshot-longhorn-backup
        kind: VolumeSnapshot
        apiGroup: snapshot.storage.k8s.io
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
    ```
  * Attach the PVC `test-restore-pvc` and verify the data.
  * Delete the PVC.
* From `longhorn-snapshot` type
  * Source volume is attached && Longhorn snapshot exists
    * Create a PVC with the following YAML:
      ```yaml
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: test-restore-pvc
      spec:
        storageClassName: longhorn
        dataSource:
          name: test-snapshot-longhorn-snapshot
          kind: VolumeSnapshot
          apiGroup: snapshot.storage.k8s.io
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
      ```
    * Attach the PVC `test-restore-pvc` and verify the data.
    * Delete the PVC.
  * Source volume is detached
    * Scale down the workload to detach `test-vol`.
    * Create the same PVC `test-restore-pvc` as in the `Source volume is attached && Longhorn snapshot exists` section.
    * Verify that PVC provisioning fails because the source volume is detached, so Longhorn cannot verify the existence of the Longhorn snapshot in the source volume.