doc: Restore volume after unexpected detachment
Longhorn #851 Signed-off-by: Shuo Wu <shuo@rancher.com>
This commit is contained in:
parent
5e2f8cc45e
commit
4dc5d9ea4b
@ -222,6 +222,7 @@ More examples are available at `./examples/`
|
||||
### [Use CSI driver on RancherOS/CoreOS + RKE or K3S](./docs/csi-config.md)
|
||||
### [Restore a backup to an image file](./docs/restore-to-file.md)
|
||||
### [Disaster Recovery Volume](./docs/dr-volume.md)
|
||||
### [Restore volume after unexpected detachment](./docs/restore-volume.md)
|
||||
|
||||
# Troubleshooting
|
||||
You can click `Generate Support Bundle` link at the bottom of the UI to download a zip file contains Longhorn related configuration and logs.
|
||||
|
79
docs/restore-volume.md
Normal file
79
docs/restore-volume.md
Normal file
@ -0,0 +1,79 @@
|
||||
# Restore volume after unexpected detachment
|
||||
|
||||
## Overview
|
||||
1. Now Longhorn can automatically reattach then remount volumes if unexpected detachment happens. e.g., [Kubernetes upgrade](https://github.com/longhorn/longhorn/issues/703), [Docker reboot](https://github.com/longhorn/longhorn/issues/686).
|
||||
2. After reattachment and remount complete, users may need to **manually restart the related workload containers** for the volume restoration **if the following recommended setup is not applied**.
|
||||
|
||||
## Recommended setup when using Longhorn volumes
|
||||
In order to restore unexpectedly detached volumes automatically, users can set `restartPolicy` to `Always` then add `livenessProbe` for the workloads using Longhorn volumes.
|
||||
Then those workloads will be restarted automatically after reattachment and remount.
|
||||
|
||||
Here is one example for the setup:
|
||||
```
|
||||
apiVersion: v1
|
||||
kind: PersistentVolumeClaim
|
||||
metadata:
|
||||
name: longhorn-volv-pvc
|
||||
spec:
|
||||
accessModes:
|
||||
- ReadWriteOnce
|
||||
storageClassName: longhorn
|
||||
resources:
|
||||
requests:
|
||||
storage: 2Gi
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: volume-test
|
||||
namespace: default
|
||||
spec:
|
||||
restartPolicy: Always
|
||||
containers:
|
||||
- name: volume-test
|
||||
image: nginx:stable-alpine
|
||||
imagePullPolicy: IfNotPresent
|
||||
livenessProbe:
|
||||
exec:
|
||||
command:
|
||||
- ls
|
||||
- /data/lost+found
|
||||
initialDelaySeconds: 5
|
||||
periodSeconds: 5
|
||||
volumeMounts:
|
||||
- name: volv
|
||||
mountPath: /data
|
||||
ports:
|
||||
- containerPort: 80
|
||||
volumes:
|
||||
- name: volv
|
||||
persistentVolumeClaim:
|
||||
claimName: longhorn-volv-pvc
|
||||
```
|
||||
- The directory used in the `livenessProbe` will be `<volumeMount.mountPath>/lost+found`
|
||||
- Don't set a short interval for `livenessProbe.periodSeconds`, e.g., 1s. The liveness command is CPU consuming.
|
||||
|
||||
## Manually restart workload containers
|
||||
## This solution is applied only if:
|
||||
1. The Longhorn volume is reattached automatically.
|
||||
2. The above setup is not included when the related workload is launched.
|
||||
|
||||
### Steps
|
||||
1. Figure out on which node the related workload's containers are running
|
||||
```
|
||||
kubectl -n <namespace of your workload> get pods <workload's pod name> -o wide
|
||||
```
|
||||
2. Connect to the node. e.g., `ssh`
|
||||
3. Figure out the containers belonging to the workload
|
||||
```
|
||||
docker ps
|
||||
```
|
||||
By checking the columns `COMMAND` and `NAMES` of the output, you can find the corresponding container
|
||||
|
||||
4. Restart the container
|
||||
```
|
||||
docker restart <the container ID of the workload>
|
||||
```
|
||||
|
||||
### Reason
|
||||
Typically the volume mount propagation is not `Bidirectional`. It means the Longhorn remount operation won't be propagated to the workload containers if the containers are not restarted.
|
Loading…
Reference in New Issue
Block a user