commit 07e6bd9d1b

README.md
@@ -1,11 +1,13 @@
# Longhorn [](https://github.com/Ullaakut/astronomer)

### Status

### Build Status

* Engine: [](https://drone-publish.rancher.io/longhorn/longhorn-engine) [](https://goreportcard.com/report/github.com/rancher/longhorn-engine)
* Instance Manager: [](http://drone-publish.rancher.io/longhorn/longhorn-instance-manager) [](https://goreportcard.com/report/github.com/longhorn/longhorn-instance-manager)
* Manager: [](https://drone-publish.rancher.io/longhorn/longhorn-manager) [](https://goreportcard.com/report/github.com/rancher/longhorn-manager)
* UI: [](https://drone-publish.rancher.io/longhorn/longhorn-ui)
* Test: [](http://drone-publish.rancher.io/longhorn/longhorn-tests)

### Overview

Longhorn is a distributed block storage system for Kubernetes.

Longhorn is lightweight, reliable, and powerful. You can install Longhorn on an existing Kubernetes cluster with one `kubectl apply` command or using Helm charts. Once Longhorn is installed, it adds persistent volume support to the Kubernetes cluster.
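For reference, a minimal sketch of that one-command install; the manifest URL is an assumption based on this repo's `deploy/longhorn.yaml` layout:

```
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/longhorn.yaml
```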
@@ -84,10 +86,15 @@ git clone https://github.com/longhorn/longhorn.git
```

Now use the following command to install Longhorn:

* Helm2
```
helm install ./longhorn/chart --name longhorn --namespace longhorn-system
```

* Helm3
```
kubectl create namespace longhorn-system
helm install longhorn ./longhorn/chart/ --namespace longhorn-system
```

---

Longhorn will be installed in the namespace `longhorn-system`
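To verify the deployment, one can watch the pods come up in that namespace (a generic check, not part of the original text):

```
kubectl -n longhorn-system get pods
```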
@@ -115,18 +122,21 @@ longhorn-ui-f849dcd85-cgkgg 1/1 Running 0 5d

### Accessing the UI

You can run `kubectl -n longhorn-system get svc` to get the external service IP for UI:
> For Longhorn v0.8.0+, the UI service type has been changed from `LoadBalancer` to `ClusterIP`

You can run `kubectl -n longhorn-system get svc` to get the Longhorn UI service:

```
NAME                TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)        AGE
longhorn-backend    ClusterIP      10.20.248.250   <none>            9500/TCP       58m
longhorn-frontend   LoadBalancer   10.20.245.110   100.200.200.123   80:30697/TCP   58m
longhorn-frontend   ClusterIP      10.20.245.110   <none>            80/TCP         58m
```

If the Kubernetes cluster supports creating a LoadBalancer, you can use the `EXTERNAL-IP` (`100.200.200.123` in the case above) of `longhorn-frontend` to access the Longhorn UI. Otherwise you can use `<node_ip>:<port>` (port is `30697` in the case above) to access the UI.
To access the Longhorn UI when installed from the YAML manifest, you need to create an ingress controller.

See more about how to create an Nginx ingress controller with basic authentication [here](https://github.com/longhorn/longhorn/blob/master/docs/longhorn-ingress.md)

Note that the UI is unauthenticated when you install Longhorn using the YAML file.
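Since the UI service type is now `ClusterIP`, a quick way to reach the UI without an ingress, for testing only, is a standard port-forward; this is a generic kubectl technique, not part of the original instructions:

```
kubectl -n longhorn-system port-forward svc/longhorn-frontend 8080:80
```

Then open `http://localhost:8080` in a browser.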
# Upgrade

@@ -214,6 +224,7 @@ More examples are available at `./examples/`
### [Storage Tags](./docs/storage-tags.md)
### [Customized default setting](./docs/customized-default-setting.md)
### [Taint Toleration](./docs/taint-toleration.md)
### [Volume Expansion](./docs/expansion.md)

### [Restoring Stateful Set volumes](./docs/restore_statefulset.md)
### [Google Kubernetes Engine](./docs/gke.md)

@@ -275,7 +286,7 @@ Contributing code is not the only way of contributing. We value feedback very much

## License

Copyright (c) 2014-2019 The Longhorn Authors
Copyright (c) 2014-2020 The Longhorn Authors

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
chart/Chart.yaml
@@ -1,7 +1,7 @@
apiVersion: v1
name: longhorn
version: 0.7.0
appVersion: v0.7.0
version: 0.8.0
appVersion: v0.8.0
kubeVersion: ">=v1.14.0-r0"
description: Longhorn is a distributed block storage system for Kubernetes powered by Rancher Labs.
keywords:
@@ -26,6 +26,8 @@ spec:
        - daemon
        - --engine-image
        - "{{ .Values.image.longhorn.engine }}:{{ .Values.image.longhorn.engineTag }}"
        - --instance-manager-image
        - "{{ .Values.image.longhorn.instanceManager }}:{{ .Values.image.longhorn.instanceManagerTag }}"
        - --manager-image
        - "{{ .Values.image.longhorn.manager }}:{{ .Values.image.longhorn.managerTag }}"
        - --service-account
@@ -41,10 +43,10 @@ spec:
        - name: varrun
          mountPath: /var/run/
        - name: longhorn
          mountPath: /var/lib/rancher/longhorn/
          mountPath: /var/lib/longhorn/
          mountPropagation: Bidirectional
        - name: longhorn-default-setting
          mountPath: /var/lib/longhorn/setting/
          mountPath: /var/lib/longhorn-setting/
        env:
        - name: POD_NAMESPACE
          valueFrom:
@@ -61,7 +63,7 @@ spec:
        - name: LONGHORN_BACKEND_SVC
          value: longhorn-backend
        - name: DEFAULT_SETTING_PATH
          value: /var/lib/longhorn/setting/default-setting.yaml
          value: /var/lib/longhorn-setting/default-setting.yaml
      volumes:
      - name: dev
        hostPath:
@@ -74,11 +76,14 @@ spec:
          path: /var/run/
      - name: longhorn
        hostPath:
          path: /var/lib/rancher/longhorn/
          path: /var/lib/longhorn/
      - name: longhorn-default-setting
        configMap:
          name: longhorn-default-setting
      serviceAccountName: longhorn-service-account
  updateStrategy:
    rollingUpdate:
      maxUnavailable: "100%"
---
apiVersion: v1
kind: Service
@@ -49,4 +49,6 @@ spec:
      targetPort: http
    {{- if .Values.service.ui.nodePort }}
      nodePort: {{ .Values.service.ui.nodePort }}
    {{- else }}
      nodePort: null
    {{- end }}
chart/values.yaml
@@ -4,17 +4,19 @@
image:
  longhorn:
    engine: longhornio/longhorn-engine
    engineTag: v0.7.0
    engineTag: v0.8.0
    manager: longhornio/longhorn-manager
    managerTag: v0.7.0
    managerTag: v0.8.0
    ui: longhornio/longhorn-ui
    uiTag: v0.7.0
    uiTag: v0.8.0
    instanceManager: longhornio/longhorn-instance-manager
    instanceManagerTag: v1_20200301
  pullPolicy: IfNotPresent

service:
  ui:
    type: LoadBalancer
    nodePort: ""
    type: ClusterIP
    nodePort: null
  manager:
    type: ClusterIP
    nodePort: ""
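For users who still want an externally reachable UI after this default change, the values shown above suggest the setting can be overridden at install time; a sketch, assuming the Helm3 install command from the README:

```
helm install longhorn ./longhorn/chart/ --namespace longhorn-system \
  --set service.ui.type=LoadBalancer
```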
deploy/longhorn.yaml
@@ -21,7 +21,7 @@ rules:
  verbs:
  - "*"
- apiGroups: [""]
  resources: ["pods", "events", "persistentvolumes", "persistentvolumeclaims", "nodes", "proxy/nodes", "pods/log", "secrets", "services", "endpoints", "configmaps"]
  resources: ["pods", "events", "persistentvolumes", "persistentvolumeclaims", "persistentvolumeclaims/status", "nodes", "proxy/nodes", "pods/log", "secrets", "services", "endpoints", "configmaps"]
  verbs: ["*"]
- apiGroups: [""]
  resources: ["namespaces"]
@@ -240,7 +240,7 @@ spec:
    spec:
      containers:
      - name: longhorn-manager
        image: longhornio/longhorn-manager:v0.7.0
        image: longhornio/longhorn-manager:v0.8.0
        imagePullPolicy: Always
        securityContext:
          privileged: true
@@ -249,13 +249,18 @@ spec:
        - -d
        - daemon
        - --engine-image
        - longhornio/longhorn-engine:v0.7.0
        - longhornio/longhorn-engine:v0.8.0
        - --instance-manager-image
        - longhornio/longhorn-instance-manager:v1_20200301
        - --manager-image
        - longhornio/longhorn-manager:v0.7.0
        - longhornio/longhorn-manager:v0.8.0
        - --service-account
        - longhorn-service-account
        ports:
        - containerPort: 9500
        readinessProbe:
          tcpSocket:
            port: 9500
        volumeMounts:
        - name: dev
          mountPath: /host/dev/
@@ -264,7 +269,7 @@ spec:
        - name: varrun
          mountPath: /var/run/
        - name: longhorn
          mountPath: /var/lib/rancher/longhorn/
          mountPath: /var/lib/longhorn/
          mountPropagation: Bidirectional
        - name: longhorn-default-setting
          mountPath: /var/lib/longhorn-setting/
@@ -296,11 +301,14 @@ spec:
          path: /var/run/
      - name: longhorn
        hostPath:
          path: /var/lib/rancher/longhorn/
          path: /var/lib/longhorn/
      - name: longhorn-default-setting
        configMap:
          name: longhorn-default-setting
      serviceAccountName: longhorn-service-account
  updateStrategy:
    rollingUpdate:
      maxUnavailable: "100%"
---
kind: Service
apiVersion: v1
@@ -336,7 +344,7 @@ spec:
    spec:
      containers:
      - name: longhorn-ui
        image: longhornio/longhorn-ui:v0.7.0
        image: longhornio/longhorn-ui:v0.8.0
        ports:
        - containerPort: 8000
        env:
@@ -357,7 +365,8 @@ spec:
  ports:
  - port: 80
    targetPort: 8000
  type: LoadBalancer
    nodePort: null
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
@@ -376,18 +385,18 @@ spec:
    spec:
      initContainers:
      - name: wait-longhorn-manager
        image: longhornio/longhorn-manager:v0.7.0
        image: longhornio/longhorn-manager:v0.8.0
        command: ['sh', '-c', 'while [ $(curl -m 1 -s -o /dev/null -w "%{http_code}" http://longhorn-backend:9500/v1) != "200" ]; do echo waiting; sleep 2; done']
      containers:
      - name: longhorn-driver-deployer
        image: longhornio/longhorn-manager:v0.7.0
        image: longhornio/longhorn-manager:v0.8.0
        imagePullPolicy: Always
        command:
        - longhorn-manager
        - -d
        - deploy-driver
        - --manager-image
        - longhornio/longhorn-manager:v0.7.0
        - longhornio/longhorn-manager:v0.8.0
        - --manager-url
        - http://longhorn-backend:9500/v1
        # manually set root directory for csi
@@ -419,9 +428,10 @@ apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880" # 48 hours in minutes
  staleReplicaTimeout: "2880"
  fromBackup: ""
  # diskSelector: "ssd,fast"
  # nodeSelector: "storage,fast"
@@ -1,57 +1,11 @@
# Longhorn CSI on RancherOS/CoreOS + RKE or K3S
# Longhorn CSI on K3S

## Requirements
1. Kubernetes v1.11 or higher.
2. Longhorn v0.4.1 or higher.
3. For RancherOS only: Ubuntu console.

## Instruction
### For RancherOS/CoreOS + Kubernetes v1.11 only
The following step is not needed for Kubernetes v1.12 and above.

Add extra_binds for the kubelet in the RKE `cluster.yml`:
```
services:
  kubelet:
    extra_binds:
    - "/opt/rke/var/lib/kubelet/plugins:/var/lib/kubelet/plugins"
```

### For each node:
#### RancherOS:
##### 1. Switch to the Ubuntu console

`sudo ros console switch ubuntu`, then type `y`

##### 2. Install open-iscsi on each node.
```
sudo apt update
sudo apt install -y open-iscsi
```
##### 3. Modify the iSCSI configuration.

1. Open the config file `/etc/iscsi/iscsid.conf`
2. Comment out `iscsid.startup = /bin/systemctl start iscsid.socket`
3. Uncomment `iscsid.startup = /sbin/iscsid`

#### CoreOS:

##### 1. If you want to enable the iSCSI daemon automatically at boot, enable the systemd service:

```
sudo su
systemctl enable iscsid
reboot
```

##### 2. Or just start the iSCSI daemon for the current session:

```
sudo su
systemctl start iscsid
```

#### K3S:
##### 1. For Longhorn v0.7.0 and above
Longhorn v0.7.0 and above only support K3S v0.10.0 and above by default.
@@ -73,21 +27,6 @@ This error is because Longhorn cannot detect where the root dir is set up for the kubelet.

Users can override the root-dir detection by manually setting the `kubelet-root-dir` argument here:
https://github.com/rancher/longhorn/blob/master/deploy/longhorn.yaml#L329
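A sketch of that override as it would appear in the `longhorn-driver-deployer` args of `deploy/longhorn.yaml`; the example path is an assumption for illustration, so substitute the value found with the steps below:

```
        - --kubelet-root-dir
        - /var/lib/rancher/k3s/agent/kubelet
```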

#### How to find `root-dir`?

**For RancherOS/CoreOS**

Run `ps aux | grep kubelet` and get the `--root-dir` argument on the host node.

e.g.
```
$ ps aux | grep kubelet
root 3755 4.4 2.9 744404 120020 ? Ssl 00:45 0:02 kubelet --root-dir=/opt/rke/var/lib/kubelet --volume-plugin-dir=/var/lib/kubelet/volumeplugins
```
You will find `root-dir` in the cmdline of the proc `kubelet`. If it's not set, the default value `/var/lib/kubelet` would be used. In the case of RancherOS/CoreOS, the root-dir would be `/opt/rke/var/lib/kubelet` as shown above.

If the kubelet is using a configuration file, you would need to check the configuration file to locate the `root-dir` parameter.

**For K3S v0.10.0-**

Run `ps aux | grep k3s` and get the `--data-dir` or `-d` argument on the K3S server node.
@@ -107,33 +46,8 @@ If K3S is using a configuration file, you would need to check the configuration file
It is always `/var/lib/kubelet`

## Background
#### CSI doesn't work with RancherOS/CoreOS + RKE before Longhorn v0.4.1
The reason is:

1. RKE sets the argument `root-dir=/opt/rke/var/lib/kubelet` for the kubelet in the case of RancherOS or CoreOS, which is different from the default value `/var/lib/kubelet`.

2. **For k8s v1.12 and above**

   The kubelet will detect the `csi.sock` according to the argument `<--kubelet-registration-path>` passed in by the Kubernetes CSI driver-registrar, and `<drivername>-reg.sock` (for Longhorn, it's `io.rancher.longhorn-reg.sock`) on the kubelet path `<root-dir>/plugins`.

   **For k8s v1.11**

   The kubelet will find both sockets on the kubelet path `/var/lib/kubelet/plugins`.

3. By default, the Longhorn CSI driver creates and exposes these 2 sock files on the host path `/var/lib/kubelet/plugins`.

4. Then the kubelet cannot find `<drivername>-reg.sock`, so the CSI driver doesn't work.

5. Furthermore, the kubelet will instruct the CSI plugin to mount the Longhorn volume on `<root-dir>/pods/<pod-name>/volumes/kubernetes.io~csi/<volume-name>/mount`.

   But this path inside the CSI plugin container won't be bind-mounted on the host path, so the mount operation for the Longhorn volume is meaningless.

   Hence Kubernetes cannot connect to Longhorn using the CSI driver.

#### Longhorn versions before v0.7.0 don't work on K3S v0.10.0 or above
K3S now sets its kubelet directory to `/var/lib/kubelet`. See [the K3S release comment](https://github.com/rancher/k3s/releases/tag/v0.10.0) for details.

## Reference
https://github.com/kubernetes-csi/driver-registrar

https://coreos.com/os/docs/latest/iscsi.html
docs/expansion.md (new file)
@@ -0,0 +1,101 @@
# Volume Expansion

## Overview
- Longhorn supports OFFLINE volume expansion only.
- Longhorn will expand the frontend (e.g. block device) then expand the filesystem.

## Prerequisites:
1. Longhorn version v0.8.0 or higher.
2. The volume to be expanded is in state `detached`.

## Expand a Longhorn volume
There are two ways to expand a Longhorn volume:

#### Via PVC
- This method is applied only if:
  1. Kubernetes version v1.16 or higher.
  2. The PVC is dynamically provisioned by Kubernetes with a Longhorn StorageClass.
  3. The field `allowVolumeExpansion` is `true` in the related StorageClass.
- This method is recommended if it's applicable, since the PVC and PV will be updated automatically and everything stays consistent after expansion.
- Usage: Find the corresponding PVC for the Longhorn volume, then modify the requested `storage` of the PVC spec. e.g.,
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"longhorn-simple-pvc","namespace":"default"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"1Gi"}},"storageClassName":"longhorn"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: driver.longhorn.io
  creationTimestamp: "2019-12-21T01:36:16Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: longhorn-simple-pvc
  namespace: default
  resourceVersion: "162431"
  selfLink: /api/v1/namespaces/default/persistentvolumeclaims/longhorn-simple-pvc
  uid: 0467ae73-22a5-4eba-803e-464cc0b9d975
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: longhorn
  volumeMode: Filesystem
  volumeName: pvc-0467ae73-22a5-4eba-803e-464cc0b9d975
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  phase: Bound
```
Modify `spec.resources.requests.storage` of this PVC.
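As a sketch, that edit can also be applied in one command; the PVC name is the one from the example above and the new size is illustrative:

```
kubectl patch pvc longhorn-simple-pvc -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'
```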

#### Via Longhorn UI
- If your Kubernetes version is v1.14 or v1.15, this method is the only choice for Longhorn volume expansion.
- Notice that the volume size will be updated after the expansion, but the capacity of the corresponding PVC and PV won't change. Users need to take care of them.
- Usage: On the volume page of the Longhorn UI, click `Expand` for the volume.

## Frontend expansion
- To prevent the frontend expansion from being interfered with by unexpected data R/W, Longhorn supports OFFLINE expansion only.
  The `detached` volume will be automatically attached to a random node in maintenance mode.
- Rebuilding/adding replicas is not allowed during the expansion and vice versa.

## Filesystem expansion
#### Longhorn will try to expand the filesystem only if:
1. The expanded size is greater than the current size.
2. There is a Linux filesystem in the Longhorn volume.
3. The filesystem used in the Longhorn volume is one of the following:
   1. ext4
   2. XFS
4. The Longhorn volume is using the block device frontend.

#### Handling volume revert:
If users revert a volume to a snapshot with a smaller size, the frontend of the volume is still holding the expanded size, but the filesystem size will be the same as that of the reverted snapshot. In this case, users need to handle the filesystem manually:
1. Attach the volume to a random node.
2. Log into the corresponding node and expand the filesystem:
   - If the filesystem is `ext4`, the volume might need to be mounted and unmounted once before resizing the filesystem manually. Otherwise, executing `resize2fs` might result in an error:
     ```
     resize2fs: Superblock checksum does not match superblock while trying to open ......
     Couldn't find valid filesystem superblock.
     ```
     Follow the steps below to resize the filesystem:
     ```
     mount /dev/longhorn/<volume name> <arbitrary mount directory>
     umount /dev/longhorn/<volume name>
     mount /dev/longhorn/<volume name> <arbitrary mount directory>
     resize2fs /dev/longhorn/<volume name>
     umount /dev/longhorn/<volume name>
     ```
   - If the filesystem is `xfs`, users can directly mount and then expand the filesystem.
     ```
     mount /dev/longhorn/<volume name> <arbitrary mount directory>
     xfs_growfs <the mount directory>
     umount /dev/longhorn/<volume name>
     ```
docs/longhorn-ingress.md (new file)
@@ -0,0 +1,49 @@
## Create Nginx Ingress Controller with basic authentication

1. Create a basic auth file `auth`:
> It's important that the generated file is named `auth` (actually, that the secret has a key `data.auth`), otherwise the ingress controller returns a 503.

`$ USER=<USERNAME_HERE>; PASSWORD=<PASSWORD_HERE>; echo "${USER}:$(openssl passwd -stdin -apr1 <<< ${PASSWORD})" >> auth`

2. Create a secret:

`$ kubectl -n longhorn-system create secret generic basic-auth --from-file=auth`

3. Create an Nginx ingress controller manifest `longhorn-ingress.yml`:

```
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: longhorn-ingress
  namespace: longhorn-system
  annotations:
    # type of authentication
    nginx.ingress.kubernetes.io/auth-type: basic
    # name of the secret that contains the user/password definitions
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    # message to display with an appropriate context why the authentication is required
    nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
spec:
  rules:
  - http:
      paths:
      - path: /
        backend:
          serviceName: longhorn-frontend
          servicePort: 80
```

4. Create the ingress controller:
`$ kubectl -n longhorn-system apply -f longhorn-ingress.yml`
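To check the result and test the credentials, a generic sketch (the address placeholder depends on your ingress setup):

```
kubectl -n longhorn-system get ingress
curl -u <USERNAME_HERE>:<PASSWORD_HERE> http://<ingress address>/
```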

#### For AWS EKS clusters:
Users need to create an ELB to expose the nginx ingress controller to the internet (additional cost may apply).

1. Create prerequisite resources:
https://github.com/kubernetes/ingress-nginx/blob/master/docs/deploy/index.md#prerequisite-generic-deployment-command

2. Create the ELB:
https://github.com/kubernetes/ingress-nginx/blob/master/docs/deploy/index.md#aws
@@ -7,16 +7,24 @@ When a Kubernetes node fails with CSI driver installed (all the following are ba
2. After about **five minutes**, the states of all the pods on the `NotReady` node will change to either `Unknown` or `NodeLost`.
3. If you're deploying using a StatefulSet or Deployment, you need to decide if it's safe to force delete the pod of the workload
running on the lost node. See [here](https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/).
    1. A StatefulSet has stable identity, so Kubernetes won't delete the Pod for the user.
    1. A StatefulSet has stable identity, so Kubernetes won't force delete the Pod for the user.
    2. A Deployment doesn't have stable identity, but Longhorn is Read-Write-Once storage, which means it can only be attached
    to one Pod. So the new Pod created by Kubernetes won't be able to start, due to the Longhorn volume still being attached to the old Pod
    on the lost Node.
    3. In both cases, Kubernetes will automatically evict the pod (set the deletion timestamp for the pod) on the lost node, then try to
    **recreate a new one with the old volumes**. Because the evicted pod gets stuck in `Terminating` state and the attached Longhorn volumes
    cannot be released/reused, the new pod will get stuck in `ContainerCreating` state. That's why users need to decide if it's safe to force delete the pod.
    4. If you decide to delete the Pod manually (and forcefully; see the command sketch after this list), Kubernetes will take about another **six minutes** to delete the VolumeAttachment
    object associated with the Pod, thus finally detaching the Longhorn volume from the lost Node and allowing it to be used by the new Pod.
        - This additional six minutes is [hardcoded in Kubernetes](https://github.com/kubernetes/kubernetes/blob/5e31799701123c50025567b8534e1a62dbc0e9f6/pkg/controller/volume/attachdetach/attach_detach_controller.go#L95):
        if the pod on the lost node is force deleted, the related volumes won't be unmounted correctly, so Kubernetes waits for this fixed timeout
        to directly clean up the VolumeAttachment object.
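For reference, the usual force-deletion command from the Kubernetes page linked above (pod name and namespace are placeholders):

```
kubectl delete pod <pod name> --namespace <namespace> --grace-period=0 --force
```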

## What to expect when recovering a failed Kubernetes Node
1. If the node is **back online within 5 - 6 minutes** of the failure, Kubernetes will restart pods, unmount then re-mount volumes without volume re-attaching and VolumeAttachment cleanup.
Because the volume engines would be down after the node goes down, the direct remount won't work since the device no longer exists on the node. In this case, Longhorn needs to detach and re-attach the volumes to recover the volume engines, so that the pods can remount/reuse the volumes safely.
2. If the node is **not back online within 5 - 6 minutes** of the failure, Kubernetes will try to delete all unreachable pods and these pods will become `Terminating` state. See [pod eviction timeout](https://kubernetes.io/docs/concepts/architecture/nodes/#condition) for details.
Because the volume engines would be down after the node goes down, this direct remount won't work since the device no longer exists on the node.
In this case, Longhorn will detach and re-attach the volumes to recover the volume engines, so that the pods can remount/reuse the volumes safely.
2. If the node is **not back online within 5 - 6 minutes** of the failure, Kubernetes will try to delete all unreachable pods based on the pod eviction mechanism and these pods will become `Terminating` state. See [pod eviction timeout](https://kubernetes.io/docs/concepts/architecture/nodes/#condition) for details.
Then if the failed node is recovered later, Kubernetes will restart those terminating pods, detach the volumes, wait for the old VolumeAttachment cleanup, and reuse (re-attach & re-mount) the volumes. Typically these steps may take 1 ~ 7 minutes.
In this case, detaching and re-attaching operations are included in the recovery procedures. Hence no extra operation is needed and the Longhorn volumes will be available after the above steps.
In this case, detaching and re-attaching operations are already included in the Kubernetes recovery procedures. Hence no extra operation is needed and the Longhorn volumes will be available after the above steps.
3. For all the above recovery scenarios, Longhorn will handle these steps automatically in association with Kubernetes. This section is aimed at informing users of what happens and what to expect during the recovery.
@@ -2,7 +2,16 @@

## Overview
1. Now Longhorn can automatically reattach and then remount volumes if an unexpected detachment happens, e.g., [Kubernetes upgrade](https://github.com/longhorn/longhorn/issues/703), [Docker reboot](https://github.com/longhorn/longhorn/issues/686).
2. After reattachment and remount complete, users may need to **manually restart the related workload containers** for the volume restoration **if the following recommended setup is not applied**.
2. After **reattachment** and **remount** complete, users may need to **manually restart the related workload containers** for the volume restoration **if the following recommended setup is not applied**.

#### Reattachment
Longhorn will reattach the volume if the volume engine dies unexpectedly.

#### Remount
- Longhorn will detect and remount the filesystem for the volume after the reattachment.
- But **the auto remount does not work for the `xfs` filesystem**.
  - Mounting one more layer with an `xfs` filesystem is not allowed and will trigger the error `XFS (sdb): Filesystem has duplicate UUID <filesystem UUID> - can't mount`.
  - Users need to manually unmount then mount the `xfs` filesystem on the host. The device path on the host for the attached volume is `/dev/longhorn/<volume name>`
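A sketch of that manual `xfs` remount on the host; the mount point is whatever the workload was using:

```
umount /dev/longhorn/<volume name>
mount /dev/longhorn/<volume name> <original mount point>
```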

## Recommended setup when using Longhorn volumes
In order to recover unexpectedly detached volumes automatically, users can set `restartPolicy` to `Always` and then add a `livenessProbe` for the workloads using Longhorn volumes.

@@ -55,7 +64,7 @@ spec:

## Manually restart workload containers
## This solution is applied only if:
1. The Longhorn volume is reattached automatically.
1. The Longhorn volume is reattached and remounted automatically.
2. The above setup is not included when the related workload is launched.

### Steps
examples/simple_pod.yaml (new file)
@@ -0,0 +1,27 @@
apiVersion: v1
kind: Pod
metadata:
  name: longhorn-simple-pod
  namespace: default
spec:
  restartPolicy: Always
  containers:
  - name: volume-test
    image: nginx:stable-alpine
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
        - ls
        - /data/lost+found
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: volv
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: longhorn-simple-pvc
examples/simple_pvc.yaml (new file)
@@ -0,0 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-simple-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi
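To try the two new examples above together (paths per this commit):

```
kubectl apply -f examples/simple_pvc.yaml
kubectl apply -f examples/simple_pod.yaml
```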
examples/storageclass.yaml
@@ -3,6 +3,7 @@ apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880" # 48 hours in minutes
uninstall/uninstall.yaml
@@ -58,7 +58,7 @@ spec:
    spec:
      containers:
      - name: longhorn-uninstall
        image: longhornio/longhorn-manager:v0.7.0-rc2
        image: longhornio/longhorn-manager:v0.8.0
        imagePullPolicy: Always
        command:
        - longhorn-manager