From 0dc32d7676dc34a50c5116c6f71a881ad9028efa Mon Sep 17 00:00:00 2001
From: James Oliver
Date: Fri, 17 Aug 2018 17:19:22 -0700
Subject: [PATCH] Document upgrade path from v0.1, v0.2 to v0.3

---
 README.md       |  14 ++-
 docs/upgrade.md | 246 +++++++++++++++++++++++++++++++++---------------
 2 files changed, 180 insertions(+), 80 deletions(-)

diff --git a/README.md b/README.md
index bc2625f..10d7ab9 100644
--- a/README.md
+++ b/README.md
@@ -66,19 +66,23 @@ Both `kube-apiserver` and `kubelet` should have `--feature-gates=MountPropagatio
 2. Google GKE: `/home/kubernetes/flexvolume`
 3. For other distro, please find the correct directory by running `ps aux|grep kubelet` on the host and check the `--volume-plugin-dir` parameter. If there is none, it would be the default value `/usr/libexec/kubernetes/kubelet-plugins/volume/exec/` .
 
+# Upgrading
+
+For instructions on how to upgrade Longhorn v0.1 or v0.2 to v0.3, [see this document](docs/upgrade.md#upgrade).
+
 # Deployment
 
 Create the deployment of Longhorn in your Kubernetes cluster is easy. If you're using Rancher RKE, or other distro with Kubernetes v1.10+ and Mount Propagation enabled, you can just do:
 ```
-kubectl create -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml
 ```
 If you're using Flexvolume driver with other Kubernetes Distro, replace the value of $FLEXVOLUME_DIR in the following command with your own Flexvolume Directory as specified above.
 ```
 FLEXVOLUME_DIR="/home/kubernetes/flexvolume/"
 curl -s https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml|sed "s#^\( *\)value: \"/var/lib/kubelet/volumeplugins\"#\1value: \"${FLEXVOLUME_DIR}\"#g" > longhorn.yaml
-kubectl create -f longhorn.yaml
+kubectl apply -f longhorn.yaml
 ```
 
 For Google Kubernetes Engine (GKE) users, see [here](#google-kubernetes-engine) before proceed.
@@ -145,12 +149,12 @@ Longhorn provides persistent volume directly to Kubernetes through one of the Lo
 Use following command to create a default Longhorn StorageClass named `longhorn`.
 
 ```
-kubectl create -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/examples/storageclass.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/examples/storageclass.yaml
 ```
 
 Now you can create a pod using Longhorn like this:
 ```
-kubectl create -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/examples/pvc.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/examples/pvc.yaml
 ```
 
 The yaml contains two parts:
@@ -214,7 +218,7 @@ We provides two testing purpose backupstore based on NFS server and Minio S3 ser
 Use following command to setup a Minio S3 server for BackupStore after `longhorn-system` was created.
 
 ```
-kubectl create -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/backupstores/minio-backupstore.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/backupstores/minio-backupstore.yaml
 ```
 
 Now set `Settings/General/BackupTarget` to
diff --git a/docs/upgrade.md b/docs/upgrade.md
index 8c83a71..b7cd103 100644
--- a/docs/upgrade.md
+++ b/docs/upgrade.md
@@ -1,100 +1,196 @@
 # Upgrade
 
-Here we would cover how to upgrade from Longhorn v0.2 to Longhorn v0.3 release.
+Here we cover how to upgrade to Longhorn v0.3 from all previous releases.
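+
+Several steps below differ depending on whether you are currently running
+v0.1 (installed in namespace `longhorn`) or v0.2 (installed in namespace
+`longhorn-system`). If you are unsure which version or namespace you have,
+the commands below are one quick way to check; they assume a default
+installation and the exact output will vary:
+```
+# The namespace Longhorn is deployed in
+kubectl get namespaces | grep longhorn
+# The longhorn-manager image tag shows the running version
+kubectl get daemonsets --all-namespaces -o wide | grep longhorn-manager
+```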
 
-## Backup your existing data
-1. It's recommended to create a latest backup for every volume to the backupstore before upgrade.
-2. Make sure no volume is in degraded or faulted state.
-3. Shutdown related Kubernetes pods. Detach all the volumes. Make sure all the volumes are detached before proceeding.
-4. Backup CRD yaml to local directory:
+## Backup Existing Volumes
+
+It's recommended to create a recent backup of every volume to the backupstore
+before upgrade.
+
+Create an on-cluster backupstore if you haven't already. We'll use NFS in this
+example.
 ```
-kubectl -n longhorn-system get volumes.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-volumes.yaml
-kubectl -n longhorn-system get engines.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-engines.yaml
-kubectl -n longhorn-system get replicas.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-replicas.yaml
-kubectl -n longhorn-system get settings.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-settings.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/backupstores/nfs-backupstore.yaml
 ```
-5. Noted the value of BackupTarget in the setting. The user would need to reset after upgrade.
-## Upgrade from v0.2 to v0.3
+On the Settings page, set Backup Target to
+`nfs://longhorn-test-nfs-svc.default:/opt/backupstore` and click `Save`.
 
-Please be aware that the upgrade will incur API downtime.
+Navigate to each volume detail page and click `Take Snapshot`. Click the new
+snapshot and click `Backup`.
 
-### 1. Remove the old manager
+
+## Check For Issues
+
+Make sure no volume is in a degraded or faulted state. Wait for degraded
+volumes to heal and delete/restore faulted volumes before proceeding.
+
+## Detach Volumes
+
+Shut down all Kubernetes pods using Longhorn volumes in order to detach the
+volumes. The easiest way to achieve this is by deleting all workloads. If
+this is not desirable, some workloads may be suspended instead. We cover below
+how each workload can be modified to shut down its pods.
+
+### CronJob
+Edit the cronjob with `kubectl edit cronjob/<name>`.
+Set `.spec.suspend` to `true`.
+Wait for any currently executing jobs to complete, or terminate them by
+deleting relevant pods.
+
+### DaemonSet
+Delete the daemonset with `kubectl delete ds/<name>`.
+There is no way to suspend this workload.
+
+### Deployment
+Edit the deployment with `kubectl edit deploy/<name>`.
+Set `.spec.replicas` to `0`. (A scripted alternative is sketched at the end of
+this section.)
+
+### Job
+Consider allowing the single-run job to complete.
+Otherwise, delete the job with `kubectl delete job/<name>`.
+
+### Pod
+Delete the pod with `kubectl delete pod/<name>`.
+There is no way to suspend a pod not managed by a workload controller.
+
+### ReplicaSet
+Edit the replicaset with `kubectl edit replicaset/<name>`.
+Set `.spec.replicas` to `0`.
+
+### ReplicationController
+Edit the replicationcontroller with `kubectl edit rc/<name>`.
+Set `.spec.replicas` to `0`.
+
+### StatefulSet
+Edit the statefulset with `kubectl edit statefulset/<name>`.
+Set `.spec.replicas` to `0`.
+
+Detach all remaining volumes from the Longhorn UI. These volumes were most
+likely created and attached outside of Kubernetes via the Longhorn UI or REST
+API.
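+
+For Deployments, StatefulSets, ReplicaSets and ReplicationControllers, the
+scale-down can also be scripted instead of editing each object by hand. This
+is only a sketch: `my-app` is a placeholder name, and you should record the
+original replica count yourself so it can be restored after the upgrade.
+```
+# Remember the current replica count, then scale the workload down to zero
+kubectl get deployment my-app -o jsonpath='{.spec.replicas}' > my-app.replicas
+kubectl scale deployment my-app --replicas=0
+
+# After the upgrade, restore the original replica count
+kubectl scale deployment my-app --replicas=$(cat my-app.replicas)
+```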
+
+## Uninstall Old Version
+
+Make note of `BackupTarget` on the `Setting` page. You will need to manually
+set `BackupTarget` after upgrading from either v0.1 or v0.2.
+
+Delete the Longhorn components.
+
+For Longhorn `v0.1`:
+```
+kubectl delete -f https://raw.githubusercontent.com/llparse/longhorn/v0.1/deploy/uninstall-for-upgrade.yaml
+```
+
+For Longhorn `v0.2`:
 ```
 kubectl delete -f https://raw.githubusercontent.com/rancher/longhorn/v0.2/deploy/uninstall-for-upgrade.yaml
 ```
-### 2. Install the new manager
-
-We will use `kubectl apply` instead of `kubectl create` to install the new version of the manager.
-
-If you're using Rancher RKE, or other distro with Kubernetes v1.10+ and Mount Propagation enabled, you can just do:
+If both commands return `Not found` for all components, Longhorn is probably
+deployed in a different namespace. Determine which namespace is in use and
+adjust `NAMESPACE` accordingly:
 ```
-kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml
-```
-If you're using Flexvolume driver with other Kubernetes Distro, replace the value of $FLEXVOLUME_DIR in the following command with your own Flexvolume Directory as specified above.
-```
-FLEXVOLUME_DIR="/home/kubernetes/flexvolume/"
-curl -s https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml|sed "s#^\( *\)value: \"/var/lib/kubelet/volumeplugins\"#\1value: \"${FLEXVOLUME_DIR}\"#g" > longhorn.yaml
-kubectl apply -f longhorn.yaml
+NAMESPACE=longhorn-custom-ns
+curl -sSfL https://raw.githubusercontent.com/rancher/longhorn/v0.1/deploy/uninstall-for-upgrade.yaml|sed "s#^\( *\)namespace: longhorn#\1namespace: ${NAMESPACE}#g" > longhorn.yaml
+kubectl delete -f longhorn.yaml
 ```
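+
+Before moving on, it's worth double-checking that the old manager and driver
+pods are gone while the Longhorn custom resources are still present. This is
+only a sanity check; substitute the namespace you determined above:
+```
+# Expect no longhorn-manager or driver pods to remain in the old namespace
+kubectl -n ${NAMESPACE} get pods
+# The Longhorn CRDs (volumes, engines, replicas, settings) should still exist
+kubectl get crd | grep longhorn.rancher.io
+```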
 
-For Google Kubernetes Engine (GKE) users, see [here](./gke.md) before proceed.
+## Backup Longhorn System
 
-Longhorn Manager and Longhorn Driver will be deployed as daemonsets in a separate namespace called `longhorn-system`, as you can see in the yaml file.
+Back up the Longhorn CRD yaml to a local directory.
 
-When you see those pods has started correctly as follows, you've deployed the Longhorn successfully.
-
-Deployed with CSI driver:
+### v0.1
+Check your backups to make sure Longhorn was running in namespace `longhorn`;
+adjust `NAMESPACE` below if yours differs.
 ```
-# kubectl -n longhorn-system get pod
-NAME                                        READY     STATUS    RESTARTS   AGE
-csi-attacher-0                              1/1       Running   0          6h
-csi-provisioner-0                           1/1       Running   0          6h
-engine-image-ei-57b85e25-8v65d              1/1       Running   0          7d
-engine-image-ei-57b85e25-gjjs6              1/1       Running   0          7d
-engine-image-ei-57b85e25-t2787              1/1       Running   0          7d
-longhorn-csi-plugin-4cpk2                   2/2       Running   0          6h
-longhorn-csi-plugin-ll6mq                   2/2       Running   0          6h
-longhorn-csi-plugin-smlsh                   2/2       Running   0          6h
-longhorn-driver-deployer-7b5bdcccc8-fbncl   1/1       Running   0          6h
-longhorn-manager-7x8x8                      1/1       Running   0          6h
-longhorn-manager-8kqf4                      1/1       Running   0          6h
-longhorn-manager-kln4h                      1/1       Running   0          6h
-longhorn-ui-f849dcd85-cgkgg                 1/1       Running   0          5d
-```
-Or with Flexvolume driver
-```
-# kubectl -n longhorn-system get pod
-NAME                                        READY     STATUS    RESTARTS   AGE
-engine-image-ei-57b85e25-8v65d              1/1       Running   0          7d
-engine-image-ei-57b85e25-gjjs6              1/1       Running   0          7d
-engine-image-ei-57b85e25-t2787              1/1       Running   0          7d
-longhorn-driver-deployer-5469b87b9c-b9gm7   1/1       Running   0          2h
-longhorn-flexvolume-driver-lth5g            1/1       Running   0          2h
-longhorn-flexvolume-driver-tpqf7            1/1       Running   0          2h
-longhorn-flexvolume-driver-v9mrj            1/1       Running   0          2h
-longhorn-manager-7x8x8                      1/1       Running   0          9h
-longhorn-manager-8kqf4                      1/1       Running   0          9h
-longhorn-manager-kln4h                      1/1       Running   0          9h
-longhorn-ui-f849dcd85-cgkgg                 1/1       Running   0          5d
+NAMESPACE=longhorn
+kubectl -n ${NAMESPACE} get volumes.longhorn.rancher.io -o yaml > longhorn-v0.1-backup-volumes.yaml
+kubectl -n ${NAMESPACE} get engines.longhorn.rancher.io -o yaml > longhorn-v0.1-backup-engines.yaml
+kubectl -n ${NAMESPACE} get replicas.longhorn.rancher.io -o yaml > longhorn-v0.1-backup-replicas.yaml
+kubectl -n ${NAMESPACE} get settings.longhorn.rancher.io -o yaml > longhorn-v0.1-backup-settings.yaml
 ```
 
-### 3. Upgrade Engine Images and set BackupTarget
+### v0.2
+Check your backups to make sure Longhorn was running in namespace
+`longhorn-system`; adjust `NAMESPACE` below if yours differs.
+```
+NAMESPACE=longhorn-system
+kubectl -n ${NAMESPACE} get volumes.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-volumes.yaml
+kubectl -n ${NAMESPACE} get engines.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-engines.yaml
+kubectl -n ${NAMESPACE} get replicas.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-replicas.yaml
+kubectl -n ${NAMESPACE} get settings.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-settings.yaml
+```
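+
+Before deleting anything, a quick sanity check that the exported yaml files
+actually captured your resources can save trouble later. The file names
+follow the commands above:
+```
+ls -lh longhorn-v0.*-backup-*.yaml
+# Spot-check one file; it should contain your volume resources
+head -n 20 longhorn-v0.*-backup-volumes.yaml
+```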
 
-1. Wait until the UI is up.
-2. Set the BackupTarget in the setting to the same value as before upgrade.
-3. Make all the volumes are all detached.
-4. Select all the volumes using batch selection. Click batch operation button
-   `Upgrade Engine`, choose the only engine image available in the list. It's
-   the default engine shipped with the manager for this release.
-5. Now attach the volume one by one, to see if the volume works correctly.
+## Delete CRDs in Different Namespace
+
+This is only required for Rancher users running Longhorn App `v0.1`. Delete all
+CRDs from your namespace, which is probably `longhorn`.
+```
+NAMESPACE=longhorn
+kubectl -n ${NAMESPACE} get volumes.longhorn.rancher.io -o yaml | sed "s/\- longhorn.rancher.io//g" | kubectl apply -f -
+kubectl -n ${NAMESPACE} get engines.longhorn.rancher.io -o yaml | sed "s/\- longhorn.rancher.io//g" | kubectl apply -f -
+kubectl -n ${NAMESPACE} get replicas.longhorn.rancher.io -o yaml | sed "s/\- longhorn.rancher.io//g" | kubectl apply -f -
+kubectl -n ${NAMESPACE} get settings.longhorn.rancher.io -o yaml | sed "s/\- longhorn.rancher.io//g" | kubectl apply -f -
+kubectl -n ${NAMESPACE} delete volumes.longhorn.rancher.io --all
+kubectl -n ${NAMESPACE} delete engines.longhorn.rancher.io --all
+kubectl -n ${NAMESPACE} delete replicas.longhorn.rancher.io --all
+kubectl -n ${NAMESPACE} delete settings.longhorn.rancher.io --all
+```
+
+## Install Longhorn v0.3
+
+### Rancher 2.x
+For Rancher users who are running Longhorn v0.1, delete the Longhorn App from
+the `Catalog Apps` screen in the Rancher UI. *Do not click the upgrade button.*
+Launch the Longhorn App template version `0.3.0-rc4`.
+
+### Other Kubernetes Distro
+
+For Longhorn v0.2 users who are not using Rancher, follow
+[the official Longhorn Deployment instructions](../README.md#deployment).
+
+## Restore Longhorn System
+
+This step is only required for Rancher users running Longhorn App `v0.1`.
+
+```
+NAMESPACE=longhorn-system
+sed "s#^\( *\)namespace: .*#\1namespace: ${NAMESPACE}#g" longhorn-v0.1-backup-settings.yaml | kubectl apply -f -
+sed "s#^\( *\)namespace: .*#\1namespace: ${NAMESPACE}#g" longhorn-v0.1-backup-replicas.yaml | kubectl apply -f -
+sed "s#^\( *\)namespace: .*#\1namespace: ${NAMESPACE}#g" longhorn-v0.1-backup-engines.yaml | kubectl apply -f -
+sed "s#^\( *\)namespace: .*#\1namespace: ${NAMESPACE}#g" longhorn-v0.1-backup-volumes.yaml | kubectl apply -f -
+```
+
+## Access UI and Set BackupTarget
+
+Wait until the longhorn-ui pod is `Running`:
+```
+kubectl -n longhorn-system get pod -w
+```
+
+[Access the UI](../README.md#access-the-ui).
+
+On `Setting > General`, set `Backup Target` to the backup target used in
+the previous version. In our example, this is
+`nfs://longhorn-test-nfs-svc.default:/opt/backupstore`.
+
+## Upgrade Engine Images
+
+Ensure all volumes are detached. If any are still attached, detach them now
+and wait until they are in the `Detached` state.
+
+Select all the volumes using batch selection, click the batch operation button
+`Upgrade Engine`, and choose the only engine image available in the list. It is
+the default engine shipped with the manager for this release.
+
+## Attach Volumes
+
+Now we will resume all workloads by reversing the changes we made to detach
+the volumes. Any volume not part of a Kubernetes workload or pod must be
+attached manually.
 
 ## Note
 
-Upgrade is always tricky. Keep backups for the volumes are critical.
-
-If you have any issues, please reported it at
-https://github.com/rancher/longhorn/issues , with your backup yaml files as well
-as manager logs.
+Upgrade is always tricky. Keeping recent backups for volumes is critical.
+If you have any issues, please report them at
+https://github.com/rancher/longhorn/issues and include your backup yaml files
+as well as manager logs.
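+
+One way to collect manager logs for an issue report is sketched below. It
+assumes the default v0.3 namespace `longhorn-system` and the default
+`app=longhorn-manager` label; adjust both if your deployment differs.
+```
+# List the manager pods, then dump the logs of each one into a local file
+kubectl -n longhorn-system get pods -l app=longhorn-manager
+for pod in $(kubectl -n longhorn-system get pods -l app=longhorn-manager -o name); do
+  kubectl -n longhorn-system logs ${pod} > $(basename ${pod}).log
+done
+```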