From 0dc32d7676dc34a50c5116c6f71a881ad9028efa Mon Sep 17 00:00:00 2001
From: James Oliver
Date: Fri, 17 Aug 2018 17:19:22 -0700
Subject: [PATCH] Document upgrade path from v0.1, v0.2 to v0.3

---
 README.md       |  14 ++-
 docs/upgrade.md | 246 +++++++++++++++++++++++++++++++++---------------
 2 files changed, 180 insertions(+), 80 deletions(-)

diff --git a/README.md b/README.md
index bc2625f..10d7ab9 100644
--- a/README.md
+++ b/README.md
@@ -66,19 +66,23 @@ Both `kube-apiserver` and `kubelet` should have `--feature-gates=MountPropagatio
 2. Google GKE: `/home/kubernetes/flexvolume`
 3. For other distro, please find the correct directory by running `ps aux|grep kubelet` on the host and check the `--volume-plugin-dir` parameter. If there is none, it would be the default value `/usr/libexec/kubernetes/kubelet-plugins/volume/exec/` .
 
+# Upgrading
+
+For instructions on how to upgrade Longhorn v0.1 or v0.2 to v0.3, [see this document](docs/upgrade.md#upgrade).
+
 # Deployment
 
 Create the deployment of Longhorn in your Kubernetes cluster is easy. If you're using Rancher RKE, or other distro with Kubernetes v1.10+ and Mount Propagation enabled, you can just do:
 ```
-kubectl create -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml
 ```
 If you're using Flexvolume driver with other Kubernetes Distro, replace the value of $FLEXVOLUME_DIR in the following command with your own Flexvolume Directory as specified above.
 ```
 FLEXVOLUME_DIR="/home/kubernetes/flexvolume/"
 curl -s https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml|sed "s#^\( *\)value: \"/var/lib/kubelet/volumeplugins\"#\1value: \"${FLEXVOLUME_DIR}\"#g" > longhorn.yaml
-kubectl create -f longhorn.yaml
+kubectl apply -f longhorn.yaml
 ```
 
 For Google Kubernetes Engine (GKE) users, see [here](#google-kubernetes-engine) before proceed.
@@ -145,12 +149,12 @@ Longhorn provides persistent volume directly to Kubernetes through one of the Lo
 Use following command to create a default Longhorn StorageClass named `longhorn`.
 
 ```
-kubectl create -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/examples/storageclass.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/examples/storageclass.yaml
 ```
 
 Now you can create a pod using Longhorn like this:
 ```
-kubectl create -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/examples/pvc.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/examples/pvc.yaml
 ```
 
 The yaml contains two parts:
@@ -214,7 +218,7 @@ We provides two testing purpose backupstore based on NFS server and Minio S3 ser
 Use following command to setup a Minio S3 server for BackupStore after `longhorn-system` was created.
 
 ```
-kubectl create -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/backupstores/minio-backupstore.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/backupstores/minio-backupstore.yaml
 ```
 
 Now set `Settings/General/BackupTarget` to
diff --git a/docs/upgrade.md b/docs/upgrade.md
index 8c83a71..b7cd103 100644
--- a/docs/upgrade.md
+++ b/docs/upgrade.md
@@ -1,100 +1,196 @@
 # Upgrade
 
-Here we would cover how to upgrade from Longhorn v0.2 to Longhorn v0.3 release.
+Here we cover how to upgrade to Longhorn v0.3 from all previous releases.
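+
+Several steps below differ depending on whether you are currently running
+v0.1 (installed in namespace `longhorn`) or v0.2 (installed in namespace
+`longhorn-system`). If you are unsure which version or namespace you have,
+the commands below are one quick way to check; they assume a default
+installation and the exact output will vary:
+```
+# The namespace Longhorn is deployed in
+kubectl get namespaces | grep longhorn
+# The longhorn-manager image tag shows the running version
+kubectl get daemonsets --all-namespaces -o wide | grep longhorn-manager
+```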
 
-## Backup your existing data
-1. It's recommended to create a latest backup for every volume to the backupstore before upgrade.
-2. Make sure no volume is in degraded or faulted state.
-3. Shutdown related Kubernetes pods. Detach all the volumes. Make sure all the volumes are detached before proceeding.
-4. Backup CRD yaml to local directory:
+## Backup Existing Volumes
+
+It's recommended to create a recent backup of every volume to the backupstore
+before upgrade.
+
+Create an on-cluster backupstore if you haven't already. We'll use NFS in this
+example.
 ```
-kubectl -n longhorn-system get volumes.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-volumes.yaml
-kubectl -n longhorn-system get engines.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-engines.yaml
-kubectl -n longhorn-system get replicas.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-replicas.yaml
-kubectl -n longhorn-system get settings.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-settings.yaml
+kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/backupstores/nfs-backupstore.yaml
 ```
-5. Noted the value of BackupTarget in the setting. The user would need to reset after upgrade.
-## Upgrade from v0.2 to v0.3
+On the Settings page, set Backup Target to
+`nfs://longhorn-test-nfs-svc.default:/opt/backupstore` and click `Save`.
 
-Please be aware that the upgrade will incur API downtime.
+Navigate to each volume detail page and click `Take Snapshot`. Click the new
+snapshot and click `Backup`.
 
-### 1. Remove the old manager
+
+## Check For Issues
+
+Make sure no volume is in a degraded or faulted state. Wait for degraded
+volumes to heal and delete/restore faulted volumes before proceeding.
+
+## Detach Volumes
+
+Shut down all Kubernetes pods using Longhorn volumes in order to detach the
+volumes. The easiest way to achieve this is by deleting all workloads. If
+this is not desirable, some workloads may be suspended instead. We cover below
+how each workload can be modified to shut down its pods.
+
+### CronJob
+Edit the cronjob with `kubectl edit cronjob/<name>`.
+Set `.spec.suspend` to `true`.
+Wait for any currently executing jobs to complete, or terminate them by
+deleting relevant pods.
+
+### DaemonSet
+Delete the daemonset with `kubectl delete ds/<name>`.
+There is no way to suspend this workload.
+
+### Deployment
+Edit the deployment with `kubectl edit deploy/<name>`.
+Set `.spec.replicas` to `0`. (A scripted alternative is sketched at the end of
+this section.)
+
+### Job
+Consider allowing the single-run job to complete.
+Otherwise, delete the job with `kubectl delete job/<name>`.
+
+### Pod
+Delete the pod with `kubectl delete pod/<name>`.
+There is no way to suspend a pod not managed by a workload controller.
+
+### ReplicaSet
+Edit the replicaset with `kubectl edit replicaset/<name>`.
+Set `.spec.replicas` to `0`.
+
+### ReplicationController
+Edit the replicationcontroller with `kubectl edit rc/<name>`.
+Set `.spec.replicas` to `0`.
+
+### StatefulSet
+Edit the statefulset with `kubectl edit statefulset/<name>`.
+Set `.spec.replicas` to `0`.
+
+Detach all remaining volumes from the Longhorn UI. These volumes were most
+likely created and attached outside of Kubernetes via the Longhorn UI or REST
+API.
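+
+For Deployments, StatefulSets, ReplicaSets and ReplicationControllers, the
+scale-down can also be scripted instead of editing each object by hand. This
+is only a sketch: `my-app` is a placeholder name, and you should record the
+original replica count yourself so it can be restored after the upgrade.
+```
+# Remember the current replica count, then scale the workload down to zero
+kubectl get deployment my-app -o jsonpath='{.spec.replicas}' > my-app.replicas
+kubectl scale deployment my-app --replicas=0
+
+# After the upgrade, restore the original replica count
+kubectl scale deployment my-app --replicas=$(cat my-app.replicas)
+```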
+
+## Uninstall Old Version
+
+Make note of `BackupTarget` on the `Setting` page. You will need to manually
+set `BackupTarget` after upgrading from either v0.1 or v0.2.
+
+Delete the Longhorn components.
+
+For Longhorn `v0.1`:
+```
+kubectl delete -f https://raw.githubusercontent.com/llparse/longhorn/v0.1/deploy/uninstall-for-upgrade.yaml
+```
+
+For Longhorn `v0.2`:
 ```
 kubectl delete -f https://raw.githubusercontent.com/rancher/longhorn/v0.2/deploy/uninstall-for-upgrade.yaml
 ```
-### 2. Install the new manager
-
-We will use `kubectl apply` instead of `kubectl create` to install the new version of the manager.
-
-If you're using Rancher RKE, or other distro with Kubernetes v1.10+ and Mount Propagation enabled, you can just do:
+If both commands return `Not found` for all components, Longhorn is probably
+deployed in a different namespace. Determine which namespace is in use and
+adjust `NAMESPACE` accordingly:
 ```
-kubectl apply -f https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml
-```
-If you're using Flexvolume driver with other Kubernetes Distro, replace the value of $FLEXVOLUME_DIR in the following command with your own Flexvolume Directory as specified above.
-```
-FLEXVOLUME_DIR="/home/kubernetes/flexvolume/"
-curl -s https://raw.githubusercontent.com/rancher/longhorn/v0.3-rc/deploy/longhorn.yaml|sed "s#^\( *\)value: \"/var/lib/kubelet/volumeplugins\"#\1value: \"${FLEXVOLUME_DIR}\"#g" > longhorn.yaml
-kubectl apply -f longhorn.yaml
+NAMESPACE=longhorn-custom-ns
+curl -sSfL https://raw.githubusercontent.com/rancher/longhorn/v0.1/deploy/uninstall-for-upgrade.yaml|sed "s#^\( *\)namespace: longhorn#\1namespace: ${NAMESPACE}#g" > longhorn.yaml
+kubectl delete -f longhorn.yaml
 ```
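+
+Before moving on, it's worth double-checking that the old manager and driver
+pods are gone while the Longhorn custom resources are still present. This is
+only a sanity check; substitute the namespace you determined above:
+```
+# Expect no longhorn-manager or driver pods to remain in the old namespace
+kubectl -n ${NAMESPACE} get pods
+# The Longhorn CRDs (volumes, engines, replicas, settings) should still exist
+kubectl get crd | grep longhorn.rancher.io
+```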
 
-For Google Kubernetes Engine (GKE) users, see [here](./gke.md) before proceed.
+## Backup Longhorn System
 
-Longhorn Manager and Longhorn Driver will be deployed as daemonsets in a separate namespace called `longhorn-system`, as you can see in the yaml file.
+Back up the Longhorn CRD yaml to a local directory.
 
-When you see those pods has started correctly as follows, you've deployed the Longhorn successfully.
-
-Deployed with CSI driver:
+### v0.1
+Check your backups to make sure Longhorn was running in namespace `longhorn`;
+adjust `NAMESPACE` below if yours differs.
 ```
-# kubectl -n longhorn-system get pod
-NAME                                        READY     STATUS    RESTARTS   AGE
-csi-attacher-0                              1/1       Running   0          6h
-csi-provisioner-0                           1/1       Running   0          6h
-engine-image-ei-57b85e25-8v65d              1/1       Running   0          7d
-engine-image-ei-57b85e25-gjjs6              1/1       Running   0          7d
-engine-image-ei-57b85e25-t2787              1/1       Running   0          7d
-longhorn-csi-plugin-4cpk2                   2/2       Running   0          6h
-longhorn-csi-plugin-ll6mq                   2/2       Running   0          6h
-longhorn-csi-plugin-smlsh                   2/2       Running   0          6h
-longhorn-driver-deployer-7b5bdcccc8-fbncl   1/1       Running   0          6h
-longhorn-manager-7x8x8                      1/1       Running   0          6h
-longhorn-manager-8kqf4                      1/1       Running   0          6h
-longhorn-manager-kln4h                      1/1       Running   0          6h
-longhorn-ui-f849dcd85-cgkgg                 1/1       Running   0          5d
-```
-Or with Flexvolume driver
-```
-# kubectl -n longhorn-system get pod
-NAME                                        READY     STATUS    RESTARTS   AGE
-engine-image-ei-57b85e25-8v65d              1/1       Running   0          7d
-engine-image-ei-57b85e25-gjjs6              1/1       Running   0          7d
-engine-image-ei-57b85e25-t2787              1/1       Running   0          7d
-longhorn-driver-deployer-5469b87b9c-b9gm7   1/1       Running   0          2h
-longhorn-flexvolume-driver-lth5g            1/1       Running   0          2h
-longhorn-flexvolume-driver-tpqf7            1/1       Running   0          2h
-longhorn-flexvolume-driver-v9mrj            1/1       Running   0          2h
-longhorn-manager-7x8x8                      1/1       Running   0          9h
-longhorn-manager-8kqf4                      1/1       Running   0          9h
-longhorn-manager-kln4h                      1/1       Running   0          9h
-longhorn-ui-f849dcd85-cgkgg                 1/1       Running   0          5d
+NAMESPACE=longhorn
+kubectl -n ${NAMESPACE} get volumes.longhorn.rancher.io -o yaml > longhorn-v0.1-backup-volumes.yaml
+kubectl -n ${NAMESPACE} get engines.longhorn.rancher.io -o yaml > longhorn-v0.1-backup-engines.yaml
+kubectl -n ${NAMESPACE} get replicas.longhorn.rancher.io -o yaml > longhorn-v0.1-backup-replicas.yaml
+kubectl -n ${NAMESPACE} get settings.longhorn.rancher.io -o yaml > longhorn-v0.1-backup-settings.yaml
 ```
 
-### 3. Upgrade Engine Images and set BackupTarget
+### v0.2
+Check your backups to make sure Longhorn was running in namespace
+`longhorn-system`; adjust `NAMESPACE` below if yours differs.
+```
+NAMESPACE=longhorn-system
+kubectl -n ${NAMESPACE} get volumes.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-volumes.yaml
+kubectl -n ${NAMESPACE} get engines.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-engines.yaml
+kubectl -n ${NAMESPACE} get replicas.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-replicas.yaml
+kubectl -n ${NAMESPACE} get settings.longhorn.rancher.io -o yaml > longhorn-v0.2-backup-settings.yaml
+```
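+
+Before deleting anything, a quick sanity check that the exported yaml files
+actually captured your resources can save trouble later. The file names
+follow the commands above:
+```
+ls -lh longhorn-v0.*-backup-*.yaml
+# Spot-check one file; it should contain your volume resources
+head -n 20 longhorn-v0.*-backup-volumes.yaml
+```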
 
-1. Wait until the UI is up.
-2. Set the BackupTarget in the setting to the same value as before upgrade.
-3. Make all the volumes are all detached.
-4. Select all the volumes using batch selection. Click batch operation button
-   `Upgrade Engine`, choose the only engine image available in the list. It's
-   the default engine shipped with the manager for this release.
-5. Now attach the volume one by one, to see if the volume works correctly.
+## Delete CRDs in Different Namespace
+
+This is only required for Rancher users running Longhorn App `v0.1`. Delete all
+CRDs from your namespace, which is probably `longhorn`.
+```
+NAMESPACE=longhorn
+kubectl -n ${NAMESPACE} get volumes.longhorn.rancher.io -o yaml | sed "s/\- longhorn.rancher.io//g" | kubectl apply -f -
+kubectl -n ${NAMESPACE} get engines.longhorn.rancher.io -o yaml | sed "s/\- longhorn.rancher.io//g" | kubectl apply -f -
+kubectl -n ${NAMESPACE} get replicas.longhorn.rancher.io -o yaml | sed "s/\- longhorn.rancher.io//g" | kubectl apply -f -
+kubectl -n ${NAMESPACE} get settings.longhorn.rancher.io -o yaml | sed "s/\- longhorn.rancher.io//g" | kubectl apply -f -
+kubectl -n ${NAMESPACE} delete volumes.longhorn.rancher.io --all
+kubectl -n ${NAMESPACE} delete engines.longhorn.rancher.io --all
+kubectl -n ${NAMESPACE} delete replicas.longhorn.rancher.io --all
+kubectl -n ${NAMESPACE} delete settings.longhorn.rancher.io --all
+```
+
+## Install Longhorn v0.3
+
+### Rancher 2.x
+For Rancher users who are running Longhorn v0.1, delete the Longhorn App from
+the `Catalog Apps` screen in the Rancher UI. *Do not click the upgrade button.*
+Launch the Longhorn App template version `0.3.0-rc4`.
+
+### Other Kubernetes Distro
+
+For Longhorn v0.2 users who are not using Rancher, follow
+[the official Longhorn Deployment instructions](../README.md#deployment).
+
+## Restore Longhorn System
+
+This step is only required for Rancher users running Longhorn App `v0.1`.
+
+```
+NAMESPACE=longhorn-system
+sed "s#^\( *\)namespace: .*#\1namespace: ${NAMESPACE}#g" longhorn-v0.1-backup-settings.yaml | kubectl apply -f -
+sed "s#^\( *\)namespace: .*#\1namespace: ${NAMESPACE}#g" longhorn-v0.1-backup-replicas.yaml | kubectl apply -f -
+sed "s#^\( *\)namespace: .*#\1namespace: ${NAMESPACE}#g" longhorn-v0.1-backup-engines.yaml | kubectl apply -f -
+sed "s#^\( *\)namespace: .*#\1namespace: ${NAMESPACE}#g" longhorn-v0.1-backup-volumes.yaml | kubectl apply -f -
+```
+
+## Access UI and Set BackupTarget
+
+Wait until the longhorn-ui pod is `Running`:
+```
+kubectl -n longhorn-system get pod -w
+```
+
+[Access the UI](../README.md#access-the-ui).
+
+On `Setting > General`, set `Backup Target` to the backup target used in
+the previous version. In our example, this is
+`nfs://longhorn-test-nfs-svc.default:/opt/backupstore`.
+
+## Upgrade Engine Images
+
+Ensure all volumes are detached. If any are still attached, detach them now
+and wait until they are in the `Detached` state.
+
+Select all the volumes using batch selection, click the batch operation button
+`Upgrade Engine`, and choose the only engine image available in the list. It is
+the default engine shipped with the manager for this release.
+
+## Attach Volumes
+
+Now we will resume all workloads by reversing the changes we made to detach
+the volumes. Any volume not part of a Kubernetes workload or pod must be
+attached manually.
 
 ## Note
 
-Upgrade is always tricky. Keep backups for the volumes are critical.
-
-If you have any issues, please reported it at
-https://github.com/rancher/longhorn/issues , with your backup yaml files as well
-as manager logs.
+Upgrade is always tricky. Keeping recent backups for volumes is critical.
+If you have any issues, please report them at
+https://github.com/rancher/longhorn/issues and include your backup yaml files
+as well as manager logs.
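+
+One way to collect manager logs for an issue report is sketched below. It
+assumes the default v0.3 namespace `longhorn-system` and the default
+`app=longhorn-manager` label; adjust both if your deployment differs.
+```
+# List the manager pods, then dump the logs of each one into a local file
+kubectl -n longhorn-system get pods -l app=longhorn-manager
+for pod in $(kubectl -n longhorn-system get pods -l app=longhorn-manager -o name); do
+  kubectl -n longhorn-system logs ${pod} > $(basename ${pod}).log
+done
+```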