# Troubleshooting

## Common issues
### Volume can be attached/detached from the UI, but a Kubernetes Pod, StatefulSet, etc. cannot use it

Check whether the volume plugin directory has been set correctly.

By default, Kubernetes uses `/usr/libexec/kubernetes/kubelet-plugins/volume/exec/` as the directory for volume plugin drivers, as stated in the [official document](https://github.com/kubernetes/community/blob/master/contributors/devel/flexvolume.md#prerequisites).

However, some vendors change this directory. For example, GKE uses `/home/kubernetes/flexvolume`, and RKE uses `/var/lib/kubelet/volumeplugins`.

You can find the correct directory by running `ps aux | grep kubelet` on the host and checking the `--volume-plugin-dir` parameter. If it is not set, the default `/usr/libexec/kubernetes/kubelet-plugins/volume/exec/` is used.
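
The lookup described above can be scripted. The following is a minimal sketch, not part of Longhorn: the function name `plugin_dir_from_cmdline` is hypothetical, and it assumes the flag is written in the `--volume-plugin-dir=<path>` form (a space-separated value is not handled).

```shell
# Hypothetical helper: extract the value of --volume-plugin-dir from a
# kubelet command line passed as the first argument.
# Assumes the "--flag=value" form; falls back to the Kubernetes default.
plugin_dir_from_cmdline() {
  default_dir="/usr/libexec/kubernetes/kubelet-plugins/volume/exec/"
  # Put each argument on its own line, then keep only the flag's value.
  dir=$(printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^--volume-plugin-dir=//p' | head -n 1)
  printf '%s\n' "${dir:-$default_dir}"
}
```

On a host, it could be fed the kubelet process arguments, for example `plugin_dir_from_cmdline "$(ps -eo args | grep '[k]ubelet')"`. The `[k]` in the pattern keeps `grep` from matching its own process.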

## Troubleshooting guide

Longhorn has a few components: Manager, Engine, Driver, and UI. By default, all of these components run as pods in the `longhorn-system` namespace inside the Kubernetes cluster.
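
To see which of these pods are present and healthy, you can list the pods in the namespace. This is a minimal sketch, assuming `kubectl` is already configured for the cluster; the function name `list_longhorn_pods` is illustrative only.

```shell
# Hypothetical helper: list the Longhorn pods and the nodes they run on.
# The namespace defaults to longhorn-system and can be overridden.
list_longhorn_pods() {
  kubectl -n "${1:-longhorn-system}" get pods -o wide
}
```

Pods that are not in the `Running` state are usually a good first place to look.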

### UI
The Longhorn UI is a good starting point for troubleshooting. For example, if Kubernetes cannot mount a volume correctly, stop the workload, then attach and mount the volume manually on one node and access its content to check whether the volume is intact.

The event logs in the UI dashboard also provide information about probable issues. Check for event logs at the `Warning` level.

### Manager and engines
You can get the logs from Longhorn Manager and the Engines to help with troubleshooting. The most useful logs are those from `longhorn-manager-xxx` and those inside the Longhorn Engine, e.g. `<volname>-e-xxxx` and `<volname>-r-xxxx`.

Since multiple Longhorn Managers normally run at the same time, we recommend using [kubetail](https://github.com/johanhaleby/kubetail), a tool for following the logs of multiple pods at once. You can use:
```
kubetail longhorn-manager -n longhorn-system
```
This tracks the manager logs in real time.
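
If kubetail is not available, plain `kubectl` can print recent logs from each manager pod in turn. This is a minimal sketch under the assumption that the manager pods are named `longhorn-manager-xxx` as described above; the function name `manager_logs` and the 100-line tail are arbitrary choices.

```shell
# Hypothetical helper: print the last 100 log lines of every pod whose
# name starts with "longhorn-manager-" in the given namespace.
manager_logs() {
  ns="${1:-longhorn-system}"
  # "kubectl get pods -o name" prints one "pod/<name>" entry per line.
  for pod in $(kubectl -n "$ns" get pods -o name | grep '^pod/longhorn-manager-'); do
    echo "==== $pod ===="
    kubectl -n "$ns" logs --tail=100 "$pod"
  done
}
```

Unlike kubetail, this prints a snapshot rather than a live, interleaved stream.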

### CSI driver

For the CSI driver, check the logs of `csi-attacher-0` and `csi-provisioner-0`, as well as the containers in `longhorn-csi-plugin-xxx`.
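
The checks above could be collected in one place. This is a rough sketch, assuming the pod names given above and the default `longhorn-system` namespace; `csi_logs` is a hypothetical name, and `--all-containers` is used so that every container in a pod is included.

```shell
# Hypothetical helper: dump logs from the CSI-related pods named above.
csi_logs() {
  ns="${1:-longhorn-system}"
  kubectl -n "$ns" logs csi-attacher-0 --all-containers
  kubectl -n "$ns" logs csi-provisioner-0 --all-containers
  # One longhorn-csi-plugin pod may exist per node, so loop over them.
  for pod in $(kubectl -n "$ns" get pods -o name | grep '^pod/longhorn-csi-plugin-'); do
    kubectl -n "$ns" logs "$pod" --all-containers
  done
}
```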

### Flexvolume driver

For the Flexvolume driver, check the kubelet logs first. The Flexvolume driver itself doesn't run inside a container; the kubelet process is responsible for calling the driver.

If kubelet is running natively on the node, use the following command to get the log:
```
journalctl -u kubelet
```

If kubelet is running as a container (e.g. in RKE), use the following command instead:
```
docker logs kubelet
```
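
Kubelet logs are often long, so it can help to narrow them down. The filter below is only a sketch: it assumes that relevant lines mention `flexvolume` or `longhorn`, which may not hold for every kubelet version, and the function name `filter_driver_lines` is hypothetical.

```shell
# Hypothetical helper: keep only log lines read from stdin that mention
# "flexvolume" or "longhorn", ignoring case.
filter_driver_lines() {
  grep -i -E 'flexvolume|longhorn'
}

# Example usage on a node (uncomment the line that matches your setup):
# journalctl -u kubelet --no-pager | filter_driver_lines
# docker logs kubelet 2>&1 | filter_driver_lines
```

The `2>&1` in the Docker variant matters because `docker logs` replays the container's stderr stream on stderr.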

For more detailed Longhorn Flexvolume logs, run the following command on the node, or inside the container if kubelet is running as a container (e.g. in RKE):
```
touch /var/log/longhorn_driver.log
```