# Troubleshooting
## Common issues
### Volume can be attached/detached from UI, but Kubernetes Pod/StatefulSet etc cannot use it
#### Using with Flexvolume Plugin
Check if the volume plugin directory has been set correctly. It is detected automatically unless the user has set it explicitly.
By default, Kubernetes uses `/usr/libexec/kubernetes/kubelet-plugins/volume/exec/`, as stated in the [official document](https://github.com/kubernetes/community/blob/master/contributors/devel/flexvolume.md#prerequisites).
Some vendors choose to change the directory for various reasons. For example, GKE uses `/home/kubernetes/flexvolume` instead.
You can find the correct directory by running `ps aux | grep kubelet` on the host and checking the `--volume-plugin-dir` parameter. If it is not set, the default `/usr/libexec/kubernetes/kubelet-plugins/volume/exec/` is used.
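As a sketch, the flag can be pulled out of the kubelet command line with a small pipeline. The sample command line below is illustrative, not taken from a real node:

```shell
# Illustrative kubelet command line (an assumption, not real node output).
kubelet_cmdline='/usr/bin/kubelet --v=2 --volume-plugin-dir=/home/kubernetes/flexvolume'

# Extract the value of --volume-plugin-dir, if present.
echo "$kubelet_cmdline" | grep -o -- '--volume-plugin-dir=[^ ]*' | cut -d= -f2
```

On a real host, feed the output of `ps aux | grep kubelet` through the same `grep`/`cut` pipeline.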
## Troubleshooting guide
Longhorn has a few components: Manager, Engine, Driver, and UI. By default, all of these components run as pods in the `longhorn-system` namespace inside the Kubernetes cluster.
Most of the logs are included in the Support Bundle. You can click the Generate Support Bundle link at the bottom of the UI to download a zip file that contains the Longhorn-related configuration and logs.
One exception is `dmesg`, which needs to be retrieved by the user on each node.
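For example, one way to collect it is to dump `dmesg` into a per-node file. The naming scheme below is just a suggestion, and `dmesg` itself may require root:

```shell
# Build a file name that identifies the node (naming is an assumption).
out="dmesg-$(hostname).log"
echo "$out"

# Then, on each node, as root:
#   dmesg > "$out"
```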
### UI
Making use of the Longhorn UI is a good start for troubleshooting. For example, if Kubernetes cannot mount one volume correctly, after stopping the workload, try to attach and mount that volume manually on one node and access the content to check whether the volume is intact.
Also, the event logs in the UI dashboard provide information about possible issues. Check for event logs at the `Warning` level.
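The same events are visible from the command line; on a live cluster you could run `kubectl get events -n longhorn-system --field-selector type=Warning`. As a self-contained sketch, the filtering step looks like this on a sample events listing (the rows below are fabricated for illustration):

```shell
# Sample `kubectl get events` output (fabricated for illustration).
events='LAST SEEN   TYPE      REASON      OBJECT
2m          Normal    Pulled      pod/longhorn-manager-abcde
1m          Warning   Unhealthy   pod/instance-manager-e-xxxx'

# Keep only rows whose TYPE column is Warning.
echo "$events" | awk '$2 == "Warning"'
```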
### Manager and engines
You can get logs from the Longhorn Manager and Engines to help with troubleshooting. The most useful logs are those of `longhorn-manager-xxx` and the logs inside the Longhorn instance managers, e.g. `instance-manager-e-xxxx` and `instance-manager-r-xxxx`.
Since multiple Longhorn Managers normally run at the same time, we recommend [kubetail](https://github.com/johanhaleby/kubetail), a great tool for keeping track of the logs of multiple pods. You can use:
```
kubetail longhorn-manager -n longhorn-system
```
to track the manager logs in real time.
### CSI driver
For the CSI driver, check the logs of `csi-attacher-0` and `csi-provisioner-0`, as well as the containers in `longhorn-csi-plugin-xxx`.
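The pod-name suffixes above are placeholders. A sketch that prints the log commands to run for each of these pods (assuming `kubectl` and the default `longhorn-system` namespace):

```shell
# Print the log command for each CSI pod (the -xxx suffix is a placeholder;
# substitute the real pod name from `kubectl get pods -n longhorn-system`).
for pod in csi-attacher-0 csi-provisioner-0 longhorn-csi-plugin-xxx; do
  echo "kubectl logs -n longhorn-system $pod --all-containers"
done
```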
### Flexvolume driver
For the Flexvolume driver, first check where the driver has been installed on the node; the log of `longhorn-driver-deployer-xxxx` contains that information.
Then check the kubelet logs. The Flexvolume driver itself doesn't run inside a container; it runs alongside the kubelet process.
If kubelet is running natively on the node, you can use the following command to get the log:
```
journalctl -u kubelet
```
Or if kubelet is running as a container (e.g. in RKE), use the following command instead:
```
docker logs kubelet
```
For even more detailed logs from the Longhorn Flexvolume driver, run the following command on the node, or inside the container if kubelet is running as a container (e.g. in RKE):
```
touch /var/log/longhorn_driver.log
```
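Once the file exists, the driver should write its verbose logs there, and you can follow them with `tail`. The sketch below uses a temporary path so it is runnable anywhere; on a node the path is `/var/log/longhorn_driver.log`:

```shell
# Temp path stands in for /var/log/longhorn_driver.log on a real node.
log=/tmp/longhorn_driver.log
touch "$log"

# Show the most recent entries; on a node, use `tail -f` to stream new ones.
tail -n 20 "$log"
```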