# Troubleshooting
## Common issues
### Volume can be attached/detached from the UI, but Kubernetes Pods, StatefulSets, etc. cannot use it
Check whether the volume plugin directory has been set correctly.
By default, Kubernetes uses `/usr/libexec/kubernetes/kubelet-plugins/volume/exec/` as the directory for volume plugin drivers, as stated in the [official document](https://github.com/kubernetes/community/blob/master/contributors/devel/flexvolume.md#prerequisites).
Some vendors change this directory, however. For example, GKE uses `/home/kubernetes/flexvolume`, and RKE uses `/var/lib/kubelet/volumeplugins`.
You can find the correct directory by running `ps aux|grep kubelet` on the host and checking the `--volume-plugin-dir` parameter. If it is not set, the default `/usr/libexec/kubernetes/kubelet-plugins/volume/exec/` is used.
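The lookup described above can be scripted. The following is a minimal sketch, not part of Longhorn: `plugin_dir_from_cmdline` is a hypothetical helper name, and it assumes the flag appears on the kubelet command line either as `--volume-plugin-dir=<path>` or as `--volume-plugin-dir <path>`.

```shell
#!/bin/sh
# Default directory used when kubelet is started without --volume-plugin-dir.
DEFAULT_PLUGIN_DIR="/usr/libexec/kubernetes/kubelet-plugins/volume/exec/"

# plugin_dir_from_cmdline (hypothetical helper): reads a kubelet command line
# on stdin and prints the value of --volume-plugin-dir, or the default when
# the flag is absent.
plugin_dir_from_cmdline() {
    # Put every argument on its own line, then pick out the flag value.
    dir=$(tr -s ' ' '\n' | awk '
        /^--volume-plugin-dir=/ { sub(/^--volume-plugin-dir=/, ""); print; exit }
        $0 == "--volume-plugin-dir" { if ((getline) > 0) print; exit }
    ')
    printf '%s\n' "${dir:-$DEFAULT_PLUGIN_DIR}"
}
```

On a host, it could be used as `ps aux | grep '[k]ubelet' | plugin_dir_from_cmdline`; the `[k]` pattern keeps `grep` from matching its own process.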
## Troubleshooting guide
Longhorn consists of several components: Manager, Engine, Driver, and UI. By default, all of these components run as pods in the `longhorn-system` namespace inside the Kubernetes cluster.
### UI
The Longhorn UI is a good starting point for troubleshooting. For example, if Kubernetes cannot mount a volume correctly, stop the workload, then try to attach and mount that volume manually on one node and access its content to check whether the volume is intact.
Also, the event logs in the UI dashboard provide some information about possible issues. Check for event logs at the `Warning` level.
### Manager and engines
You can get the logs from Longhorn Manager and the Engines to help with troubleshooting. The most useful logs are those from `longhorn-manager-xxx` and those inside the Longhorn Engine, e.g. `<volname>-e-xxxx` and `<volname>-r-xxxx`.
Since there are normally multiple Longhorn Manager pods running at the same time, we recommend using [kubetail](https://github.com/johanhaleby/kubetail), a tool for following the logs of multiple pods at once. You can use:
```
kubetail longhorn-manager -n longhorn-system
```
This tracks the manager logs in real time.
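If kubetail is not installed, plain `kubectl` can collect the same logs pod by pod. The sketch below is one possible approach under stated assumptions: `manager_pods` and `dump_manager_logs` are hypothetical helper names, and the manager pods are assumed to follow the `longhorn-manager-xxx` naming mentioned above.

```shell
#!/bin/sh
# manager_pods (hypothetical helper): reads `kubectl get pods -o name` output
# on stdin (lines such as "pod/<name>") and prints the bare names of the
# Longhorn Manager pods only.
manager_pods() {
    sed -n 's|^[^/]*/\(longhorn-manager-.*\)$|\1|p'
}

# dump_manager_logs (hypothetical helper): prints the last 200 log lines of
# every Longhorn Manager pod in the longhorn-system namespace.
dump_manager_logs() {
    kubectl -n longhorn-system get pods -o name | manager_pods |
    while read -r pod; do
        echo "==== $pod ===="
        kubectl -n longhorn-system logs "$pod" --tail=200
    done
}
```

Calling `dump_manager_logs > manager.log` on a machine with cluster access would save a snapshot that can be attached to a bug report.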
### CSI driver
For the CSI driver, check the logs of `csi-attacher-0` and `csi-provisioner-0`, as well as those of the containers in `longhorn-csi-plugin-xxx`.
### Flexvolume driver
For the Flexvolume driver, check the kubelet logs first. The Flexvolume driver itself doesn't run inside a container; the kubelet process is responsible for calling the driver.
If kubelet is running natively on the node, you can use the following command to get the logs:
```
journalctl -u kubelet
```
If kubelet is running as a container (e.g. in RKE), use the following command instead:
```
docker logs kubelet
```
For more detailed logs from the Longhorn Flexvolume driver, run the following command on the node, or inside the kubelet container if kubelet is running as a container (e.g. in RKE):
```
touch /var/log/longhorn_driver.log
```
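Once the file exists, its content can be inspected after reproducing the problem. The helpers below are a small sketch, assuming the driver appends its detailed output to that file once it has been created; `enable_driver_log` and `show_driver_log` are hypothetical names, and `LOG_FILE` is a variable introduced here for illustration.

```shell
#!/bin/sh
# Path taken from the command above; override LOG_FILE to experiment elsewhere.
LOG_FILE="${LOG_FILE:-/var/log/longhorn_driver.log}"

# enable_driver_log (hypothetical helper): creates the log file.
enable_driver_log() {
    touch "$LOG_FILE"
}

# show_driver_log (hypothetical helper): prints the last N lines (default 50)
# of the driver log, or fails if the file has not been created yet.
show_driver_log() {
    if [ ! -f "$LOG_FILE" ]; then
        echo "driver log not enabled: $LOG_FILE does not exist" >&2
        return 1
    fi
    tail -n "${1:-50}" "$LOG_FILE"
}
```

A typical sequence would be `enable_driver_log`, then retrying the failing mount, then `show_driver_log 100`.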