reflect in doc that tunableop is default

This commit is contained in:
fxmarty 2024-05-17 08:47:02 +00:00
parent a040a59068
commit c8475594bc


@@ -23,7 +23,7 @@ TGI's docker image for AMD GPUs integrates [PyTorch's TunableOp](https://github.
Experimentally, on MI300X, we noticed a 6-8% latency improvement when using TunableOp on top of ROCm 6.1 and PyTorch 2.3.
-TunableOp is disabled by default as the warmup may take 1-2 minutes. To enable TunableOp, please pass `--env PYTORCH_TUNABLEOP_ENABLED="1"` when launcher TGI's docker container.
+TunableOp is enabled by default; the warmup may take 1-2 minutes. To disable TunableOp, pass `--env PYTORCH_TUNABLEOP_ENABLED="0"` when launching TGI's docker container.
## Flash attention implementation
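The opt-out described in the added line can be sketched as a docker invocation. This is a hypothetical example, not part of the commit: the image tag, model id, and device flags are placeholder assumptions for a ROCm-based TGI container; only the `PYTORCH_TUNABLEOP_ENABLED` variable comes from the doc itself.

```shell
# Hedged sketch: image tag and model id below are placeholders.
# --env PYTORCH_TUNABLEOP_ENABLED="0" opts out of the TunableOp warmup,
# which is now enabled by default per this commit.
docker run --rm -it \
    --device=/dev/kfd \
    --device=/dev/dri \
    --shm-size 1g \
    -p 8080:80 \
    --env PYTORCH_TUNABLEOP_ENABLED="0" \
    ghcr.io/huggingface/text-generation-inference:latest-rocm \
    --model-id some-org/some-model
```

With the variable set to `"0"`, the container skips the 1-2 minute TunableOp warmup at startup; omitting the flag keeps the new default behavior.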