Update docs

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Adrien Gallouët 2025-02-06 10:31:05 +00:00
parent 2b0d99c1cf
commit 8bc10d37ee
2 changed files with 4 additions and 0 deletions

@@ -52,6 +52,8 @@
- sections:
  - local: backends/trtllm
    title: TensorRT-LLM
  - local: backends/llamacpp
    title: Llamacpp
  title: Backends
- sections:
  - local: reference/launcher

@@ -11,3 +11,5 @@ TGI remains consistent across backends, allowing you to switch between them seam
* **[TGI TRTLLM backend](./backends/trtllm)**: This backend leverages NVIDIA's TensorRT library to accelerate LLM inference.
It utilizes specialized optimizations and custom kernels for enhanced performance.
However, it requires a model-specific compilation step for each GPU architecture.
* **[TGI Llamacpp backend](./backends/llamacpp)**: This backend facilitates the deployment of large language models
(LLMs) by integrating [llama.cpp][llama.cpp], an advanced inference engine optimized for both CPU and GPU computation.