Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-10-20 12:25:23 +00:00)
* backend(trtllm): bump TRTLLM to v0.17.0
* backend(trtllm): bump the previously forgotten Dockerfile as well
* backend(trtllm): use ARG instead of ENV
* backend(trtllm): use correct library reference decoder_attention_src
* backend(trtllm): link against decoder_attention_{0|1} (see the build.rs sketch after this list)
* backend(trtllm): build against gcc-14 with CUDA 12.8
* backend(trtllm): use the return value optimization flag as an error if available
* backend(trtllm): make sure we escalate all warnings to errors on the backend impl in debug mode
* backend(trtllm): link against CUDA 12.8
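
The decoder_attention link change above is the most build-sensitive item in this list: per the commit title, TRT-LLM v0.17 ships the decoder attention kernels as two separate libraries, so the backend has to link against both. The following is a minimal, hypothetical build.rs sketch of that wiring; the search path and the `tensorrt_llm` library name are assumptions for illustration, and only the `decoder_attention_{0|1}` names come from the commit titles.

```rust
// Minimal build.rs sketch. The search path and the `tensorrt_llm` library
// name are assumptions; only `decoder_attention_0` and `decoder_attention_1`
// come from the commit titles above.
fn main() {
    // Assumed install location of the TensorRT-LLM libraries in the build image.
    println!("cargo:rustc-link-search=native=/usr/local/tensorrt-llm/lib");

    // Per the commit above, TRT-LLM v0.17 splits the decoder attention kernels
    // into two libraries, so both are linked instead of a single decoder_attention.
    println!("cargo:rustc-link-lib=dylib=tensorrt_llm");
    println!("cargo:rustc-link-lib=dylib=decoder_attention_0");
    println!("cargo:rustc-link-lib=dylib=decoder_attention_1");
}
```

If the actual build drives CMake rather than emitting cargo link directives directly, the equivalent change would be adding both decoder_attention libraries to the backend target's link line.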
Files in this directory:

* install_tensorrt.sh
* setup_sccache.py