text-generation-inference/backends
Funtowicz Morgan 856709d5c3
[Backend] Bump TRTLLM to v.0.17.0 (#2991)
* backend(trtllm): bump TRTLLM to v.0.17.0

* backend(trtllm): forget to bump dockerfile

* backend(trtllm): use arg instead of env

* backend(trtllm): use correct library reference decoder_attention_src

* backend(trtllm): link against decoder_attention_{0|1}

* backend(trtllm): build against gcc-14 with cuda12.8

* backend(trtllm): use return value optimization flag as an error if available

* backend(trtllm): make sure we escalate all warnings as errors on the backend impl in debug mode

* backend(trtllm): link against CUDA 12.8
2025-02-06 16:45:03 +01:00
client Revert "feat: improve qwen2-vl startup" (#2924) 2025-01-17 12:09:05 -05:00
grpc-metadata Upgrading our rustc version. (#2908) 2025-01-15 17:04:03 +01:00
trtllm [Backend] Bump TRTLLM to v.0.17.0 (#2991) 2025-02-06 16:45:03 +01:00
v2 Add backend name to telemetry (#2962) 2025-01-28 16:53:16 +01:00
v3 Add backend name to telemetry (#2962) 2025-01-28 16:53:16 +01:00