text-generation-inference/docs/source
Funtowicz Morgan (ea7f4082c4)
TensorRT-LLM backend bump to latest version + misc fixes (#2791)
* misc(cmake): update dependencies

* feat(hardware): enable new hardware.hpp and unit tests

* test(ctest) enable address sanitizer

* feat(backend): initial rewrite of the backend for simplicity

* feat(backend): remove all the logs from hardware.hpp

* feat(backend): added some logging

* feat(backend): enable compiler warning when RVO is not applied

* feat(backend): missing return statement

* feat(backend): introduce backend_workspace_t to store precomputed information from the engine folder

* feat(backend): delete previous backend impl

* feat(backend): more impl

* feat(backend): use latest trtllm main version to have g++ >= 13 compatibility

* feat(backend): allow overriding which Python to use

* feat(backend): fix backend_exception_t -> backend_error_t naming

* feat(backend): impl missing generation_step_t as return value of pull_tokens

* feat(backend): make backend_workspace_t::engines_folder constexpr

* feat(backend): fix main.rs retrieving the tokenizer

* feat(backend): add include guards to prevent multiple header definitions

* test(backend): add more unit tests

* feat(backend): remove constexpr from par

* feat(backend): remove constexpr

* test(backend): more test coverage

* chore(trtllm): update dependency to 0.15.0

* effectively cancel the request on the executor

* feat(backend): fix moving the backend when pulling

* feat(backend): make sure we can easily cancel a request on the executor

* feat(backend): fix missing "0" field access

* misc(backend): fix reborrowing Pin<&mut T> as described in the doc https://doc.rust-lang.org/stable/std/pin/struct.Pin.html#method.as_mut (see the sketch after the commit log below)

* chore: Add doc and CI for TRTLLM (#2799)

* doc: Formatting

* misc(backend): indent

---------

Co-authored-by: Hugo Larcher <hugo.larcher@huggingface.co>
2024-12-13 15:50:59 +01:00
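
For context on the `Pin<&mut T>` re-borrowing fix listed above: the linked documentation describes `Pin::as_mut`, which re-borrows a pinned value instead of moving it, so the `Pin` can be used repeatedly. Below is a minimal sketch of that pattern; the `Backend` type and `pull_tokens`-style method are hypothetical stand-ins, not the actual code from this repository.

```rust
use std::pin::Pin;

// Hypothetical stand-in for the real backend type.
struct Backend;

impl Backend {
    // A method taking a pinned mutable receiver, in the style of pull_tokens.
    fn pull_tokens(self: Pin<&mut Self>) {
        // ... would pull generated tokens from the executor ...
    }
}

fn run(mut backend: Pin<&mut Backend>) {
    for _ in 0..3 {
        // `Pin::as_mut` re-borrows the pinned value, yielding a fresh
        // Pin<&mut Backend> for this iteration without consuming `backend`.
        backend.as_mut().pull_tokens();
    }
}

fn main() {
    let mut inner = Backend;
    // `Backend` is Unpin here, so `Pin::new` suffices for this sketch;
    // types pinned across FFI would be constructed differently.
    run(Pin::new(&mut inner));
}
```

Without `as_mut()`, the first call to a `Pin<&mut Self>` method would move the `Pin` out of `backend`, and it could not be reused on later iterations.
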
| Name | Last commit | Date |
|---|---|---|
| backends | TensorRT-LLM backend bump to latest version + misc fixes (#2791) | 2024-12-13 15:50:59 +01:00 |
| basic_tutorials | Prepare patch release. (#2829) | 2024-12-11 21:03:50 +01:00 |
| conceptual | Prepare patch release. (#2829) | 2024-12-11 21:03:50 +01:00 |
| reference | Prepare patch release. (#2829) | 2024-12-11 21:03:50 +01:00 |
| _toctree.yml | TensorRT-LLM backend bump to latest version + misc fixes (#2791) | 2024-12-13 15:50:59 +01:00 |
| architecture.md | TensorRT-LLM backend bump to latest version + misc fixes (#2791) | 2024-12-13 15:50:59 +01:00 |
| index.md | Removing ../ that broke the link (#2789) | 2024-12-02 05:48:55 +01:00 |
| installation_amd.md | Prepare patch release. (#2829) | 2024-12-11 21:03:50 +01:00 |
| installation_gaudi.md | MI300 compatibility (#1764) | 2024-05-17 15:30:47 +02:00 |
| installation_inferentia.md | MI300 compatibility (#1764) | 2024-05-17 15:30:47 +02:00 |
| installation_intel.md | Prepare patch release. (#2829) | 2024-12-11 21:03:50 +01:00 |
| installation_nvidia.md | Prepare patch release. (#2829) | 2024-12-11 21:03:50 +01:00 |
| installation.md | MI300 compatibility (#1764) | 2024-05-17 15:30:47 +02:00 |
| multi_backend_support.md | TensorRT-LLM backend bump to latest version + misc fixes (#2791) | 2024-12-13 15:50:59 +01:00 |
| quicktour.md | Prepare patch release. (#2829) | 2024-12-11 21:03:50 +01:00 |
| supported_models.md | Fix: docs typo (#2777) | 2024-11-26 14:28:58 +01:00 |
| usage_statistics.md | feat: allow any supported payload on /invocations (#2683) | 2024-10-23 11:26:01 +00:00 |