text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-10-10 23:45:23 +00:00


Morgan Funtowicz d0a34a95f2 adding missing ld_library_path for cuda stubs in Dockerfile		2024-07-22 15:16:39 +00:00
cmake	do the same name definition stuff for tensorrt_llm_executor_static	2024-07-22 11:32:54 +00:00
include	define a shared struct to hold the result of a decoding step	2024-07-18 21:33:04 +00:00
lib	forward tgi parameters rep/freq penalty	2024-07-18 20:56:58 +00:00
scripts	Overall build TRTLLM and deps through CMake build system	2024-07-02 17:16:27 +02:00
src	make sure executor_worker is provided	2024-07-19 11:57:10 +00:00
tests	First version loading engines and making it ready for inference	2024-07-03 21:12:24 +00:00
build.rs	align all the linker search dependency	2024-07-22 14:14:57 +00:00
Cargo.toml	align all the linker search dependency	2024-07-22 14:14:57 +00:00
CMakeLists.txt	do the same name definition stuff for tensorrt_llm_executor_static	2024-07-22 11:32:54 +00:00
Dockerfile	adding missing ld_library_path for cuda stubs in Dockerfile	2024-07-22 15:16:39 +00:00
README.md	adding missing ld_library_path for cuda stubs in Dockerfile	2024-07-22 15:16:39 +00:00

README.md

sequenceDiagram
    TensorRtLlmBackend -->> TensorRtLlmBackendImpl: New thread which instantiates actual backend impl
    TensorRtLlmBackendImpl -->> TensorRtLlmBackendImpl.Receiver: Awaits incoming requests sent through the queue
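
The two steps in the diagram (spawn a thread that instantiates the backend implementation, then have it wait on a queue for requests) can be sketched in Rust roughly as follows. This is a minimal illustration and not the code in `src`: the `Backend`, `BackendImpl`, and `GenerationRequest` types, the use of `std::sync::mpsc` as the queue, and the echo "decoding" step are all assumptions made for the example.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical request type: a prompt plus a channel for sending the reply back.
pub struct GenerationRequest {
    pub prompt: String,
    pub reply: Sender<String>,
}

/// Stand-in for TensorRtLlmBackendImpl: owns the receiving end of the queue.
struct BackendImpl {
    receiver: Receiver<GenerationRequest>,
}

impl BackendImpl {
    /// Blocks on the queue until every sender has been dropped.
    fn run(self) {
        while let Ok(request) = self.receiver.recv() {
            // Placeholder for a real decoding step (assumption: we just echo).
            let output = format!("echo: {}", request.prompt);
            // The caller may have stopped listening; ignore that case.
            let _ = request.reply.send(output);
        }
    }
}

/// Stand-in for TensorRtLlmBackend: holds the sending end and the worker handle.
pub struct Backend {
    queue: Sender<GenerationRequest>,
    worker: JoinHandle<()>,
}

impl Backend {
    /// Spawns a new thread, which instantiates the implementation itself.
    pub fn new() -> Self {
        let (queue, receiver) = mpsc::channel();
        let worker = thread::spawn(move || {
            let backend_impl = BackendImpl { receiver };
            backend_impl.run();
        });
        Backend { queue, worker }
    }

    /// Sends one request through the queue and waits for its reply.
    pub fn generate(&self, prompt: &str) -> String {
        let (reply, response) = mpsc::channel();
        self.queue
            .send(GenerationRequest { prompt: prompt.to_string(), reply })
            .expect("worker thread has stopped");
        response.recv().expect("worker dropped the reply channel")
    }

    /// Closes the queue so the worker loop ends, then joins the thread.
    pub fn shutdown(self) {
        drop(self.queue);
        self.worker.join().expect("worker thread panicked");
    }
}
```

A caller would create the backend with `Backend::new()`, call `generate` for each prompt, and finish with `shutdown()`. Constructing the implementation inside the spawned thread keeps any thread-bound state on the worker, and only the queue endpoints cross the thread boundary.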