text-generation-inference/backends/trtllm
2024-07-22 15:16:39 +00:00
cmake do the same name definition stuff for tensorrt_llm_executor_static 2024-07-22 11:32:54 +00:00
include define a shared struct to hold the result of a decoding step 2024-07-18 21:33:04 +00:00
lib forward tgi parameters rep/freq penalty 2024-07-18 20:56:58 +00:00
scripts Overall build TRTLLM and deps through CMake build system 2024-07-02 17:16:27 +02:00
src make sure executor_worker is provided 2024-07-19 11:57:10 +00:00
tests First version loading engines and making it ready for inference 2024-07-03 21:12:24 +00:00
build.rs align all the linker search dependency 2024-07-22 14:14:57 +00:00
Cargo.toml align all the linker search dependency 2024-07-22 14:14:57 +00:00
CMakeLists.txt do the same name definition stuff for tensorrt_llm_executor_static 2024-07-22 11:32:54 +00:00
Dockerfile adding missing ld_library_path for cuda stubs in Dockerfile 2024-07-22 15:16:39 +00:00
README.md adding missing ld_library_path for cuda stubs in Dockerfile 2024-07-22 15:16:39 +00:00

sequenceDiagram
    TensorRtLlmBackend -->> TensorRtLlmBackendImpl: Spawns a new thread which instantiates the actual backend impl
    TensorRtLlmBackendImpl -->> TensorRtLlmBackendImpl.Receiver: Awaits incoming requests sent through the queue
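
The two steps in the diagram (a front-end object spawning a thread that owns the actual implementation, which then waits on a queue) can be sketched with the Rust standard library alone. This is a minimal illustration, not the backend's actual code: `Backend`, `BackendImpl`, `GenerationRequest`, `submit`, and `shutdown` are hypothetical names, the "inference" is a placeholder echo, and the real backend may use a different channel type or runtime.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical request type: a prompt plus a channel to send the result back on.
struct GenerationRequest {
    prompt: String,
    reply: mpsc::Sender<String>,
}

// Hypothetical stand-in for the actual backend implementation.
struct BackendImpl;

impl BackendImpl {
    fn new() -> Self {
        BackendImpl
    }

    // Placeholder for a decoding step; it only echoes the prompt.
    fn handle(&self, prompt: &str) -> String {
        format!("echo: {}", prompt)
    }
}

// Hypothetical stand-in for the front-end handle that owns the queue.
struct Backend {
    queue: mpsc::Sender<GenerationRequest>,
    worker: thread::JoinHandle<()>,
}

impl Backend {
    fn new() -> Self {
        let (queue, receiver) = mpsc::channel::<GenerationRequest>();
        // New thread which instantiates the implementation and owns it.
        let worker = thread::spawn(move || {
            let backend = BackendImpl::new();
            // Blocks until a request arrives; the loop ends once every sender is dropped.
            for request in receiver {
                // The caller may have given up waiting, so a failed send is ignored.
                let _ = request.reply.send(backend.handle(&request.prompt));
            }
        });
        Backend { queue, worker }
    }

    // Enqueue a request and return the receiving end of its reply channel.
    fn submit(&self, prompt: &str) -> mpsc::Receiver<String> {
        let (reply, result) = mpsc::channel();
        self.queue
            .send(GenerationRequest { prompt: prompt.to_string(), reply })
            .expect("worker thread has stopped");
        result
    }

    // Close the queue so the worker loop exits, then wait for the thread.
    fn shutdown(self) {
        let Backend { queue, worker } = self;
        drop(queue);
        worker.join().expect("worker thread panicked");
    }
}
```

A caller would create the handle with `Backend::new()`, call `submit("hello")`, and block on `recv()` of the returned receiver to obtain the reply. Keeping the implementation on its own thread means only that thread touches the underlying engine, and the queue is the single point of contact with the rest of the program.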