Commit Graph

24 Commits

Author SHA1 Message Date
Morgan Funtowicz
1972669f49 remove fmt import 2024-07-12 19:24:09 +00:00
Morgan Funtowicz
50e9fc89c8 working setup of the ffi layer 2024-07-11 21:24:32 +00:00
Morgan Funtowicz
5aede911f8 include guard to build example in cmakelists 2024-07-11 21:24:01 +00:00
Morgan Funtowicz
ed14bd6818 use correct include for spdlog 2024-07-10 13:57:31 +00:00
Morgan Funtowicz
42748d5960 allow converting huggingface::tokenizers error to TensorRtLlmBackendError 2024-07-10 13:56:57 +00:00
Morgan Funtowicz
40fe2ec0ff add auth_token CLI argument to provide hf hub authentication token 2024-07-10 13:50:28 +00:00
Morgan Funtowicz
ca9da2dd49 create cmake install target to put everything relevant in installation folder 2024-07-10 13:48:59 +00:00
Morgan Funtowicz
4272b8cf51 correctly tell cmake to build dependent tensorrt-llm required libraries 2024-07-10 13:48:44 +00:00
Morgan Funtowicz
6c92ebe6a8 update trtllm to latest version a96cccafcf6365c128f004f779160951f8c0801c 2024-07-10 13:47:56 +00:00
Morgan Funtowicz
7b9f92a0aa use spdlog release 1.14.1 moving forward 2024-07-10 13:47:31 +00:00
Morgan Funtowicz
13eabfabcb implement the Stream method to send new tokens through a callback 2024-07-09 13:46:48 +00:00
Morgan Funtowicz
09292b06a0 updated logic and comment to detect cuda compute capabilities 2024-07-09 12:15:41 +00:00
Morgan Funtowicz
bec188ff73 bind to CUDA::nvml to retrieve compute capabilities at runtime 2024-07-08 22:32:41 +00:00
Morgan Funtowicz
68a0247a2c unconditionally call InitializeBackend on the FFI layer 2024-07-08 22:09:09 +00:00
Morgan Funtowicz
da926feaa1 make leader executor mode working 2024-07-08 22:08:49 +00:00
Morgan Funtowicz
f53ffa886d Specify which default log level to use depending on CMake build type 2024-07-08 22:06:49 +00:00
Morgan Funtowicz
4113d6d51b Move to latest TensorRT-LLM version 2024-07-08 22:06:30 +00:00
Morgan Funtowicz
29c7cb36e5 Remembering to check how we can detect support for chunked context 2024-07-03 21:38:17 +00:00
Morgan Funtowicz
f57f2a4521 First version loading engines and making it ready for inference 2024-07-03 21:12:24 +00:00
Morgan Funtowicz
f8a1463915 Enable end to end CMake build 2024-07-03 10:27:53 +02:00
Morgan Funtowicz
818162e0c2 Overall build TRTLLM and deps through CMake build system 2024-07-02 17:16:27 +02:00
Morgan Funtowicz
6dc98abe46 Remove unused parameters and force tokenizer name to be set 2024-07-01 16:11:59 +02:00
Morgan Funtowicz
47ac5c654d Working FFI call for TGI and TRTLLM backend 2024-07-01 15:53:23 +02:00
Morgan Funtowicz
dc402dc9ac Initial setup for CXX binding to TRTLLM 2024-06-30 23:37:20 +02:00