17 Commits

Author | SHA1 | Message | Date
Morgan Funtowicz | b643a436f3 | forward tgi parameters rep/freq penalty | 2024-07-18 20:56:58 +00:00
Morgan Funtowicz | bcb96feea6 | update invalid doc in cpp file | 2024-07-17 22:23:22 +00:00
Morgan Funtowicz | e983ee5bb8 | make sure the context is not dropped in the middle of the async decoding. | 2024-07-17 21:56:50 +00:00
Morgan Funtowicz | 9220340ff7 | compute the number of maximum new tokens for each request independently | 2024-07-17 13:55:29 +00:00
Morgan Funtowicz | a01cd030d4 | oops missing c++ backend definitions | 2024-07-16 20:11:59 +00:00
Morgan Funtowicz | 7784a21d48 | impl RwLock scenario for TensorRtLllmBackend | 2024-07-16 20:08:10 +00:00
Morgan Funtowicz | 31d9f4d5dc | expose shutdown function at ffi layer | 2024-07-15 07:36:01 +00:00
Morgan Funtowicz | 344f33f398 | end to end ffi flow working | 2024-07-12 19:25:40 +00:00
Morgan Funtowicz | 1972669f49 | remove fmt import | 2024-07-12 19:24:09 +00:00
Morgan Funtowicz | 50e9fc89c8 | working setup of the ffi layer | 2024-07-11 21:24:32 +00:00
Morgan Funtowicz | ed14bd6818 | use correct include for spdlog | 2024-07-10 13:57:31 +00:00
Morgan Funtowicz | 13eabfabcb | implement the Stream method to send new tokens through a callback | 2024-07-09 13:46:48 +00:00
Morgan Funtowicz | da926feaa1 | make leader executor mode working | 2024-07-08 22:08:49 +00:00
Morgan Funtowicz | f57f2a4521 | First version loading engines and making it ready for inference | 2024-07-03 21:12:24 +00:00
Morgan Funtowicz | f8a1463915 | Enable end to end CMake build | 2024-07-03 10:27:53 +02:00
Morgan Funtowicz | 47ac5c654d | Working FFI call for TGI and TRTLLM backend | 2024-07-01 15:53:23 +02:00
Morgan Funtowicz | dc402dc9ac | Initial setup for CXX binding to TRTLLM | 2024-06-30 23:37:20 +02:00