Morgan Funtowicz
|
a6ac2741a3
|
chore(trtllm): validate there are enough GPUs on the system for the desired model
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
848b8ad554
|
chore(trtllm): minor refactoring
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
60a08a283d
|
chore(trtllm): use GetParallelConfig
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
d5c8bdc53b
|
chore(trtllm): define a macro for SizeType cast
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
7217cafadb
|
chore(trtllm): create specific parallelconfig factory and logging init methods
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
421a17544e
|
feat(trtllm): add stop words handling
# Conflicts:
# backends/trtllm/lib/backend.cpp
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
c1a43a6c3e
|
chore(ffi): formatting
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
9ac26ed717
|
feat(post_processing): max_new_tokens is const evaluated now
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
cdac4b0058
|
chore(looper): cleanup a bit more
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
04c6f51258
|
feat(trtllm): rewrite health to not account for current state
|
2024-10-22 09:52:05 +02:00 |
|
Morgan Funtowicz
|
18b473b019
|
chore(router): add python dependency
|
2024-10-22 09:51:50 +02:00 |
|
Morgan Funtowicz
|
d73401ac73
|
chore(rebase): fix invalid references
|
2024-10-21 21:44:28 +02:00 |
|
Morgan Funtowicz
|
f5b9ee368a
|
Revert "chore(trtllm): remove unused method"
This reverts commit 31747163
|
2024-10-21 17:03:35 +02:00 |
|
Morgan Funtowicz
|
8d1c3c8ad4
|
feat(trtllm): do not tokenize twice
|
2024-10-21 15:06:54 +02:00 |
|
Morgan Funtowicz
|
1a3da05f34
|
misc(router): remove SchedulingError
|
2024-10-21 14:57:19 +02:00 |
|
Morgan Funtowicz
|
e6da212431
|
feat(trtllm): cache maxNumTokens to avoid calling JSON every time
|
2024-10-21 14:51:58 +02:00 |
|
Morgan Funtowicz
|
31747163e7
|
chore(trtllm): remove unused method
|
2024-10-21 14:10:23 +02:00 |
|
Morgan Funtowicz
|
fb00f985ae
|
chore(trtllm): post-rebase commit
|
2024-10-21 12:31:24 +02:00 |
|
Morgan Funtowicz
|
85c03e33a9
|
chore(trtllm): fmt
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
e3bce407be
|
chore(trtllm): disable tokenizer parallelism by default
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
62f33d7ecd
|
chore(trtllm): move dockerfile to right place
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
6687c06a21
|
feat(looper): minor optimizations to avoid growing the containers too much
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
027756c52d
|
chore(cmake): download timestamp should be before URL
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
629153b44b
|
feat(looper): check engine and executorWorker paths exist before creating the backend
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
f20ec28891
|
chore(cmake): use correct policy for download_timestamp
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
819c953771
|
misc(cuda): require 12.6
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
dd94ccc989
|
(fix): more fixes for Dockerfile
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
f9f10a6636
|
(misc): improve trtllm download script robustness
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
0c3ba932cc
|
(misc): disable logging in release mode
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
437c2aa142
|
(misc): update dependency in trtllm dockerfile
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
cb69c9a967
|
(misc): update dependency in trtllm dockerfile
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
c8a99af6c9
|
(fix): do not recreate the stateful hashmap at every iteration
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
eb13d8d1f3
|
(misc): increase verbosity of spdlog
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
ce0cd1fce8
|
(misc): build with trtllm 0.13.0
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
188e4dc64f
|
(misc): build for sm_{75,80,86,89,90} by default
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
544c9d9dba
|
(fix): HOPPER_SM_MAJOR is 9 not 8
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
213acc6e34
|
(misc) move to latest trtllm
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
507ff66692
|
(misc) rerun-if-changed all the cmake modules
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
b242f45c04
|
(misc) delete backend.rs
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
984ae9798f
|
(post) impl postprocessing
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
fa63db0d07
|
(scheduler) rework submit/pull logic
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
42ccf4e77c
|
(misc) no need to move for uint32_t items
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
b41875c139
|
(misc) simplify [make_]move_iterator by using c++20 type inference
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
0f50539b77
|
(Dockerfile.trtllm) delete for now
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
b1846fb4e6
|
(backend) refactor & cleanup
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
483f172938
|
(ffi) do not use reference capture in lambda as we are not capturing anything
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
3d0e90b631
|
(ffi) missing namespace for tle::Response
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
8e648ce425
|
(ffi) fix usage of wrong vector constructor making a capacity fill call
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
dddc9a44bd
|
(build) fetchcontent use archives instead of git
|
2024-10-21 10:00:27 +02:00 |
|
Morgan Funtowicz
|
089c5fe668
|
(server) forward auth_token to server::run
|
2024-10-21 10:00:27 +02:00 |
|