Morgan Funtowicz
|
9d659f1e23
|
feat(backend): add missing temperature parameter
|
2024-11-28 16:55:17 +01:00 |
|
Morgan Funtowicz
|
929a2fc718
|
feat(backend): add some tests to the backend for core allocation
|
2024-11-28 14:53:46 +01:00 |
|
Morgan Funtowicz
|
298367cdfd
|
feat(backend): fix when num_cores_per_instance is equal to zero with the size of the generated core allocation
|
2024-11-28 14:53:35 +01:00 |
|
Morgan Funtowicz
|
274cfce435
|
feat(backend): remove core overriding in the Rust backend
|
2024-11-28 11:40:52 +01:00 |
|
Morgan Funtowicz
|
862a519fdd
|
misc(doc): rust documentation
|
2024-11-22 15:35:55 +01:00 |
|
Morgan Funtowicz
|
2d9465d181
|
misc(backend): allow rebinding numa core affinity
|
2024-11-22 14:02:58 +01:00 |
|
Morgan Funtowicz
|
5a85661661
|
feat(backend): rely on multi-consumer queue to schedule workers
|
2024-11-22 13:32:56 +01:00 |
|
Morgan Funtowicz
|
84eead219a
|
feat(backend): correctly setup llama_context providing n_threads and n_ubatch
|
2024-11-21 21:43:50 +01:00 |
|
Morgan Funtowicz
|
50c376612c
|
feat(backend): bind thread and memory affinity for thread
|
2024-11-21 13:52:38 +01:00 |
|
Morgan Funtowicz
|
5335bf973b
|
feat(backend): multistream inference on CPU
|
2024-11-21 00:03:05 +01:00 |
|
Morgan Funtowicz
|
6f059c4b5d
|
feat(backend): wrap Arc tokenizer to avoid duplicating
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
57b215467b
|
feat(backend): simplify Rust callback
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
86d30aea43
|
feat(backend): simplify overall cpp structure
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
26d0266cec
|
feat(backend): handle all tokenization failures and send them back to the client
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
7eec0f704f
|
chore(backend): minor fixes mostly format
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
52208f5b78
|
misc(backend): decrease log verbosity in callback
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
1149186794
|
feat(backend): expose tokenizer to the GenerationContext to decode token
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
1473259f84
|
feat(backend): add early stopping criteria from TGI stream callback
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
958c72a44a
|
misc(ffi): remove unused ffi mapping
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
5b7a951389
|
feat(backend): refactor the callback to handle intermediate and end inference message
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
05ff551950
|
feat(backend): add number of generated tokens in the callback
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
188442f67d
|
misc(lint): make clippy happier
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
86a2ae6ba2
|
chore: unused variables
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
3e82f14f57
|
feat(backend): somewhat generates the final infer response
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
b50dcddbb8
|
feat(backend): avoid dropping the boxed stream at the end of the callback
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
612f2f939f
|
feat(backend): bind incoming request to the server
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
d4aee42fd8
|
feat(backend): add logit parameter in the callback fn
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
f39edc72ff
|
feat(backend): add mapping for ignore_eos_token stopping criteria
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
d52b4c4978
|
feat(backend): full rework of the backend internal to safer c++
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
611590440d
|
misc(offline): expose more parameters for generate
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
a316c53255
|
feat(llamacpp): expose number of threads for the backend when constructing the model
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
e4d803c94e
|
feat(backend): build and link through build.rs
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
355d8a55b4
|
feat(backend): wip Rust binding
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
fa89d1e613
|
misc(cmake): wut
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
52d57dca79
|
feat(llamacpp): initial end2end build
|
2024-11-14 08:42:01 +01:00 |
|
Morgan Funtowicz
|
aa1fcba59f
|
feat(llamacpp): initial commit
# Conflicts:
# Cargo.lock
|
2024-11-14 08:42:01 +01:00 |
|