text-generation-inference/backends/llamacpp
Latest commit 5b777877b1: "Make max_batch_total_tokens optional"
Author: Adrien Gallouët
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Date: 2025-02-05 11:40:20 +00:00
.cargo            Add llamacpp backend                                  2025-02-04 13:32:56 +00:00
src               Make max_batch_total_tokens optional                  2025-02-05 11:40:20 +00:00
build.rs          Rename bindings                                       2025-02-05 11:21:41 +00:00
Cargo.toml        Auto-detect n_threads when not provided               2025-02-04 13:32:59 +00:00
requirements.txt  backend(llama): add CUDA Dockerfile_llamacpp for now  2025-02-04 13:32:58 +00:00
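
The two most recent commit messages above ("Make max_batch_total_tokens optional" and "Auto-detect n_threads when not provided") describe the same pattern: a CLI flag whose type becomes an Option with a runtime fallback when the user omits it. The sketch below is a minimal illustration of that pattern, assuming a clap-based argument parser; the field names mirror the commit messages, but the default values and surrounding structure are hypothetical and not taken from the actual backend source.

```rust
// Hypothetical sketch of optional CLI flags with runtime fallbacks.
// Not the actual TGI llamacpp backend code; defaults are placeholders.
use clap::Parser;

#[derive(Parser, Debug)]
struct Args {
    /// Maximum number of tokens across all sequences in a batch.
    /// Optional: when omitted, a default is derived at runtime.
    #[arg(long)]
    max_batch_total_tokens: Option<usize>,

    /// Number of threads; auto-detected when not provided.
    #[arg(long)]
    n_threads: Option<usize>,
}

fn main() {
    let args = Args::parse();

    // Fall back to a fixed placeholder when the flag is absent
    // (the real derivation logic is not shown in this listing).
    let max_batch_total_tokens = args.max_batch_total_tokens.unwrap_or(16 * 1024);

    // Auto-detect the thread count from available parallelism.
    let n_threads = args.n_threads.unwrap_or_else(|| {
        std::thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(1)
    });

    println!("max_batch_total_tokens={max_batch_total_tokens} n_threads={n_threads}");
}
```

Making the flag an Option<usize> rather than a usize with a hard-coded clap default lets the program distinguish "user supplied a value" from "compute one from the runtime environment", which is what both commits appear to be doing.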