text-generation-inference/backends
Adrien Gallouët 30cd3cf510
Enable mmap, offload_kqv & flash_attention by default
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-03-05 11:08:17 +00:00
client Revert "feat: improve qwen2-vl startup " (#2924) 2025-01-17 12:09:05 -05:00
gaudi Add Gaudi Backend (#3055) 2025-02-28 12:14:58 +01:00
grpc-metadata Upgrading our rustc version. (#2908) 2025-01-15 17:04:03 +01:00
llamacpp Enable mmap, offload_kqv & flash_attention by default 2025-03-05 11:08:17 +00:00
neuron feat: add support for HF_HUB_USER_AGENT_ORIGIN to add user-agent Origin field in Hub requests. (#3061) 2025-03-04 16:43:50 +01:00
trtllm feat: add support for HF_HUB_USER_AGENT_ORIGIN to add user-agent Origin field in Hub requests. (#3061) 2025-03-04 16:43:50 +01:00
v2 Add backend name to telemetry (#2962) 2025-01-28 16:53:16 +01:00
v3 Add property-based testing for RadixAllocator (#3068) 2025-03-04 15:09:46 +01:00