text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-09-09 11:24:53 +00:00

drbh 93a7042d7e feat: support phi3.5 moe (#2479)	2024-09-30 11:15:09 +02:00

* feat: support phi3.5 moe model loading
* fix: prefer llama base model and improve rotary logic
* feat: return reasonable generation and add integration test
* fix: run lint and update docs
* fix: rerun lint for openapi docs
* fix: prefer do_sample false unless temp is set by user, and update chat tests
* fix: small typo adjustments
* fix: consolidate long rope paths
* fix: revert greedy by default and test changes
* Vendor configuration so that we don't have to `trust_remote_code`
* Use SparseMoELayer
* Add support for dense MoE
* Some type annotations
* Add the usual model tests
* Ruff.

Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

infer	chore: Add old V2 backend (#2551)	2024-09-24 08:38:17 +02:00
config.rs	feat: support phi3.5 moe (#2479)	2024-09-30 11:15:09 +02:00
kserve.rs	fix: simplify kserve endpoint and fix imports (#2119)	2024-06-25 19:30:10 -04:00
lib.rs	Cleanup Vertex + Chat (#2553)	2024-09-24 23:37:17 +02:00
logging.rs	Rebase TRT-llm (#2331)	2024-07-31 10:33:10 +02:00
main.rs.back	Rebase TRT-llm (#2331)	2024-07-31 10:33:10 +02:00
server.rs	Fix build with `--features google` (#2566)	2024-09-26 11:41:38 +02:00
usage_stats.rs	refactor usage stats (#2339)	2024-07-31 16:29:07 +02:00
validation.rs	Lots of improvements (Still 2 allocators) (#2449)	2024-08-29 16:29:01 +02:00
vertex.rs	Cleanup Vertex + Chat (#2553)	2024-09-24 23:37:17 +02:00