text-generation-inference/router/src
Nicolas Patry cb747b33da
Add deepseekv3 (#2968)
* Add fp8 support moe models

add deepseekv3

format code

update dockerfile

update doc

* Small modifications.

* Moe kernels 0.8.1

* Upgrade to 0.8.1

* Fixing moe import.

* Black.

* Apply suggestions from code review

Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>

* Fixing Mixtral + Nits.

* Put link to ref.

* Fix other call locations.

* Scoring func `softmax` is the only one that works.

---------

Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com>
2025-01-30 16:40:25 +01:00
Name            Last commit                                                 Date
infer           Add backend name to telemetry (#2962)                       2025-01-28 16:53:16 +01:00
config.rs       Add deepseekv3 (#2968)                                      2025-01-30 16:40:25 +01:00
kserve.rs       fix: include add_special_tokens in kserve request (#2859)   2024-12-19 16:55:17 -05:00
lib.rs          Set alias for max_completion_tokens in ChatRequest (#2932)  2025-01-23 14:18:47 +01:00
logging.rs      Rebase TRT-llm (#2331)                                      2024-07-31 10:33:10 +02:00
sagemaker.rs    feat: allow any supported payload on /invocations (#2683)   2024-10-23 11:26:01 +00:00
server.rs       Add backend name to telemetry (#2962)                       2025-01-28 16:53:16 +01:00
usage_stats.rs  Add backend name to telemetry (#2962)                       2025-01-28 16:53:16 +01:00
validation.rs   Upgrading our rustc version. (#2908)                        2025-01-15 17:04:03 +01:00
vertex.rs       Auto max prefill (#2797)                                    2024-12-06 05:52:00 +01:00