text-generation-inference/router/src
Mohit Sharma a35fbdb925
Bug Fix: Sliding Window Attention (#3112)
* (fix) sliding window attention

* (fix) flashinfer

* (typo) collection link

* Add window_size_left param ipex rocm

* Update window size rocm flash decoding

* fix: bump snapshots and improve exceed window test case

* feat: add tests for image types and remove alpha from png

* Upgrading `from_env` to get token from file when necessary + fix pali_gemma.

* fix: add pillow dependency and bump lock+requirements

* fix: bump org name in gemma3 test

* Fix qwen2.

---------

Co-authored-by: drbh <david.richard.holtz@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2025-03-18 10:37:33 +01:00
infer Fix tool call4 (#3094) 2025-03-12 09:28:47 +01:00
chat.rs Fix tool call4 (#3094) 2025-03-12 09:28:47 +01:00
config.rs Router: add gemma3-text model type (#3107) 2025-03-13 10:41:33 +01:00
kserve.rs fix: include add_special_tokens in kserve request (#2859) 2024-12-19 16:55:17 -05:00
lib.rs Add gemma3 model (#3099) 2025-03-12 09:25:51 +01:00
logging.rs Rebase TRT-llm (#2331) 2024-07-31 10:33:10 +02:00
sagemaker.rs feat: allow any supported payload on /invocations (#2683) 2024-10-23 11:26:01 +00:00
server.rs Bug Fix: Sliding Window Attention (#3112) 2025-03-18 10:37:33 +01:00
usage_stats.rs feat: Add the parsing of HF_HUB_USER_AGENT_ORIGIN environment variable for telemetry (#3027) 2025-02-19 21:09:12 +01:00
validation.rs Bug Fix: Sliding Window Attention (#3112) 2025-03-18 10:37:33 +01:00
vertex.rs Improve tool call message processing (#3036) 2025-02-21 10:30:29 +01:00