huggingface/text-generation-inference
Mirror of https://github.com/huggingface/text-generation-inference.git, synced 2025-04-25 03:52:08 +00:00
Path: text-generation-inference/router/src (at commit 2b1581edac)
Latest commit: d752317b5f by Wang, Yi
    Correct input_length since habana extend input_length to max_input_length (#103)
    Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
    2024-03-18 15:23:13 +01:00
health.rs        Rebased #617 (#868)                                                                      2023-08-28 11:43:47 +02:00
infer.rs         Revert "Prefer prefill instead of decode when max_waiting_tokens==0 (#18)" (#45) (#76)   2024-02-27 11:56:45 +01:00
lib.rs           Exllama v2 (#1211)                                                                       2023-11-25 22:38:38 +01:00
main.rs          Wait 2sec once shard is ready to improve stability (#92) (#94)                           2024-03-04 12:17:24 +01:00
queue.rs         Heap based router queue (#63) (#88)                                                      2024-02-29 10:56:26 +01:00
server.rs        Control prefill and decode batch size separately (#6)                                    2024-01-02 18:21:01 +01:00
validation.rs    Correct input_length since habana extend input_length to max_input_length (#103)         2024-03-18 15:23:13 +01:00