text-generation-inference/integration-tests/models/__snapshots__/test_grammar_llama
drbh 343aa7a197
fix: Handle concurrent grammar requests (#1610)
This PR fixes parallel grammar requests. Currently, grammar states are
not concatenated correctly when a new request is added to the batch,
which results in incorrect generation. This PR updates the `concatenate`
function to correctly include the previous states.

fixes: #1601
2024-02-29 11:17:42 +01:00
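The commit above describes carrying each request's grammar state through batch concatenation. Below is a minimal, hypothetical sketch of that idea, not the actual TGI implementation: the class name `GrammarBatch` and the field names `request_ids` / `fsm_grammar_states` are illustrative assumptions.

```python
# Hypothetical sketch of concatenating batches while preserving per-request
# grammar FSM states. Names are illustrative, not TGI's real code.
from dataclasses import dataclass
from typing import List


@dataclass
class GrammarBatch:
    request_ids: List[int]
    # One grammar FSM state per request in the batch.
    fsm_grammar_states: List[int]

    @classmethod
    def concatenate(cls, batches: List["GrammarBatch"]) -> "GrammarBatch":
        request_ids: List[int] = []
        fsm_grammar_states: List[int] = []
        for batch in batches:
            request_ids.extend(batch.request_ids)
            # The point of the fix: keep every batch's existing grammar
            # states instead of dropping or resetting them, so requests
            # already mid-way through guided generation continue decoding
            # from the correct state after a new request joins the batch.
            fsm_grammar_states.extend(batch.fsm_grammar_states)
        return cls(request_ids=request_ids,
                   fsm_grammar_states=fsm_grammar_states)
```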
File                                                Last commit                                                      Date
test_flash_llama_grammar_json.json                  fix(router): fix openapi and add jsonschema validation (#1578)  2024-02-21 11:05:32 +01:00
test_flash_llama_grammar_load.json                  fix: Handle concurrent grammar requests (#1610)                  2024-02-29 11:17:42 +01:00
test_flash_llama_grammar_regex.json                 Outlines guided generation (#1539)                               2024-02-15 10:28:10 +01:00
test_flash_llama_grammar_single_load_instance.json  Outlines guided generation (#1539)                               2024-02-15 10:28:10 +01:00
test_flash_llama_grammar.json                       Outlines guided generation (#1539)                               2024-02-15 10:28:10 +01:00