text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-10-15 01:45:24 +00:00

drbh e259625b8b fix: Handle concurrent grammar requests (#1610) This PR fixes parallel grammar requests. Currently, grammar states are not concatenated correctly when a new request is added to the batch, which results in incorrect generation. This PR updates the `concatenate` function to correctly include the previous states. fixes: #1601		2024-04-25 10:11:40 +03:00
test_flash_llama_grammar_json.json	fix(router): fix openapi and add jsonschema validation (#1578)	2024-04-24 18:07:44 +03:00
test_flash_llama_grammar_load.json	fix: Handle concurrent grammar requests (#1610)	2024-04-25 10:11:40 +03:00
test_flash_llama_grammar_regex.json	Outlines guided generation (#1539)	2024-04-24 14:57:37 +03:00
test_flash_llama_grammar_single_load_instance.json	Outlines guided generation (#1539)	2024-04-24 14:57:37 +03:00
test_flash_llama_grammar.json	Outlines guided generation (#1539)	2024-04-24 14:57:37 +03:00
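
The commit message for #1610 describes grammar states being lost when a new request joins a running batch. The sketch below illustrates that idea only. It is not the repository's code: `Batch`, `request_ids`, `fsm_grammar_states`, and this `concatenate` signature are hypothetical names chosen for the example, and grammar states are modeled as plain integers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Batch:
    """Hypothetical, simplified stand-in for a generation batch."""

    request_ids: List[int]
    # One grammar FSM state per request, aligned by index with request_ids.
    fsm_grammar_states: List[int] = field(default_factory=list)


def concatenate(batches: List[Batch]) -> Batch:
    """Merge batches while keeping each request's existing grammar state."""
    request_ids: List[int] = []
    fsm_grammar_states: List[int] = []
    for batch in batches:
        # Each request must have exactly one grammar state.
        assert len(batch.request_ids) == len(batch.fsm_grammar_states)
        request_ids.extend(batch.request_ids)
        # Carry the current states over instead of resetting them, so
        # in-flight requests resume from where their grammar left off.
        fsm_grammar_states.extend(batch.fsm_grammar_states)
    return Batch(request_ids=request_ids, fsm_grammar_states=fsm_grammar_states)


# Two in-flight requests that have already advanced through their grammar,
# plus one new request that starts in the initial state (assumed to be 0).
running = Batch(request_ids=[1, 2], fsm_grammar_states=[7, 3])
incoming = Batch(request_ids=[3], fsm_grammar_states=[0])
merged = concatenate([running, incoming])
```

In this toy model, resetting or dropping the states of `running` during the merge would make requests 1 and 2 continue from the wrong grammar position, which matches the kind of incorrect generation the commit message mentions.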