text-generation-inference/integration-tests/models/__snapshots__/test_grammar_llama
drbh e259625b8b fix: Handle concurrent grammar requests (#1610)
This PR fixes parallel grammar requests. Currently, grammar states are
not concatenated correctly when a new request is added to the batch,
which results in incorrect generation. This PR updates the `concatenate`
function to correctly include the previous states.

fixes: #1601
2024-04-25 10:11:40 +03:00
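To illustrate the kind of bug described above, here is a minimal, hypothetical Python sketch. The names `Batch`, `fsm_grammar_states`, and `concatenate` are illustrative assumptions, not TGI's actual implementation; the point is only that merging batches must carry over each request's grammar FSM state instead of resetting it.

```python
# Hypothetical sketch (not TGI's actual classes): batch concatenation must
# preserve each request's grammar FSM state, not re-initialise it.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Batch:
    request_ids: List[int]
    # One grammar FSM state per request; field name is illustrative.
    fsm_grammar_states: List[int] = field(default_factory=list)

    @classmethod
    def concatenate(cls, batches: List["Batch"]) -> "Batch":
        merged = cls(request_ids=[], fsm_grammar_states=[])
        for batch in batches:
            merged.request_ids.extend(batch.request_ids)
            # The buggy variant would reset these to the initial state,
            # restarting every grammar mid-generation; the fix is to keep
            # the previous per-request states when batches are merged.
            merged.fsm_grammar_states.extend(batch.fsm_grammar_states)
        return merged


# Usage: two in-flight batches whose grammars have already advanced.
a = Batch(request_ids=[0], fsm_grammar_states=[3])
b = Batch(request_ids=[1], fsm_grammar_states=[7])
merged = Batch.concatenate([a, b])
assert merged.fsm_grammar_states == [3, 7]  # states preserved, not reset
```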
test_flash_llama_grammar_json.json fix(router): fix openapi and add jsonschema validation (#1578) 2024-04-24 18:07:44 +03:00
test_flash_llama_grammar_load.json fix: Handle concurrent grammar requests (#1610) 2024-04-25 10:11:40 +03:00
test_flash_llama_grammar_regex.json Outlines guided generation (#1539) 2024-04-24 14:57:37 +03:00
test_flash_llama_grammar_single_load_instance.json Outlines guided generation (#1539) 2024-04-24 14:57:37 +03:00
test_flash_llama_grammar.json Outlines guided generation (#1539) 2024-04-24 14:57:37 +03:00