text-generation-inference/integration-tests/models/__snapshots__/test_grammar_llama
drbh 343aa7a197
fix: Handle concurrent grammar requests (#1610)
This PR fixes parallel grammar requests. Currently, grammar states are
not concatenated correctly when a new request is added to the batch,
which results in incorrect generation. This PR updates the `concatenate`
function to correctly include the previous states.

fixes: #1601
2024-02-29 11:17:42 +01:00
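The commit above describes carrying each request's grammar state through batch concatenation. Below is a minimal, hypothetical sketch of that idea, not the actual TGI implementation: the class name `GrammarBatch` and the field names `request_ids` / `fsm_grammar_states` are illustrative assumptions.

```python
# Hypothetical sketch of concatenating batches while preserving per-request
# grammar FSM states. Names are illustrative, not TGI's real code.
from dataclasses import dataclass
from typing import List


@dataclass
class GrammarBatch:
    request_ids: List[int]
    # One grammar FSM state per request in the batch.
    fsm_grammar_states: List[int]

    @classmethod
    def concatenate(cls, batches: List["GrammarBatch"]) -> "GrammarBatch":
        request_ids: List[int] = []
        fsm_grammar_states: List[int] = []
        for batch in batches:
            request_ids.extend(batch.request_ids)
            # The point of the fix: keep every batch's existing grammar
            # states instead of dropping or resetting them, so requests
            # already mid-way through guided generation continue decoding
            # from the correct state after a new request joins the batch.
            fsm_grammar_states.extend(batch.fsm_grammar_states)
        return cls(request_ids=request_ids,
                   fsm_grammar_states=fsm_grammar_states)
```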
File                                                Last commit                                                      Date
test_flash_llama_grammar_json.json                  fix(router): fix openapi and add jsonschema validation (#1578)  2024-02-21 11:05:32 +01:00
test_flash_llama_grammar_load.json                  fix: Handle concurrent grammar requests (#1610)                  2024-02-29 11:17:42 +01:00
test_flash_llama_grammar_regex.json                 Outlines guided generation (#1539)                               2024-02-15 10:28:10 +01:00
test_flash_llama_grammar_single_load_instance.json  Outlines guided generation (#1539)                               2024-02-15 10:28:10 +01:00
test_flash_llama_grammar.json                       Outlines guided generation (#1539)                               2024-02-15 10:28:10 +01:00