Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-10-13 08:55:24 +00:00)
batch.prefill_cache_indices is reset in generate_token instead of forward, so that position_id could be updated correctly

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
custom_modeling
__init__.py
flash_causal_lm.py
flash_vlm_causal_lm.py
globals.py
mllama_causal_lm.py
model.py
seq2seq_lm.py
types.py