Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-04-19 13:52:07 +00:00)
* launcher: correctly get the head dimension for VLMs

  For most (?) VLMs, the head dimension is in the `text_config` configuration section. However, since we only queried the top-level `head_dim` (which typically doesn't exist in VLMs), we would never use flashinfer. This change adds a method that gets the head dimension from the top-level `Config` struct or `text_config` when that fails.

* fix: bump org name in gemma3 test

---------

Co-authored-by: drbh <david.richard.holtz@gmail.com>
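The commit message above describes a simple fallback: read `head_dim` from the top-level config and, only if it is absent, look inside `text_config`. Below is a minimal Rust sketch of that lookup; the struct and field layouts are simplified, hypothetical stand-ins rather than the launcher's actual `Config` types.

```rust
// Hedged sketch: these types are simplified stand-ins for the launcher's
// real config structs, not the actual implementation.
struct TextConfig {
    head_dim: Option<usize>,
}

struct Config {
    head_dim: Option<usize>,
    text_config: Option<TextConfig>,
}

impl Config {
    // Prefer the top-level `head_dim`; fall back to `text_config.head_dim`,
    // which is where most VLMs place it.
    fn get_head_dim(&self) -> Option<usize> {
        self.head_dim
            .or_else(|| self.text_config.as_ref().and_then(|t| t.head_dim))
    }
}

fn main() {
    // A VLM-style config: no top-level head_dim, but one inside text_config.
    let vlm = Config {
        head_dim: None,
        text_config: Some(TextConfig { head_dim: Some(256) }),
    };
    assert_eq!(vlm.get_head_dim(), Some(256));
}
```

With a VLM-style config that only defines `head_dim` under `text_config`, this fallback is what lets the launcher still resolve a head dimension and therefore use flashinfer.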
fixtures/neuron
images
models
neuron
conftest.py
pyproject.toml
pytest.ini
requirements.txt
uv.lock