mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-04-21 23:12:07 +00:00
* Refactor dead code. * First working step. * Remove a lot of duplicated code. * More dead code. * More cleanup. * Fix Santacoder test. * Fixing the simple tests. * Fixing sharding. * Fixes for VLM. * Fixing santacoder (num_kv_heads hardcoded). * Removing more dead code. * Fixing `config.n_head`. * Stopping earlier because of `<end_of_utterance>` in idefics2. * Addresses comments. * Removing the dead code. * Fuse back mistral into FlashCausalLM. * Finish removal. * Fixing docs + causal_lm `batch_class`. * Fixing docs + causal.lm. * Add default to Gemma Causality. * Default value for gemma/gemma2. * Wrong default. |
||
---|---|---|
.. | ||
basic_tutorials | ||
conceptual | ||
_toctree.yml | ||
architecture.md | ||
index.md | ||
installation_amd.md | ||
installation_gaudi.md | ||
installation_inferentia.md | ||
installation_nvidia.md | ||
installation.md | ||
messages_api.md | ||
quicktour.md | ||
supported_models.md |