text-generation-inference/server/text_generation_server/models
Dmitry Rogozhkin 58848cb471
feat: enable pytorch xpu support for non-attention models (#2561)
The XPU backend is available natively (without IPEX) in PyTorch starting
from version 2.4. This commit extends TGI to cover the case where a user
has XPU support through PyTorch 2.4 but does not have IPEX installed.
Models that don't require attention can work; for models that do require
attention, more work is needed to provide an attention implementation.
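The split described above (native XPU covers non-attention models, while attention models still need IPEX) can be sketched as a small device-selection helper. This is an illustrative assumption, not TGI's actual code: `select_device`, its arguments, and `has_module` are hypothetical names; `torch.xpu.is_available()` is the real probe PyTorch exposes from 2.4 onward.

```python
# Hypothetical sketch of XPU device selection when IPEX may be absent.
# select_device() mirrors the commit's logic: native XPU (PyTorch >= 2.4)
# suffices for non-attention models; attention models still need IPEX.
import importlib.util


def has_module(name: str) -> bool:
    """Check whether a module is importable without actually importing it."""
    return importlib.util.find_spec(name) is not None


def select_device(xpu_available: bool, ipex_installed: bool,
                  needs_attention: bool) -> str:
    """Pick a device string for a model (illustrative, not TGI's code)."""
    if xpu_available:
        if needs_attention and not ipex_installed:
            # No attention implementation on native XPU without IPEX yet.
            return "cpu"
        return "xpu"
    return "cpu"


# A real environment probe would combine the pieces, e.g.:
#   import torch
#   xpu = hasattr(torch, "xpu") and torch.xpu.is_available()
#   ipex = has_module("intel_extension_for_pytorch")
print(select_device(True, False, False))  # xpu: non-attention model, no IPEX
print(select_device(True, False, True))   # cpu: attention model needs IPEX
```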

Tested with the following models:
* teknium/OpenHermes-2.5-Mistral-7B
* bigscience/bloom-560m
* google/gemma-7b
* google/flan-t5-xxl

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-10-14 18:28:49 +02:00
custom_modeling enable mllama in intel platform (#2610) 2024-10-07 21:15:09 +02:00
__init__.py Add basic FP8 KV cache support (#2603) 2024-10-04 17:51:48 +02:00
bloom.py Refactor dead code - Removing all flash_xxx.py files. (#2166) 2024-07-05 10:29:56 +02:00
causal_lm.py feat: enable pytorch xpu support for non-attention models (#2561) 2024-10-14 18:28:49 +02:00
flash_causal_lm.py Add basic FP8 KV cache support (#2603) 2024-10-04 17:51:48 +02:00
galactica.py feat: add ruff and resolve issue (#2262) 2024-07-26 10:29:09 -04:00
globals.py Lots of improvements (Still 2 allocators) (#2449) 2024-08-29 16:29:01 +02:00
idefics_causal_lm.py Mllama flash version (#2585) 2024-10-02 11:22:13 +02:00
mamba.py Fixing exl2 and other quanize tests again. (#2419) 2024-08-15 11:12:51 +02:00
mllama_causal_lm.py Mllama flash version (#2585) 2024-10-02 11:22:13 +02:00
model.py feat: add ruff and resolve issue (#2262) 2024-07-26 10:29:09 -04:00
pali_gemma.py feat: add ruff and resolve issue (#2262) 2024-07-26 10:29:09 -04:00
seq2seq_lm.py feat: enable pytorch xpu support for non-attention models (#2561) 2024-10-14 18:28:49 +02:00
types.py feat: add ruff and resolve issue (#2262) 2024-07-26 10:29:09 -04:00
vlm_causal_lm.py Prefix test - Different kind of load test to trigger prefix test bugs. (#2490) 2024-09-11 18:10:40 +02:00