| File | Latest commit | Date |
| --- | --- | --- |
| custom_modeling | Support flashinfer for Gemma3 prefill (#3167) | 2025-04-17 18:07:41 +02:00 |
| __init__.py | transformers flash llm/vlm enabling in ipex (#3152) | 2025-04-15 11:08:01 +02:00 |
| bloom.py | Refactor dead code - Removing all flash_xxx.py files. (#2166) | 2024-07-05 10:29:56 +02:00 |
| causal_lm.py | Sync (most) server dependencies with Nix (#2782) | 2024-12-03 04:04:06 +01:00 |
| flash_causal_lm.py | add encoder cache free | 2025-04-18 16:00:35 +00:00 |
| galactica.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| globals.py | Fixing the oom maybe with 2.5.1 change. (#2958) | 2025-01-28 10:30:28 +01:00 |
| idefics_causal_lm.py | feat: prefill chunking (#2600) | 2024-10-16 12:49:33 +02:00 |
| mamba.py | Choosing input/total tokens automatically based on available VRAM? (#2673) | 2024-10-28 04:59:49 +01:00 |
| metadata_kernels.py | feat: add payload limit (#2726) | 2024-11-21 18:20:15 +00:00 |
| mllama_causal_lm.py | Update transformers to 4.51 (#3148) | 2025-04-07 12:55:43 +02:00 |
| model.py | Bug Fix: Sliding Window Attention (#3112) | 2025-03-18 10:37:33 +01:00 |
| pali_gemma.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| seq2seq_lm.py | feat: prefill chunking (#2600) | 2024-10-16 12:49:33 +02:00 |
| transformers_flash_causal_lm.py | transformers flash llm/vlm enabling in ipex (#3152) | 2025-04-15 11:08:01 +02:00 |
| transformers_flash_vlm.py | add encoder cache free | 2025-04-18 16:00:35 +00:00 |
| types.py | feat: prefill chunking (#2600) | 2024-10-16 12:49:33 +02:00 |
| vlm_causal_lm.py | add encoder cache free | 2025-04-18 16:00:35 +00:00 |