custom_kernels
chore: add pre-commit ( #1569 )
2024-04-24 15:32:02 +03:00
exllama_kernels
chore: add pre-commit ( #1569 )
2024-04-24 15:32:02 +03:00
exllamav2_kernels
chore: add pre-commit ( #1569 )
2024-04-24 15:32:02 +03:00
tests
feat(server): add frequency penalty ( #1541 )
2024-04-24 08:43:50 +00:00
text_generation_server
fix: LlamaTokenizerFast to AutoTokenizer at flash_mistral.py ( #1637 )
2024-04-25 12:35:44 +03:00
.gitignore
Impl simple mamba model ( #1480 )
2024-04-23 11:45:11 +03:00
Makefile
Update peft + transformers + accelerate + bnb + safetensors ( #1646 )
2024-04-25 11:49:44 +03:00
Makefile-awq
chore: add pre-commit ( #1569 )
2024-04-24 15:32:02 +03:00
Makefile-eetq
feat: eetq gemv optimization when batch_size <= 4 ( #1502 )
2024-04-23 09:20:14 +03:00
Makefile-flash-att
chore: add pre-commit ( #1569 )
2024-04-24 15:32:02 +03:00
Makefile-flash-att-v2
make install-flash-attn-v2-cuda should work like make install-flash-attn-v2 used to work. ( #1294 )
2023-11-28 16:28:40 +01:00
Makefile-selective-scan
chore: add pre-commit ( #1569 )
2024-04-24 15:32:02 +03:00
Makefile-vllm
Speculative ( #1308 )
2024-04-18 12:39:39 +00:00
poetry.lock
Update peft + transformers + accelerate + bnb + safetensors ( #1646 )
2024-04-25 11:49:44 +03:00
pyproject.toml
fix: improve tool type, bump pydantic and outlines ( #1650 )
2024-04-25 12:34:55 +03:00
README.md
chore: add pre-commit ( #1569 )
2024-04-24 15:32:02 +03:00
requirements_common.txt
Add RoCm support ( #1243 )
2023-11-27 14:08:12 +01:00
requirements_cuda.txt
v1.4.1 ( #1568 )
2024-04-24 15:42:59 +03:00
requirements_rocm.txt
v1.4.1 ( #1568 )
2024-04-24 15:42:59 +03:00
requirements.txt
Update peft + transformers + accelerate + bnb + safetensors ( #1646 )
2024-04-25 11:49:44 +03:00