mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-04-19 22:02:06 +00:00
* Using an enum for flash backends (paged/flashdecoding/flashinfer)
* Early exit on server too.
* Clippy.
* Fix clippy and fmt.
22 lines
468 B
Plaintext
.idea
target
router/tokenizer.json
*__pycache__*

backends/v3/src/client/pb
backends/client/src/v2/pb
backends/client/src/v3/pb

# ROCm auto-generated files
*.hip
server/exllamav2_kernels/exllamav2_kernels/hip/
server/exllama_kernels/exllama_kernels/hip/
server/exllama_kernels/exllama_kernels/hip_func/
*_hip.cuh
server/exllama_kernels/exllama_kernels/hip_buffers.cuh
server/exllama_kernels/exllama_kernels/exllama_ext_hip.cpp

data/
load_tests/*.json
server/fbgemm