text-generation-inference/.github/workflows
Daniël de Kok 571ac9b507
Use kernels from the kernel hub (#2988)
* Use Hub kernels for Marlin and cutlass quantization kernels

* Use hub kernels for MoE/GPTQ-Marlin MoE

* Use attention kernels from the Hub

* Cache the kernels in the Docker image

* Update moe kernels

* Support loading local kernels for development

* Support latest moe kernels

* Update to moe 0.1.1

* CI: download locked kernels for server tests

* Fixup some imports

* CI: activate venv

* Fix unused imports

* Nix: add attention/moe/quantization kernels

* Update hf-kernels to 0.1.5

* Update kernels

* Update tgi-nix flake for hf-kernels

* Fix EOF

* Take `load_kernel` out of a frequently-called function

* Hoist another case of kernel loading out of a somewhat hot function

* marlin-kernels -> quantization

* attention -> paged-attention

* EOF fix

* Update hf-kernels, fixup Docker

* ipex fix

* Remove outdated TODO
2025-02-10 19:19:25 +01:00
autodocs.yaml Rebase TRT-llm (#2331) 2024-07-31 10:33:10 +02:00
build_documentation.yaml New runner. Manual squash. (#2110) 2024-06-24 18:08:34 +02:00
build_pr_documentation.yaml doc: Add metrics documentation and add a 'Reference' section (#2230) 2024-08-16 19:43:30 +02:00
build.yaml Using the "lockfile". (#2992) 2025-02-06 12:28:24 +01:00
ci_build.yaml Give TensorRT-LLM a proper CI/CD 😍 (#2886) 2025-01-21 10:19:16 +01:00
client-tests.yaml Removing IPEX_AVAIL. (#2115) 2024-06-25 13:20:57 +02:00
integration_tests.yaml Removing IPEX_AVAIL. (#2115) 2024-06-25 13:20:57 +02:00
load_test.yaml feat: Add automatic nightly benchmarks (#2591) 2024-11-21 17:11:42 +00:00
nix_cache.yaml nix: build and cache impure devshells (#2765) 2024-11-20 20:56:11 +01:00
nix_tests.yaml Stream options. (#2533) 2024-09-19 20:50:37 +02:00
stale.yaml New runner. Manual squash. (#2110) 2024-06-24 18:08:34 +02:00
tests.yaml Use kernels from the kernel hub (#2988) 2025-02-10 19:19:25 +01:00
trufflehog.yaml impureWithCuda: fix gcc version (#2990) 2025-02-04 17:01:59 +01:00
upload_pr_documentation.yaml New runner. Manual squash. (#2110) 2024-06-24 18:08:34 +02:00