text-generation-inference/docs
fxmarty f1976851d9 ROCm: make CK FA2 default instead of Triton (#1924)
As per title.

Triton autotune overhead is prohibitive, as it needs to be done for each
different prompt length.
2024-07-17 05:36:58 +00:00
source ROCm: make CK FA2 default instead of Triton (#1924) 2024-07-17 05:36:58 +00:00
index.html chore: add pre-commit (#1569) 2024-04-24 15:32:02 +03:00
openapi.json v2.0.1 2024-06-03 15:39:47 +03:00