text-generation-inference/docs
fxmarty 293b8125e7
ROCm: make CK FA2 default instead of Triton (#1924)
As per title.

Triton autotune overhead is prohibitive, as it needs to be done for each
different prompt length.
2024-05-20 08:44:48 +08:00
source ROCm: make CK FA2 default instead of Triton (#1924) 2024-05-20 08:44:48 +08:00
index.html chore: add pre-commit (#1569) 2024-02-16 11:58:58 +01:00
openapi.json v2.0.1 2024-04-18 17:20:36 +02:00