Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-04-23 16:02:10 +00:00)
* (feat) convert tscales to tensorwise
* (fix) fp8 scaling for cuda
* (kernel) add marlin-kernels
* add moe-kernels
* fix moe kernel commit
* fix scaling
* nm changes

| Name |
|---|
| `__init__.py` |
| `loader.py` |
| `w8a8_int.py` |
| `w8an_fp.py` |
| `wna16_int_24.py` |
| `wna16_int.py` |