text-generation-inference/server/text_generation_server/layers/moe
Daniël de Kok 775e5f4c64 MoE Marlin: support desc_act for groupsize != -1 (#2590)
This change uses the updated Marlin MoE kernel from vLLM to support
MoE with activation sorting and groups.
2024-10-25 09:12:03 +00:00
__init__.py MoE Marlin: support desc_act for groupsize != -1 (#2590) 2024-10-25 09:12:03 +00:00
fused_moe_rocm.py Update ROCM libs and improvements (#2579) 2024-10-25 09:01:04 +00:00
gptq_marlin.py MoE Marlin: support desc_act for groupsize != -1 (#2590) 2024-10-25 09:12:03 +00:00
unquantized.py Update ROCM libs and improvements (#2579) 2024-10-25 09:01:04 +00:00