Commit Graph

2 Commits

Author SHA1 Message Date
Daniël de Kok
5726a9ca81 Move to moe-kernels package and switch to common MoE layer
This change introduces the new `moe-kernels` package:

- Add `moe-kernels` as a dependency.
- Introduce a `SparseMoELayer` module that can be used by MoE
  models.
- Port over Mixtral and Deepseek.
2024-09-16 10:57:44 +00:00
Daniël de Kok
7774655297 Add tests for Mixtral (#2520)
Disable by default because CI runners do not have enough GPUs.
2024-09-16 12:39:18 +02:00