Commit Graph

2 Commits

Author SHA1 Message Date
Daniël de Kok
d4f995e718 Add DenseMoELayer and wire it up in Mixtral/Deepseek V2 (#2537)
This replaces the custom layers in both models.
2024-10-25 09:01:04 +00:00
Daniël de Kok
29a93b78ba Move to moe-kernels package and switch to common MoE layer (#2511)
* Move to moe-kernels package and switch to common MoE layer

This change introduces the new `moe-kernels` package:

- Add `moe-kernels` as a dependency.
- Introduce a `SparseMoELayer` module that can be used by MoE
  models.
- Port over Mixtral and Deepseek.

* Make `cargo check` pass

* Update runner
2024-09-25 06:18:05 +00:00