Commit Graph

4 Commits

Author        SHA1        Message                      Date
Mohit Sharma  a7353c35e8  fix bt                       2025-04-11 15:10:19 +00:00
Mohit Sharma  d2f8caff2b  support cuda graphs          2025-04-11 15:05:28 +00:00
Mohit Sharma  3f343cdb6f  reverse flash causal change  2025-04-10 15:03:44 +00:00
Mohit Sharma  d9bb9bebc9  Add llama4 (#3145)           2025-04-06 10:20:22 +02:00

* initial changes

* Add support for other vlm

* cleanup comment

* Improve attn_implementation

* Add comments for support of models

* add model

* add model

* fixes and improvements

* update docker

* Add cache position

* Add tests

* remove redundant changes

* remove tr version

* Upgrade doc + fix linting.

* Fixing the CI.

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>