* launcher: ensure correct detection of Gemma 3 head size
* Support flashinfer for Gemma3 prefill
Gemma3 uses bidirectional attention for image tokens. Flashinfer
supports custom masks, so pass the Gemma3 mask to flashinfer. This
avoids the slower SDPA implementation for prefills with images.
* Update Gemma3 test outputs
* Fix unused import
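
For reviewers, a rough sketch of the kind of mask the flashinfer prefill change relies on: causal attention everywhere, plus full (bidirectional) attention inside each run of image tokens. This is an illustration only, not the code in this change. The helper name `build_prefill_mask` and the `is_image` flag list are hypothetical, and the sketch assumes each contiguous run of image tokens is one image.

```python
from typing import List


def build_prefill_mask(is_image: List[bool]) -> List[List[bool]]:
    """Return an n x n mask; mask[q][k] is True if query q may attend to key k.

    Hypothetical helper for illustration. Assumes every contiguous run of
    True values in `is_image` belongs to a single image.
    """
    n = len(is_image)

    # Give each contiguous run of image tokens its own span id; text gets -1.
    span_ids: List[int] = []
    current = -1
    prev = False
    for flag in is_image:
        if flag and not prev:
            current += 1
        span_ids.append(current if flag else -1)
        prev = flag

    # Start from a plain causal mask: a query sees itself and earlier keys.
    mask = [[k <= q for k in range(n)] for q in range(n)]

    # Image tokens additionally see every token of the same image span,
    # including later ones, which makes attention bidirectional there.
    for q in range(n):
        for k in range(n):
            if span_ids[q] != -1 and span_ids[q] == span_ids[k]:
                mask[q][k] = True
    return mask


if __name__ == "__main__":
    # text, image, image, text
    for row in build_prefill_mask([False, True, True, False]):
        print("".join("1" if allowed else "0" for allowed in row))
```

A mask like this would then be handed to flashinfer in whatever layout its custom-mask interface expects. If two images can be adjacent with no separator token, the span detection above would merge them, so real code would need explicit image boundaries.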