Commit Graph

6 Commits

Author SHA1 Message Date
kaixuanliu
535ce23827
Adjust the round_up_seq logic in Gaudi backend (#3224)
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-05-12 09:58:43 +02:00
kaixuanliu
c94f415af4
Change HPU warmup logic: seq length should be with exponential growth (#3217)
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
2025-05-10 15:41:18 +02:00
regisss
f208ba6afc
Fix HF_HUB_OFFLINE=1 for Gaudi backend (#3193)
* Fix `HF_HUB_OFFLINE=1` for Gaudi backend

* Fix HF cache default value in server.rs

* Format
2025-05-06 10:47:53 +02:00
Yuan Wu
3d059f91ab
Gaudi: Use exponential growth to replace BATCH_BUCKET_SIZE (#3131)
* Gaudi: Use exponential growth to replace BATCH_BUCKET_SIZE

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Remove debug modifications

Signed-off-by: yuanwu <yuan.wu@intel.com>

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
2025-04-03 10:34:53 +02:00
Baptiste Colle
8c2c348f3c
Gaudi: Sync TGI with the latest changes from the TGI-Gaudi fork (#3117)
feat(gaudi): add all the changes from tgi-gaudi fork up to PR #289
2025-03-18 09:45:52 +01:00
Baptiste Colle
683ff53fa3
Add Gaudi Backend (#3055)
* wip(gaudi): import server and dockerfile from tgi-gaudi fork

* feat(gaudi): new gaudi backend working

* fix: fix style

* fix prehooks issues

* fix(gaudi): refactor server and implement requested changes
2025-02-28 12:14:58 +01:00