Commit Graph

  • 737b1d8369 Merge ae7f3aeba1 into 719907410b Baptiste Colle 2025-06-23 12:27:39 +0000
  • ae7f3aeba1 update conftest gaudi/add-ci baptiste 2025-06-23 12:27:32 +0000
  • a32025f931 fix style baptiste 2025-06-23 12:26:06 +0000
  • 0295bf243f fix broken test baptiste 2025-06-23 12:10:14 +0000
  • cd5c51974e Merge f384e0f308 into 719907410b Wang, Yi 2025-06-23 19:46:52 +0800
  • 1f03afe94d enable multi-card test baptiste 2025-05-21 15:28:58 +0000
  • 8768085c8c add new gaudi3 runners baptiste 2025-05-21 11:27:11 +0000
  • 9c235f4d66 feat(gaudi/ci): added ci for gaudi device baptiste 2025-04-22 09:17:44 +0000
  • fcf6870d20 testing baptiste 2025-04-22 08:43:45 +0000
  • 59dc8c2699 change defualt behaviour to only run a subset of all the models baptiste 2025-04-22 08:16:17 +0000
  • 9c6776375e change defualt behaviour to only run a subset of all the models baptiste 2025-04-22 08:15:11 +0000
  • a2a5772cd7 wip(ci): debug the ci Baptiste Colle 2025-04-10 17:17:16 +0200
  • 4b5e812fe1 wip(ci): debug the ci Baptiste Colle 2025-04-10 16:08:06 +0200
  • 2c2cfc09c5 Update tests.yaml Pauline Bailly-Masson 2025-04-10 15:23:17 +0200
  • 1bd2ad9635 Update tests.yaml Pauline Bailly-Masson 2025-04-10 15:16:14 +0200
  • 76d155e660 wip(ci): rerun ci to debug baptiste 2025-04-10 11:47:40 +0000
  • 8568f910a7 fix llama failing test baptiste 2025-04-10 09:03:49 +0000
  • 781dd203e9 feat(ci): llama3 test working baptiste 2025-04-10 08:32:37 +0000
  • 7779d0c786 feat(ci): llama3 test working baptiste 2025-04-10 08:32:28 +0000
  • b4917f67e4 wip: able to launch gaudi tests baptiste 2025-04-10 07:52:20 +0000
  • 4e40467c6d wip(test): adding test to ci baptiste 2025-04-10 07:46:59 +0000
  • 3a7619d450 Merge fd88b1d6b9 into 719907410b Jiayu Liu 2025-06-23 15:37:28 +0530
  • e7500dbbd4 Merge d17f36e497 into 719907410b Emmanuel Ferdman 2025-06-23 15:37:28 +0530
  • 719907410b [gaudi] Refine rope memory, do not need to keep sin/cos cache per layer (#3274) main Wang, Yi 2025-06-23 17:15:39 +0800
  • 95c7cf9b5c refine rope memory, do not need to keep sin/cos cache per layer Wang, Yi A 2025-06-20 19:00:40 -0700
  • c86cbf42b7 Merge 2204f91f32 into 238fbd4d50 drbh 2025-06-20 17:58:39 +0200
  • 664e0401f9 Merge c090cd1506 into 238fbd4d50 Frans de Jonge 2025-06-20 16:24:37 +0200
  • 2245fe2fc2 Merge f147f10ed4 into 238fbd4d50 Wang, Yi 2025-06-20 07:14:05 +0800
  • d811f363d8 Merge 3338b34ba4 into 238fbd4d50 Wang, Yi 2025-06-20 07:13:36 +0800
  • ad7ef16508 Merge aeabb7b71a into 238fbd4d50 Gerard Casas Saez 2025-06-19 12:21:22 -0400
  • aaa9353092 Merge c43954d44c into 238fbd4d50 Wang, Yi 2025-06-19 22:42:07 +0800
  • 472bc27371 Merge fab395b41f into 238fbd4d50 Tzu-Yu Lee 2025-06-19 19:39:47 +0800
  • d4bd5cac79 chore: version 3.3.4 v3.3.4 git_v3.3.4 David Corvoysier 2025-06-19 09:08:38 +0000
  • 238fbd4d50 Neuron backend fix and patch version 3.3.4 (#3273) David Corvoysier 2025-06-19 10:52:41 +0200
  • 14ee6e7804 [gaudi] gemma3 text and vlm model intial support. need to add sliding window support later (#3270) Wang, Yi 2025-06-19 15:32:34 +0800
  • 8aca6b32d8 Merge a2d2406ddd into bd1bdebb47 Guspan Tanadi 2025-06-19 05:37:11 +0200
  • 6e1ca4f619 chore: prepare 3.3.4 David Corvoysier 2025-06-18 19:04:10 +0000
  • 82ebfd67bb fix(neuron): wrong assertion when batch_size==1 David Corvoysier 2025-06-18 18:54:53 +0000
  • 1754b79f10 chore: release 3.2.3 v3.3.3 git_v3.3.3 David Corvoysier 2025-06-18 12:59:29 +0000
  • bd1bdebb47 doc: fix README (#3271) David Corvoysier 2025-06-18 12:35:36 +0200
  • f13e28c98d [gaudi] Refine logging for Gaudi warmup (#3222) regisss 2025-06-18 04:34:00 -0600
  • d343820b61 doc: fix README David Corvoysier 2025-06-18 09:56:58 +0000
  • b4d17f18ff chore: prepare release 3.3.3 (#3269) David Corvoysier 2025-06-18 11:55:26 +0200
  • 93e62e73c8 gemma3 text and vlm model intial support. need to add sliding window support later Wang, Yi A 2025-06-17 17:46:25 -0700
  • 1acc96c82a Black regisss 2025-06-18 07:52:59 +0000
  • ab81d70000 chore: prepare release 3.3.3 David Corvoysier 2025-06-18 07:38:44 +0000
  • 9dbaa176fd Add log_master & VLM cases regisss 2025-06-17 21:13:13 +0000
  • 0627983c17 [Gaudi] use pad_token_id to pad input id (#3268) Wang, Yi 2025-06-17 15:07:25 +0800
  • 388f27aaa8 remove unused Deepseek and AutoGPTQ pip Wang, Yi A 2025-06-16 22:31:51 -0700
  • 9431a1ec4c [Gaudi] use pad_token_id to pad input id Wang, Yi A 2025-06-16 05:18:01 -0700
  • 564c9e1cc0 Flash causal LM case regisss 2025-06-16 21:07:44 +0000
  • 2ba396c4c1 Merge branch 'main' into add_logs_gaudi_warmup regisss 2025-06-16 12:36:45 +0000
  • f384e0f308 HuggingFaceM4/Idefics3-8B-Llama3 crash fix Wang, Yi A 2025-06-13 00:35:39 -0700
  • 0381aba864 Merge 551ee3a365 into 3752143b39 drbh 2025-06-15 00:49:23 +0200
  • aeabb7b71a disable mamba in CPU Gerard Casas Saez 2025-06-13 12:34:28 -0500
  • 3752143b39 [Gaudi] Fix the integration-test issues (#3265) Yuan Wu 2025-06-13 20:47:06 +0800
  • 7ad4909ce8 Merge branch 'main' into ci Yuan Wu 2025-06-13 20:42:53 +0800
  • ded4cb52ac [Gaudi] Enable Qwen3_moe model (#3244) Yuan Wu 2025-06-13 18:03:24 +0800
  • a220e57f45 [gaudi] HuggingFaceM4/idefics2-8b issue fix (#3264) Wang, Yi 2025-06-13 18:00:08 +0800
  • 8ab1a14e23 Fix test case yuanwu 2025-06-13 07:20:34 +0000
  • 63a75a7952 Remove opt model yuanwu 2025-06-13 05:56:40 +0000
  • a50b33a964 Fix mistral error yuanwu 2025-06-13 05:43:16 +0000
  • 9ed9497f7e Add UV yuanwu 2025-06-13 05:25:10 +0000
  • dbec20f98f delete optimum.habana import Wang, Yi A 2025-06-12 22:42:10 -0700
  • 1e56e5fe5c [gaudi] HuggingFaceM4/idefics2-8b issue fix Wang, Yi A 2025-06-12 22:10:13 -0700
  • 1791c855f0 Merge branch 'huggingface:main' into qwen3_moe Yuan Wu 2025-06-13 10:02:18 +0800
  • 0bbd8d1645 Remove useless code yuanwu 2025-06-13 01:48:29 +0000
  • e07056ab3f [Gaudi] Remove optimum-habana (#3261) Yuan Wu 2025-06-13 04:35:36 +0800
  • 25fdc5f03c [gaudi] Move the _update_cos_sin_cache into get_cos_sin (#3254) Yuan Wu 2025-06-13 04:31:11 +0800
  • 613b8dd647 [gaudi] Vlm rebase and issue fix in benchmark test (#3263) Wang, Yi 2025-06-13 04:26:37 +0800
  • 027f293098 fix mllama oom if set batch_size > 8 Wang, Yi A 2025-06-11 23:18:59 -0700
  • bba260912c fix mllama crash if bs>0 and filter Wang, Yi A 2025-06-11 20:07:48 -0700
  • b1ae4ad260 fix Qwen2 vl crash in benchmark Wang, Yi A 2025-06-10 23:30:11 -0700
  • f72b290020 add mark_step in vlm part Wang, Yi A 2025-06-10 19:02:14 -0700
  • d68edc4a2f Qwen2 vl fix Wang, Yi A 2025-06-09 22:25:47 -0700
  • 93e5e35f9d llama4 and some issue fix Wang, Yi A 2025-06-08 23:56:38 -0700
  • b09d4cc142 port https://github.com/huggingface/text-generation-inference/pull/3188 to gaudi backend Wang, Yi A 2025-06-08 20:03:28 -0700
  • d17f36e497 Migrate to V2 Pydantic interface Emmanuel Ferdman 2025-06-11 15:34:12 -0700
  • 839477670a [gaudi] Perf optimization (#3256) Wang, Yi 2025-06-11 21:00:21 +0800
  • 5f26a72876 Remove the workaround for HPU distributed. yuanwu 2025-06-11 06:15:11 +0000
  • e202b5f98f Remove mllama.py and llava_next.py yuanwu 2025-06-11 05:37:03 +0000
  • c112ef1796 Remove useless files yuanwu 2025-06-11 03:27:19 +0000
  • 91c40e6c58 Fix crash yuanwu 2025-06-11 02:34:09 +0000
  • 512eca7f8f Remove debug info yuanwu 2025-06-06 08:26:38 +0000
  • 14112d800b Merge 2394437dc7 into 79183d1647 Sebastian Liebscher 2025-06-11 09:52:35 +0800
  • 79183d1647 Bump neuron SDK version (#3260) David Corvoysier 2025-06-10 17:56:25 +0200
  • d5bad17ed6 fix(neuron): adjust test expectations for llama on nxd David Corvoysier 2025-05-26 13:55:20 +0000
  • 2c8b0e37c4 tests(neuron): remove obsolete models David Corvoysier 2025-05-26 13:54:41 +0000
  • 5d2b159182 fix(neuron): adapt entrypoint David Corvoysier 2025-05-26 10:13:33 +0000
  • 07a0e2f7e6 Set default value of ATTENTION as paged yuanwu 2025-06-10 07:42:11 +0000
  • 3e977bde99 feat(neuron): support on-device sampling David Corvoysier 2025-05-23 13:26:05 +0000
  • bf529ef476 test(neuron): update models and expectations David Corvoysier 2025-05-23 10:13:29 +0000
  • 4e8ffec8ef fix(generator): emulate greedy in sampling parameters David Corvoysier 2025-05-27 09:34:19 +0000
  • b916076c72 fix(nxd): adapt model retrieval to new APIs David Corvoysier 2025-05-27 12:27:22 +0000
  • 39895019c8 fix(neuron): neuron config is not stored in config anymore David Corvoysier 2025-05-23 09:48:05 +0000
  • c4dd2a8197 fix(neuron): use new cache import path David Corvoysier 2025-05-23 08:37:17 +0000
  • 83eadbb256 fix(neuron): use neuron_config whenever possible David Corvoysier 2025-05-23 08:33:12 +0000
  • 0b640f7c8c refactor(neuron): remove obsolete code paths David Corvoysier 2025-05-23 08:27:27 +0000
  • 2eb223613e refactor(neuron): use named parameters in inputs helpers David Corvoysier 2025-05-22 14:53:25 +0000
  • b094f026c1 chore(neuron): bump version to 0.2.0 David Corvoysier 2025-05-22 14:35:18 +0000