Commit Graph

  • 87a0af4ec2
    Update transformers to 4.51 (#3148) Mohit Sharma 2025-04-07 16:25:43 +0530
  • 9c26b52940
    Use ROCM 6.3.1 (#3141) Mohit Sharma 2025-04-07 16:25:11 +0530
  • edde1d7132
    Lint. Nicolas Patry 2025-04-07 09:43:21 +0200
  • da5f2a82e9
    Lint. Nicolas Patry 2025-04-07 09:39:00 +0200
  • 10a5dfee70
    Those tests cannot be run in CI. Nicolas Patry 2025-04-07 09:20:04 +0200
  • d239884b8e
    Fixing bug in mllama. Nicolas Patry 2025-04-07 09:18:29 +0200
  • 9b50bada65
    Forcing torchvision to be in there. Nicolas Patry 2025-04-07 09:09:49 +0200
  • 23f82b7b65
    Upgrading the nix deps too. Nicolas Patry 2025-04-07 08:33:51 +0200
  • fe1b621c33 update transformres Pedro Cuenca 2025-04-06 12:16:12 +0000
  • c67546fd40
    Release 3.2.2 (v3.2.2, git_v3.2.2) Nicolas Patry 2025-04-06 11:40:52 +0200
  • d23b385eee
    Preparing for release. (#3147) Nicolas Patry 2025-04-06 11:36:00 +0200
  • d8dd633c8b
    Merged tgi-nix update. Nicolas Patry 2025-04-06 11:19:16 +0200
  • b3024ca9f7
    Adding hf-xet dependency. Nicolas Patry 2025-04-06 11:14:03 +0200
  • b3e8dfbfd8
    Merge af546505ad into d9bb9bebc9 Drew Paettie 2025-04-06 10:57:08 +0200
  • e34a108b93
    Preparing for release. Nicolas Patry 2025-04-06 10:25:58 +0200
  • d9bb9bebc9
    Add llama4 (#3145) Mohit Sharma 2025-04-06 13:50:22 +0530
  • 0b1d253c04
    Fixing the CI. Nicolas Patry 2025-04-06 10:18:29 +0200
  • f2be5f3db4
    Upgrade doc + fix linting. Nicolas Patry 2025-04-06 10:10:37 +0200
  • 53567b0028 remove tr version (add_L4) Mohit Sharma 2025-04-05 19:57:36 +0000
  • 04aab711a7 remove redundant changes Mohit Sharma 2025-04-05 15:42:31 +0000
  • 8094de91fc Add tests Mohit Sharma 2025-04-05 15:33:30 +0000
  • 29703dbd27 fix warmup issue for mllama Wang, Yi A 2025-04-04 05:42:59 -0700
  • b48bfb7a6f fix docker Mohit Sharma 2025-04-04 13:35:04 +0000
  • 2558124ad8 fix typos baptiste 2025-04-04 07:46:42 +0000
  • 2c837c671d remove debug comments baptiste 2025-04-04 07:45:13 +0000
  • 32b00e039b feat(test): add more models to integration tests baptiste 2025-04-04 07:40:37 +0000
  • 064c7bd03c feat(gaudi): add integration test baptiste 2025-03-28 08:58:28 +0000
  • 7a57b01002 Add cache position Mohit Sharma 2025-04-03 10:26:48 +0000
  • 7025e77b88
    Merge 1cb904e619 into 3d059f91ab Jim Burtoft 2025-04-03 08:39:44 +0000
  • 3d059f91ab
    Gaudi: Use exponential growth to replace BATCH_BUCKET_SIZE (#3131) Yuan Wu 2025-04-03 16:34:53 +0800
  • 8591687561 refine log and fix some issue Wang, Yi A 2025-04-02 19:11:35 -0700
  • 102e29902a add kvcache dtype Wang, Yi A 2025-04-02 19:29:01 -0700
  • 22aaf497b7 update docker Mohit Sharma 2025-04-02 13:08:51 +0000
  • cc0552f8fc fixes and improvements Mohit Sharma 2025-04-02 11:39:23 +0000
  • a84da5b698 optimize code Wang, Yi A 2025-04-02 00:56:15 -0700
  • 705cc0b619 multi-modality warmup Wang, Yi A 2025-04-01 23:57:07 -0700
  • 8e01191b4c add model Mohit Sharma 2025-04-01 16:11:19 +0000
  • 06663162b4 add model Mohit Sharma 2025-04-01 15:51:36 +0000
  • 9d85ac9485 LLM warmup logic Wang, Yi A 2025-03-30 20:20:09 -0700
  • c55a8caea2 remove torch.where to fix incorrect output in hpu graph model Wang, Yi A 2025-03-31 22:51:54 -0700
  • 065f87a337 IPEX support FP8 kvcache Wang, Yi A 2025-03-29 02:31:38 -0700
  • bfcc1df91f test_kernel aiter_kernels Mohit Sharma 2025-03-28 16:17:13 +0000
  • f0e5faec1a fix some issue Wang, Yi A 2025-03-28 07:01:06 -0700
  • 376e0507b7 missing gptj change... Wang, Yi A 2025-03-28 01:08:40 -0700
  • 787dbe98a8 fix comment Wang, Yi A 2025-03-28 00:09:26 -0700
  • 7914e980e2 Merge branch 'main' into gaudi_backend_pa Wang, Yi A 2025-03-28 00:03:49 -0700
  • 1508ee8de1 remove block_tables and prefill_cache_indices which will lead to dynamic shape Wang, Yi A 2025-03-27 22:51:21 -0700
  • 7900be5ac3 warmup decode Wang, Yi A 2025-03-26 20:19:13 -0700
  • ba7a131e04 add warmup_decode Wang, Yi A 2025-03-26 17:39:26 -0700
  • 23c7d11cde add updated makefile Mohit Sharma 2025-03-26 16:18:29 +0000
  • c553a9cd77 update dockerfile Mohit Sharma 2025-03-26 16:13:55 +0000
  • 0142550096
    nix-v3.2.1 -> v3.2.1-nix (#3129) Corentin REGAL 2025-03-26 15:36:43 +0100
  • fd70ad703e warmup prefill Wang, Yi A 2025-03-25 22:21:44 -0700
  • f5f14dc660
    Gaudi: Fix llava-next and mllama crash issue (#3127) Yuan Wu 2025-03-25 22:08:15 +0800
  • 69773767c5 enable fp8 Wang, Yi A 2025-03-24 20:21:45 -0700
  • cee44bff7a Improve message to be useful without spans (message-more-info) Corentin REGAL 2025-03-24 16:00:09 +0100
  • 54d15462dc
    Torch 2.6 (#3134) Nicolas Patry 2025-03-24 11:55:49 +0100
  • 6868a13433
    Upgrade to transformers 4.50 Nicolas Patry 2025-03-24 11:39:41 +0100
  • 3964db61c0
    TGI-nix main. Nicolas Patry 2025-03-24 10:09:26 +0100
  • 9aadfb2407 Remove debug modifications yuanwu 2025-03-24 01:13:45 +0000
  • 94ebf9317b
    Time upgrade. Nicolas Patry 2025-03-23 23:14:06 +0100
  • 0dacfcad19
    Upgrade toolchain. Nicolas Patry 2025-03-23 23:05:00 +0100
  • f3e22d7206
    Don't upgrade just yet. Nicolas Patry 2025-03-23 22:59:22 +0100
  • 2fa9ebe960
    Upgrade the toolchain. Nicolas Patry 2025-03-23 22:58:48 +0100
  • 62fb552669
    Torch 2.6 Nicolas Patry 2025-03-23 21:49:09 +0100
  • 8d221b7b79 fix gptq issue Wang, Yi A 2025-03-22 20:58:37 -0700
  • 9914ffe1f1 remove unused quantization code and enable awq/gptq int4 Wang, Yi A 2025-03-21 18:28:58 -0700
  • e721574729 fix: update test for tool_call_id in Message (improve-tool-call-and-response-ids) drbh 2025-03-14 15:23:07 +0000
  • 042a6bc365 fix: bump openapi docs to include FunctionDefinition.id drbh 2025-03-14 14:33:49 +0000
  • af78f46c3d feat: align function id with tool call response drbh 2025-03-13 19:16:47 +0000
  • 65c6008847 fix: bump openapi doc with new grammar option (pr-2982-ci-branch) drbh 2025-03-17 14:51:33 +0000
  • 71ef9da72c feat: support json_schema grammar constraining and add tests drbh 2025-03-14 18:13:03 +0000
  • 5e61553f48 fix: another end-of-file-fixer lint drbh 2025-02-20 23:46:04 +0000
  • 92025e4b67 fix: add test snapshots and avoid docs change drbh 2025-02-20 23:45:15 +0000
  • 5e6ac4ff63 fix: end-of-file-fixer lint drbh 2025-02-20 18:39:15 -0500
  • 0928018ac2 fix: various linter adjustments drbh 2025-02-20 21:18:16 +0000
  • d278d3cf4c Add tests for all aliases Alex Weston 2025-01-30 14:11:05 -0500
  • d9cac33231 Add json_schema alias for GrammarType Alex Weston 2025-01-30 14:03:54 -0500
  • b5bac0dd2d Add comments for support of models Mohit Sharma 2025-03-21 14:11:15 +0000
  • 50ffe00a1a Improve attn_implementation Mohit Sharma 2025-03-21 13:44:40 +0000
  • b41faae318 cleanup comment Mohit Sharma 2025-03-21 11:28:40 +0000
  • ac6fc70c75 Add support for other vlm Mohit Sharma 2025-03-21 11:22:12 +0000
  • 2d2c56361d Gaudi: Use exponential growth to replace BATCH_BUCKET_SIZE yuanwu 2025-03-21 05:46:08 +0000
  • fdf0733f56 fix incorrect output in qwen2 idefics if hpu graph is used Wang, Yi A 2025-03-21 01:01:37 -0700
  • 36b6612f97 adjust warmup and enable vlm Wang, Yi A 2025-03-20 01:09:58 -0700
  • c6f97fd884 Change default values yuanwu 2025-03-21 05:50:08 +0000
  • 3c6630c6e9 Fix the errors of style check yuanwu 2025-03-21 05:47:18 +0000
  • e5b1aed2b3 nix-v3.2.1 -> v3.2.1-nix make it easier to check for version using semver semantic (same major and minor) Corentin REGAL 2025-03-20 16:34:03 +0100
  • 2e60a8dd65
    CI: enable server tests for backends (#3128) Baptiste Colle 2025-03-20 16:07:31 +0100
  • d8d09f9c7b initial changes Mohit Sharma 2025-03-20 14:22:42 +0000
  • e5503eba78
    configurable termination timeout (#3126) Erik Kaunismäki 2025-03-20 14:25:56 +0100
  • e3ea961d32 add test for backends baptiste 2025-03-20 09:40:38 +0000
  • f95aa42660 multi-modality initial PR Wang, Yi A 2025-03-19 23:27:27 -0700
  • bb55318f81 Fix llava-next and mllama crash issue yuanwu 2025-03-20 05:39:26 +0000
  • d5b78ba16f Merge branch 'main' into gaudi_backend_pa Wang, Yi A 2025-03-19 18:15:08 -0700
  • 69936732eb feat: allow model load and stub logits (enable-transformers-vlm) drbh 2025-03-19 19:55:19 +0000
  • f81b51d7d2
    Fmt. Nicolas Patry 2025-03-19 15:04:28 +0100
  • 7c030f0618
    Updating documentation. Nicolas Patry 2025-03-19 15:03:47 +0100
  • 2074d0516b enable dbrx remove some unused code Wang, Yi A 2025-03-19 03:16:41 -0700
  • 45075d167f make shard and webserver termination timeouts configurable erikkaum 2025-03-19 10:46:57 +0100