Commit Graph

  • 2cde30de24 gpt_bigcode could also go pageattn Wang, Yi A 2025-03-18 23:59:31 -0700
  • 073f793976 fix phimoe issue Wang, Yi A 2025-03-18 23:11:01 -0700
  • e497bc09f6
    Minor fixes. (#3125) Nicolas Patry 2025-03-18 15:42:35 +0100
  • 76484fdc66
    Minor fixes. Nicolas Patry 2025-03-18 15:31:43 +0100
  • 4d28897b4e
    Fix release nix workflow. v3.2.1 git_3.2.1 Nicolas Patry 2025-03-18 15:27:48 +0100
  • f5850f4c4f
    Patch release 3.2.1 Nicolas Patry 2025-03-18 15:12:48 +0100
  • 67ce543e04
    Intel docker. (#3121) Nicolas Patry 2025-03-18 15:12:11 +0100
  • 83fe45c15e
    Prepare for patch release. (#3124) Nicolas Patry 2025-03-18 15:11:55 +0100
  • f5fa8ba2e8
    Prepare for patch release. Nicolas Patry 2025-03-18 14:59:58 +0100
  • f7d47cb35c
    Fixing dockerfile ? Nicolas Patry 2025-03-18 14:53:56 +0100
  • b77281ac22
    torchaudio ? Nicolas Patry 2025-03-18 12:59:13 +0100
  • 11f2eec10e
    Publish nix docker image. (#3122) Nicolas Patry 2025-03-18 12:58:21 +0100
  • b080f1e31e
    Cleaner tags. Nicolas Patry 2025-03-18 12:52:33 +0100
  • efa0fcd3e2
    Runnign from nix. Nicolas Patry 2025-03-18 12:42:11 +0100
  • a3090578b9
    Testing the PR. Nicolas Patry 2025-03-18 12:33:55 +0100
  • cf3a6cc241
    Pushing with skopeo Nicolas Patry 2025-03-18 12:30:49 +0100
  • 9dc8b67a6f
    Build zstd. Nicolas Patry 2025-03-18 11:17:52 +0100
  • 6c27ab2f41
    Forgot to push. Nicolas Patry 2025-03-18 11:02:53 +0100
  • 452472cf3d
    Something else. Nicolas Patry 2025-03-18 10:55:53 +0100
  • bb851f951c
    Run during PR. Nicolas Patry 2025-03-18 10:53:40 +0100
  • c9b267faa5
    Publish nix docker image. Nicolas Patry 2025-03-18 10:49:59 +0100
  • a84e86a5b3
    Intel docker. Nicolas Patry 2025-03-18 10:43:02 +0100
  • a35fbdb925
    Bug Fix: Sliding Window Attention (#3112) Mohit Sharma 2025-03-18 15:07:33 +0530
  • 078084286a
    Fix qwen2. Nicolas Patry 2025-03-18 10:36:54 +0100
  • 8c2c348f3c
    Gaudi: Sync TGI with the latest changes from the TGI-Gaudi fork (#3117) Baptiste Colle 2025-03-18 09:45:52 +0100
  • 5cd1c93cad add moe support, fix qwen/mistral/mixtral crash Wang, Yi A 2025-03-18 00:45:15 -0700
  • d4d9b610af Fix build error of server make install yuanwu 2025-03-18 05:46:12 +0000
  • 3ca71c6422
    Merge b9467b95a0 into 095775e05c jrc2139 2025-03-17 14:43:47 -0400
  • 095775e05c
    launcher: correctly get the head dimension for VLMs (#3116) Daniël de Kok 2025-03-17 18:19:37 +0100
  • e0535a13c5 increase timeouts debugging-timeouts erikkaum 2025-03-17 17:56:57 +0100
  • dd78df94ad fix: bump org name in gemma3 test drbh 2025-03-17 16:03:11 +0000
  • febc488e0e fix: bump org name in gemma3 test drbh 2025-03-17 15:57:07 +0000
  • 0b3e3db043
    xpu 2.6 update (#3051) Wang, Yi 2025-03-17 20:48:48 +0800
  • 48df6183e8 feat(gaudi): add all the changes from tgi-gaudi fork up to PR #289 baptiste 2025-03-17 11:01:53 +0000
  • 4727a3af67 launcher: correctly get the head dimension for VLMs Daniël de Kok 2025-03-17 10:07:39 +0000
  • 6bbe24d974 use tensor cache in hpu graph to avoid replay issue Wang, Yi A 2025-03-17 01:36:49 -0700
  • a07e7437b6 enable all the model. not testet yet Wang, Yi A 2025-03-16 22:37:34 -0700
  • 5d3653943c adjust block table in hpu to improve performance Wang, Yi A 2025-03-16 19:40:40 -0700
  • b7fea6fc2f fix TP in pageattn Wang, Yi A 2025-03-14 18:01:58 -0700
  • 2c2fc6544d fix: add pillow dependency and bump lock+requirements drbh 2025-03-14 18:17:57 +0000
  • e5dfd41ed4
    Upgrading from_env to get token from file when necessary + fix pali_gemma. Nicolas Patry 2025-03-14 17:06:36 +0100
  • 659ce4f3fc feat: add tests for image types and remove alpha from png drbh 2025-03-14 15:33:06 +0000
  • e5ec176bf4 fix: bump snapshots and improve exceed window test case drbh 2025-03-14 15:04:38 +0000
  • 201dc6294f clean cuda/rocm code in hpu backend, enable flat_hpu Wang, Yi A 2025-03-13 19:21:44 -0700
  • 170a12f331 Update window size rocm flash decoding Mohit Sharma 2025-03-14 07:50:11 +0000
  • b30cdabf68 Add window_size_left param ipex rocm Mohit Sharma 2025-03-14 07:47:45 +0000
  • eaf18c1ccb (typo) collection link Mohit Sharma 2025-03-14 07:36:38 +0000
  • 69e0a87dd5 (fix) flashinfer Mohit Sharma 2025-03-13 21:32:38 +0000
  • ff82f0f84c (fix) sliding window attention Mohit Sharma 2025-03-13 19:30:39 +0000
  • f91434e99b
    Make the Nix-based Docker container work on non-NixOS (#3109) Daniël de Kok 2025-03-13 14:02:45 +0100
  • 2158aaa3d9 Make the Nix-based Docker container work on non-NixOS Daniël de Kok 2025-03-13 12:36:56 +0000
  • 8b91f92978
    Fixing the docker build. (#3108) Nicolas Patry 2025-03-13 11:26:44 +0100
  • 1e3f34b106
    Apply suggestions from code review Nicolas Patry 2025-03-13 11:26:28 +0100
  • 033efc40db
    Fixing the docker build. Nicolas Patry 2025-03-13 11:04:50 +0100
  • 27ed848676
    Release of Gaudi Backend for TGI (#3091) Baptiste Colle 2025-03-13 10:56:01 +0100
  • 83ef364177
    We need gcc during runtime to enable triton to compile kernels. (#3103) Nicolas Patry 2025-03-13 10:45:47 +0100
  • 83b7b7bb92
    Router: add gemma3-text model type (#3107) Daniël de Kok 2025-03-13 10:41:33 +0100
  • 9a3998d037 Router: add gemma3-text model type Daniël de Kok 2025-03-13 09:37:48 +0000
  • c73ae0bd88
    Update to kernels 0.2.1 (#3084) Daniël de Kok 2025-03-13 10:36:29 +0100
  • 73ee7837b8
    Update to kernels 0.2.0. use_updated_kernels Nicolas Patry 2025-03-13 10:30:07 +0100
  • 8c8f249636
    Fixing the docker build. Nicolas Patry 2025-03-12 23:58:15 +0100
  • 273d304cf9
    We need gcc during runtime to enable triton to compile kernels. Nicolas Patry 2025-03-12 23:07:06 +0100
  • 411a28288d
    Release 3.2.0 v3.2.0 git_3.2.0 Nicolas Patry 2025-03-12 11:15:29 +0100
  • a3dfcf571d fix(gaudi): remove use of latest for gaudi docker image + redid gaudi benchmarking section to include best practices baptiste 2025-03-11 10:00:02 +0000
  • 2fd0049929 fix(gaudi): add default argument for the dockerfile baptiste 2025-03-10 10:49:32 +0000
  • 02fa6adb20 feat(gaudi): release ready (docs, docker image and vlm ready) baptiste 2025-03-10 09:01:45 +0000
  • d4c6faa67b
    Try to fix on main CI color. (#3101) origin/slind_window_fix Nicolas Patry 2025-03-12 10:12:24 +0100
  • 4ac06ddf56
    Preparing relase 3.2.0 (#3100) Nicolas Patry 2025-03-12 10:11:33 +0100
  • 9a0fe1bc7b
    Try to fix on main CI color. Nicolas Patry 2025-03-12 10:06:30 +0100
  • 75daa9b571
    Update doc. Nicolas Patry 2025-03-12 09:58:18 +0100
  • f01dc9e743
    Update neuron backend (#3098) David Corvoysier 2025-03-12 09:53:15 +0100
  • f979c2aa1b
    Forgot the README. Nicolas Patry 2025-03-12 09:44:09 +0100
  • 23b449d238 test(neuron): simplify sampling test David Corvoysier 2025-03-12 08:37:28 +0000
  • 2bb2bdb6a8 feat(neuron): tag latest image for local tests David Corvoysier 2025-03-04 13:28:44 +0000
  • 39dc6737ed feat(neuron): bump optimum-neuron version David Corvoysier 2025-02-25 15:18:17 +0000
  • 1e427e5fda feat(neuron): use AWS Neuron SDK 2.21.1 David Corvoysier 2025-02-25 15:24:54 +0000
  • 80c1404ea5
    Preparing relase 3.2.0 Nicolas Patry 2025-03-12 09:37:40 +0100
  • 1c6dee17ce fix awq crash if modules_to_not_convert is None Wang, Yi A 2025-03-12 01:28:56 -0700
  • 5c5528e362
    Fix tool call4 (#3094) Nicolas Patry 2025-03-12 09:28:47 +0100
  • ed46c2c414
    Add gemma3 model (#3099) Mohit Sharma 2025-03-12 13:55:51 +0530
  • ef4e2685d8
    Update the tests. Nicolas Patry 2025-03-11 13:19:38 +0100
  • 03fe626a95
    Removing a lot of NO_TOOL shenanigans. Nicolas Patry 2025-03-10 23:45:00 +0100
  • cb92acf280
    Removing the no_tool content information. Nicolas Patry 2025-03-10 21:24:43 +0100
  • f74c36fe0d
    Fix tool call3 (#3086) Nicolas Patry 2025-03-12 09:22:53 +0100
  • 9997047d8a int Wang, Yi A 2025-03-12 00:47:02 -0700
  • 468da545af update get xpu memory api Wang, Yi A 2025-03-12 00:37:17 -0700
  • 587e5dea22 Add gemma3 model Mohit Sharma 2025-03-12 07:01:46 +0000
  • 750b48c730
    More qwen2. Nicolas Patry 2025-03-12 02:29:31 +0100
  • c6dd5d0e22 Download kernels in install-cuda target Daniël de Kok 2025-03-10 14:36:58 +0000
  • 141c97e34e Update to kernels 0.2.1 Daniël de Kok 2025-03-07 12:31:08 +0000
  • 2deedc1ab0
    Merge 380e73dba9 into ae4451c3da yashaswipiplani 2025-03-11 22:04:34 +0800
  • cdc5380f2b Merge branch 'main' of https://github.com/sywangyi/text-generation-inference into xpu_2.6 Wang, Yi A 2025-03-11 04:29:53 -0700
  • a38e65a9f9 install whl Wang, Yi A 2025-03-11 04:27:05 -0700
  • 1578cdd9a8
    Snapshot update with the new updated tool_call_id. Nicolas Patry 2025-03-11 11:40:06 +0100
  • ae4451c3da
    Update README.md (#3095) celsowm 2025-03-11 07:05:21 -0300
  • 214a5b5da1
    Fmt. Nicolas Patry 2025-03-10 18:42:27 +0100
  • a73cd56075
    Fixing the tool call id. Nicolas Patry 2025-03-10 17:07:22 +0100
  • 3e731a7c2f
    Fixing some corner cases. Nicolas Patry 2025-03-10 12:06:44 +0100
  • 0b710f9671
    Update tehe doc. Nicolas Patry 2025-03-07 20:17:23 +0100
  • 207a70e7be
    Fixing the tool calling convention. Nicolas Patry 2025-03-07 19:42:36 +0100