Commit Graph

  • a70d2749f1
    Upgrading bitsandbytes. Nicolas Patry 2025-01-15 17:03:51 +0100
  • 203cade244
    Upgrading our rustc version. (#2908) Nicolas Patry 2025-01-15 17:04:03 +0100
  • 55fabaae01
    Clippy everything. Nicolas Patry 2025-01-15 16:51:28 +0100
  • 46994b34fb
    📝 add guide on using TPU with TGI in the docs (#2907) Baptiste Colle 2025-01-15 16:26:11 +0100
  • b9ab5037b0
    Fixing the rust tests to proper version. Nicolas Patry 2025-01-15 16:12:13 +0100
  • dc9b8e9814
    Fix docker run in README.md (#2861) Alvaro Bartolome 2025-01-15 16:07:10 +0100
  • 3c7ae48f7f
    docs(conceptual/speculation): available links Train Medusa (#2863) Guspan Tanadi 2025-01-15 22:05:54 +0700
  • 798c765c70
    Upgrading our rustc version. Nicolas Patry 2025-01-15 16:04:53 +0100
  • cc8b9650bd
    Baichuan2-13B does not have max_position_embeddings in config (#2903) Wang, Yi 2025-01-15 22:56:52 +0800
  • e07acc7f68
    Enable FP8 Per-Tensor Scales and Integrate Marlin/MoE Kernels Repo for ROCm (#2825) Mohit Sharma 2025-01-15 11:38:58 +0530
  • f91533224c docs(tpu): add TPU installation guide for TGI Baptiste Colle 2025-01-15 06:20:08 +0100
  • 48067e4a0d fmt baichuan2-13b Wang, Yi A 2025-01-13 17:23:28 -0800
  • 22ed5703de
    Update server/text_generation_server/models/flash_causal_lm.py Wang, Yi 2025-01-14 08:58:48 +0800
  • 59dbe11c4f
    Use Devel as the base image Yaser Jaradeh 2025-01-13 16:35:40 +0100
  • c22dba83a6
    Fix typo in deb package name Yaser Jaradeh 2025-01-13 12:34:28 +0100
  • ed8bf3a178
    Revert to base image and add python.h Yaser Jaradeh 2025-01-13 11:47:17 +0100
  • ad4dcb68df
    Merge branch 'huggingface:main' into fix/dockerfile-triton Yaser Jaradeh 2025-01-13 11:44:53 +0100
  • 880ab9c2f3
    Add Flash decoding kernel ROCm (#2855) Mohit Sharma 2025-01-13 15:42:35 +0530
  • 1660154ae6
    fix crash in torch2.6 if TP=1 (#2885) Wang, Yi 2025-01-13 18:11:31 +0800
  • 2e22164d4a
    Update using_guidance.md (#2901) Nicholas Broad 2025-01-13 02:09:35 -0800
  • 5ad8c9a40b Baichuan2-13B does not have max_position_embeddings in config see https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/blob/main/config.json Wang, Yi A 2025-01-12 22:47:23 -0800
  • 000db62f7a
    Update using_guidance.md Nicholas Broad 2025-01-10 14:54:11 -0800
  • 6099421350 misc(ci): KAMEHAMEHA Morgan Funtowicz 2025-01-10 16:20:16 +0100
  • 83624a07be
    Add possible variants for A100 and H100 GPUs for auto-detecting flops (#2837) lazariv 2025-01-10 16:12:02 +0100
  • a78e16b6fb misc(ci): wtfinfini Morgan Funtowicz 2025-01-10 15:54:36 +0100
  • ed96760628 misc(ci): wtfinfini Morgan Funtowicz 2025-01-10 15:32:51 +0100
  • 446076a011 misc(ci): use commit HEAD instead of merge commit for image id Morgan Funtowicz 2025-01-10 15:19:31 +0100
  • d3893bc8ad misc(ci): wtf3 Morgan Funtowicz 2025-01-10 15:11:44 +0100
  • 01067f8ba8
    chore: Update jsonschema to 0.28.0 (#2870) Dmitry Dygalo 2025-01-10 06:01:54 -0800
  • 4f7e00f4ce
    Update to marlin-kernels 0.3.7 (#2882) Daniël de Kok 2025-01-10 12:43:44 +0100
  • 329b6fc375 misc(ci): wtf2 Morgan Funtowicz 2025-01-10 00:30:02 +0100
  • 17c04f49b7 misc(ci): wtf Morgan Funtowicz 2025-01-10 00:28:11 +0100
  • 81c33cff08 misc(ci): ok debug Morgan Funtowicz 2025-01-09 23:48:37 +0100
  • 5ac2b15c00 misc(ci): detect gha build Morgan Funtowicz 2025-01-09 23:17:46 +0100
  • 1179fdff7e misc(ci): detect gha build Morgan Funtowicz 2025-01-09 22:17:22 +0100
  • da5ab46705
    Improve vlm support (add idefics3 support) (#2437) drbh 2025-01-09 10:35:32 -0500
  • 4f9c6a5b08 Update to marlin-kernels 0.3.7 Daniël de Kok 2025-01-06 16:03:12 +0000
  • a9c7d2e3b6
    Basic flashinfer 0.2 support (#2862) Daniël de Kok 2025-01-09 16:25:00 +0100
  • 068520749c
    Merge branch 'main' into flash_decoding Wang, Yi 2025-01-09 21:27:43 +0800
  • 33299d1ee9
    Merge b27749eba7 into afb6c728d8 drbh 2025-01-09 13:13:45 +0000
  • dbe80ea323 misc(ci): detect dev profile for debug Morgan Funtowicz 2025-01-09 14:08:46 +0100
  • e9042d3ea1
    Add line-break in docker run for readability Alvaro Bartolome 2025-01-09 11:40:08 +0100
  • 31822daa2e
    Add line-break in docker run for readability Alvaro Bartolome 2025-01-09 11:38:12 +0100
  • 6aadb798bb misc(ci): correct right job name dependency Morgan Funtowicz 2025-01-09 11:21:32 +0100
  • cf846aa900 misc(ci): enforce sccache Morgan Funtowicz 2025-01-09 11:20:40 +0100
  • c7b2e3f100
    chore: Enable blocking feature for reqwest update-jsonschema Dmitry Dygalo 2025-01-09 11:07:49 +0100
  • afb6c728d8
    update ipex xpu to fix issue in ARC770 (#2884) Wang, Yi 2025-01-09 17:11:03 +0800
  • d37a43e581
    chore: fixed some typos and attribute issues in README (#2891) Ruida Zeng 2025-01-09 03:09:23 -0600
  • 59db9fe9d6 misc(ci): reintroduce variables Morgan Funtowicz 2025-01-09 09:53:58 +0100
  • 74788083b7 More output changes Daniël de Kok 2025-01-09 08:28:23 +0000
  • 1a95ee4cf1 misc(ci): let's actually setup sccache in the build.rs Morgan Funtowicz 2025-01-09 00:03:29 +0100
  • c4abc206f4 misc(ci):run the tests on GPU instance Morgan Funtowicz 2025-01-08 23:59:08 +0100
  • d7698e851b misc(ci): attempt to rebuild with sccache? Morgan Funtowicz 2025-01-08 23:56:19 +0100
  • ffb66f556d
    chore: fixed spelling mistakes in README Ruida Zeng 2025-01-08 16:52:19 -0600
  • 8bfeb4cd0d
    chore: fix minor grammar/capitalization Ruida Zeng 2025-01-08 16:51:53 -0600
  • 1415fd0244
    chore: fixed html repeated attribute in README Ruida Zeng 2025-01-08 16:48:17 -0600
  • 65ab4952ec misc(ci): install tensorrt_llm_executor_static Morgan Funtowicz 2025-01-08 22:46:17 +0100
  • eb4d34352d Fix some annoying perturbations Daniël de Kok 2025-01-08 14:49:55 +0000
  • f1f152c113 misc(ci): try the permission scoped again? Morgan Funtowicz 2025-01-08 15:11:01 +0100
  • daa397c515 fix: adjust FlashLlamaModel prefix logic drbh 2025-01-08 13:49:11 +0000
  • 7ce285deda misc(ci): try the permission above again? Morgan Funtowicz 2025-01-08 14:39:48 +0100
  • 51262756c0 misc(ci): try the permission above again? Morgan Funtowicz 2025-01-08 14:37:53 +0100
  • 4e6990267e flashinfer: fixup kv cache dtype Daniël de Kok 2025-01-08 12:21:50 +0000
  • 41948e240f Fix flashinfer install Daniël de Kok 2025-01-08 10:05:46 +0000
  • db6a9e1232 add ats support ci-update_xpu_image Wang, Yi A 2025-01-07 16:23:16 -0800
  • 78004db1e6 fix: adjust ruff lints and small refactors drbh 2025-01-07 22:25:38 +0000
  • d397748ca8 fix: improve prompt_split_image with ref to original impl drbh 2025-01-07 22:09:52 +0000
  • df504e9f1e fix: improve typing drbh 2025-01-07 22:06:54 +0000
  • 765ca78014 fix: clean up idefics 3 and improve prefix handling drbh 2025-01-07 22:05:47 +0000
  • a0e8c875c5 misc(ci): baby more time? Morgan Funtowicz 2025-01-07 17:39:29 +0100
  • 5fb03aec11 misc(ci): baby more time? Morgan Funtowicz 2025-01-07 17:38:34 +0100
  • 630995759b misc(ci): once more? Morgan Funtowicz 2025-01-07 17:36:22 +0100
  • f9ebcd9736 misc(ci): once more? Morgan Funtowicz 2025-01-07 17:22:29 +0100
  • ca7099c6c6 misc(ci): rights? Morgan Funtowicz 2025-01-07 17:16:08 +0100
  • 2894cbc6c9 misc(ci): rights? Morgan Funtowicz 2025-01-07 16:48:56 +0100
  • fde696a4a9
    Delete .github/workflows/1-build-trtllm2.yml Pauline Bailly-Masson 2025-01-07 15:30:25 +0100
  • 1d21d3427f
    Delete .github/workflows/1-test-trtllm2.yml Pauline Bailly-Masson 2025-01-07 15:30:13 +0100
  • 8d22a96947 misc(ci): use ci-build.yaml as main dispatcher Morgan Funtowicz 2025-01-07 15:24:28 +0100
  • 7b1f4bcb47
    Update 1-test-trtllm2.yml Pauline Bailly-Masson 2025-01-07 15:21:26 +0100
  • de72dae5c1
    Rename 1-build-trtllm2 to 1-build-trtllm2.yml Pauline Bailly-Masson 2025-01-07 15:19:10 +0100
  • 66d6611bae
    Update 1-test-trtllm2.yml Pauline Bailly-Masson 2025-01-07 15:18:33 +0100
  • f84891efea misc(ci): fw secrets Morgan Funtowicz 2025-01-07 15:17:48 +0100
  • 4ed9938ad2
    Rename test-trtllm.yml to 1-test-trtllm2.yml Pauline Bailly-Masson 2025-01-07 15:17:49 +0100
  • 64b42656f0
    Rename build-trtllm2 to 1-build-trtllm2 Pauline Bailly-Masson 2025-01-07 15:17:26 +0100
  • 0eeedbe367
    Create build-trtllm2 Pauline Bailly-Masson 2025-01-07 15:16:45 +0100
  • 38401a9deb
    Update test-trtllm.yml Pauline Bailly-Masson 2025-01-07 15:11:43 +0100
  • b65b9b5a45
    Create test-trtllm.yml Pauline Bailly-Masson 2025-01-07 14:57:03 +0100
  • ccafee0814 misc(ci): give the wf a name Morgan Funtowicz 2025-01-07 11:53:00 +0100
  • 5bc1245bd2 misc(ci): make the wf callable for reuse (bis) Morgan Funtowicz 2025-01-07 11:52:18 +0100
  • 40d299f0c7 misc(ci): make the wf callable for reuse (bis) Morgan Funtowicz 2025-01-07 11:51:38 +0100
  • 50de9eda85 misc(ci): make the wf callable for reuse Morgan Funtowicz 2025-01-07 11:48:42 +0100
  • 541618e44c misc(ci): use bin folder Morgan Funtowicz 2025-01-07 00:22:57 +0100
  • 99e226a2ae misc(ci): go Morgan Funtowicz 2025-01-06 23:46:59 +0100
  • 7c05f22b4a misc(ci): go Morgan Funtowicz 2025-01-06 23:22:55 +0100
  • bd0052e568 misc(ci): go Morgan Funtowicz 2025-01-06 22:59:19 +0100
  • d9f3a9f0a3 misc(ci): gogogo Morgan Funtowicz 2025-01-06 22:23:39 +0100
  • 1565c8c0f2 misc(ci): gogogo Morgan Funtowicz 2025-01-06 16:51:19 +0100
  • ccf18b33a2 misc(ci): gogogo Morgan Funtowicz 2025-01-06 16:24:10 +0100
  • 2a6d88d992 misc(ci): gogogo Morgan Funtowicz 2025-01-06 15:35:59 +0100
  • 0c972cdd28 misc(ci): gogogo Morgan Funtowicz 2025-01-06 15:23:11 +0100