Commit Graph

  • 6d4ac297a9 misc(backend): kthxbye retry s3 Morgan Funtowicz 2024-12-17 12:42:49 +0100
  • d0108b4d16 misc(backend): WWWWWWWWWWWWWAAAAAAAA Morgan Funtowicz 2024-12-17 12:29:33 +0100
  • b8d755e806 misc(backend): ok let's debug smtg Morgan Funtowicz 2024-12-17 12:14:46 +0100
  • 7f9b2232ca misc(backend): make sure to correctly set IS_GHA_BUILD=true in wf Morgan Funtowicz 2024-12-17 10:53:42 +0100
  • fd039b6aac misc(backend): missing env directive Morgan Funtowicz 2024-12-17 10:41:36 +0100
  • 71311bedae misc(backend): let's try with GHA Morgan Funtowicz 2024-12-17 10:40:51 +0100
  • 783a057ac0 misc(backend): once more? Morgan Funtowicz 2024-12-16 16:03:07 +0100
  • f1986c0bea misc(backend): test with TGI S3 conf Morgan Funtowicz 2024-12-16 15:51:03 +0100
  • 5d5524d680 misc(backend): test with TGI S3 conf Morgan Funtowicz 2024-12-16 15:47:51 +0100
  • 253116ef8e misc(ci): WAT Morgan Funtowicz 2024-12-13 09:46:13 +0100
  • ba738e23e1 misc(ci): WAT Morgan Funtowicz 2024-12-13 09:43:22 +0100
  • 425f0bf4e5 misc(ci): WAT Morgan Funtowicz 2024-12-12 16:42:38 +0100
  • 5fbab2754b misc(ci): WAT Morgan Funtowicz 2024-12-12 16:14:48 +0100
  • 88884f92c9 misc(ci): WAT Morgan Funtowicz 2024-12-12 16:08:01 +0100
  • 55c92d0234 misc(ci): do not build with ssl enabled Morgan Funtowicz 2024-12-12 15:57:11 +0100
  • b43fe7efb2 misc(ci): lets actually use sccache ... Morgan Funtowicz 2024-12-12 15:38:44 +0100
  • 2737416176 misc(ci): add debug profile Morgan Funtowicz 2024-12-12 14:52:12 +0100
  • f9395008b5 misc(ci): add debug profile Morgan Funtowicz 2024-12-12 14:34:30 +0100
  • dc34f5a70e misc(ci): again Morgan Funtowicz 2024-12-12 14:24:33 +0100
  • bdab3bbdb5 misc(ci): again Morgan Funtowicz 2024-12-12 13:11:54 +0100
  • ea7cf3aba8 misc(ci): let's try this way Morgan Funtowicz 2024-12-12 13:00:13 +0100
  • 0aa49a1674 misc(ci): export aws creds as output of step Morgan Funtowicz 2024-12-12 12:48:58 +0100
  • 3f8dc96caf misc(ci): provide mecanism to cache inside container Morgan Funtowicz 2024-12-12 12:45:14 +0100
  • 7db90f1c5f misc(ci): let's try to build the Dockerfile for trtllm Morgan Funtowicz 2024-12-12 11:50:39 +0100
  • 119a40ca11 misc(ci): update Rust action toolchain Morgan Funtowicz 2024-12-12 09:40:07 +0100
  • 0ab1dd8479 misc(ci): enabe building tensorrt-llm Morgan Funtowicz 2024-12-12 09:29:14 +0100
  • cb8fdde103 feat(trtllm): fix logits retrieval Morgan Funtowicz 2024-12-10 23:28:13 +0100
  • 0baa0173a3 feat(trtllm): expose finish reason to Rust Morgan Funtowicz 2024-12-10 16:51:22 +0100
  • f729f2c59b test(ctest) enable address sanitizer Morgan Funtowicz 2024-11-19 00:17:10 +0100
  • a2fe842795 Simplify with monkey patch Cyril Vallez 2025-01-20 11:52:58 +0100
  • 2659b5998b raise error if needed Cyril Vallez 2025-01-20 11:29:51 +0100
  • 447a5b2f87 Fixing TRTLLM dockerfile. (#2922) Nicolas Patry 2025-01-20 11:13:46 +0100
  • b2f82510f6 Unwanted files. Nicolas Patry 2025-01-17 18:50:37 +0100
  • b7c963d33c Modifying this should cache hit. Nicolas Patry 2025-01-17 18:49:09 +0100
  • c44511220d Revert "Modifying this should cache hit." Nicolas Patry 2025-01-17 18:49:49 +0100
  • 46a2bde108 Modifying this should cache hit. Nicolas Patry 2025-01-17 18:49:09 +0100
  • 176f7839a6 Removing the cache directive. Nicolas Patry 2025-01-17 18:33:09 +0100
  • c567676cb6 Creating a dummy modification to chekc CI runs. Nicolas Patry 2025-01-17 18:19:03 +0100
  • 630f198624 flashinfer: switch to plan API (#2904) Daniël de Kok 2025-01-17 18:18:02 +0100
  • 8f6146f11a Revert "feat: improve qwen2-vl startup " (#2924) drbh 2025-01-17 12:09:05 -0500
  • 009a95aee7 Revert "feat: improve qwen2-vl startup (#2802)" drbh 2025-01-17 12:06:45 -0500
  • f01014de37 fix compatibility version issue Cyril Vallez 2025-01-17 17:04:56 +0000
  • eecca27113 feat: improve qwen2-vl startup (#2802) drbh 2025-01-17 11:50:41 -0500
  • 42ae6dea02 Remove Warpers for Processor Cyril Vallez 2025-01-17 16:39:49 +0000
  • 17192c9a0e fix: remove test debug params enable-qwen2vl-video drbh 2025-01-17 16:19:02 +0000
  • b40c889360 small fix Cyril Vallez 2025-01-17 16:05:47 +0000
  • b03d7ae951 Update based on transformers PR Cyril Vallez 2025-01-17 15:34:08 +0000
  • 6e982f43a1 fix the crash of meta-llama/Llama-3.2-1B (#2918) Wang, Yi 2025-01-17 22:50:58 +0800
  • 2715792a2c Fixed. Nicolas Patry 2025-01-17 15:50:12 +0100
  • b4187d6022 Add tgi_batch_current_size and tgi_batch_current_size as response header response-header-metrics Corentin REGAL 2025-01-17 15:48:02 +0100
  • 3e4ca5032b Apply suggestions from code review Nicolas Patry 2025-01-17 15:43:34 +0100
  • c20025dbf7 Add fp8 kv cache for ROCm (#2856) Mohit Sharma 2025-01-17 18:43:29 +0530
  • ac62bd1572 improve type hints + required args Cyril Vallez 2025-01-17 13:09:52 +0000
  • 32488c1a11 remove flag Cyril Vallez 2025-01-17 12:26:51 +0000
  • 2672cae576 Fixing TRTLLM dockerfile. Nicolas Patry 2025-01-17 12:38:58 +0100
  • de19e7e844 Moving to uv instead of poetry. (#2919) Nicolas Patry 2025-01-17 12:32:00 +0100
  • 0ce2dff9f9 flashinfer: switch to plan API Daniël de Kok 2025-01-13 09:34:24 +0000
  • d61f14f271 nix: update to PyTorch 2.5.1 (#2921) Daniël de Kok 2025-01-17 12:12:11 +0100
  • 885144166f Flash decoding kernel adding and prefill-chunking and prefix caching enabling in intel cpu/xpu (#2815) Wang, Yi 2025-01-17 19:04:57 +0800
  • 5b10e5bccf remove bookkeeping field Mohit Sharma 2025-01-17 10:21:34 +0000
  • 8ffb5b3697 Merge branch 'main' into fp8_kvcache_rocm Mohit Sharma 2025-01-17 08:53:31 +0000
  • bde5f9ad82 nix: update to PyTorch 2.5.1 nix/pytorch-2.5.1 Daniël de Kok 2025-01-17 06:44:21 +0000
  • 82f6ea1b71 feat: improve star coder to support multi lora layers (#2883) drbh 2025-01-16 16:23:55 -0500
  • b6c738b8df fix: bump adapter test for added later names drbh 2025-01-16 19:47:01 +0000
  • ca39f26f27 fix: rerun pre commit lints drbh 2025-01-16 18:22:02 +0000
  • 37f92f2c04 fix: tweak param types drbh 2025-01-16 17:42:52 +0000
  • 6706d8d139 fix: bump snapshot for added tests drbh 2025-01-16 17:31:19 +0000
  • ab4693cfd8 Editable it is. Nicolas Patry 2025-01-16 18:22:35 +0100
  • 78cd756caf fix: improve video processing and update unsupported paths drbh 2025-01-16 17:20:27 +0000
  • 3ffd2ce813 Non editable? Nicolas Patry 2025-01-16 18:13:11 +0100
  • 93c4a046d6 Editable install necessary ? Nicolas Patry 2025-01-16 17:57:06 +0100
  • d279f5cd1c TRying to check that pb is imported correctly. Nicolas Patry 2025-01-16 17:46:56 +0100
  • b12ba27684 Trying to force-include this pb folder. Nicolas Patry 2025-01-16 17:26:03 +0100
  • d7c1cca44a --system is redundant. Nicolas Patry 2025-01-16 17:13:42 +0100
  • d611f0f5e2 feat: improve weight that support adapters and add tests for starcoder with lora drbh 2025-01-13 21:53:11 +0000
  • 31778a6508 feat: improve star coder to support multi lora layers drbh 2025-01-07 00:21:58 +0000
  • bd59f96135 feat: update to run qwen2-vl tests drbh 2025-01-14 22:15:04 +0000
  • 822bd045e5 fix: address image resize and rebase changes drbh 2024-12-09 21:25:13 +0000
  • 45e5c2c266 feat: adjust rotary embed and avoid cuda graphs of size 2 and smaller drbh 2024-12-06 00:54:20 +0000
  • 1bcfba305b feat: tokenize each request individually and increase warmup image size drbh 2024-12-05 10:58:37 -0500
  • 3790dde61b Monkeying this... Nicolas Patry 2025-01-16 16:14:23 +0100
  • b67ca525db docker install on system Nicolas Patry 2025-01-16 15:19:19 +0100
  • 418b058d34 Add the cli entry point. Nicolas Patry 2025-01-16 15:17:08 +0100
  • 58a277da49 Install system ? Nicolas Patry 2025-01-16 15:01:15 +0100
  • df26ff1f13 Fixing the test by activating the environment ? Nicolas Patry 2025-01-16 14:16:41 +0100
  • 4982b9c765 Fix ? Nicolas Patry 2025-01-16 14:11:41 +0100
  • 1987291593 Create the venv. Nicolas Patry 2025-01-16 14:02:48 +0100
  • 73ab6a5da1 Creating venv if not created. Nicolas Patry 2025-01-16 13:52:28 +0100
  • 9177fbfda6 Moving to uv instead of poetry. Nicolas Patry 2025-01-16 13:43:16 +0100
  • 5f78ec32a5 Do not convert weight scale to e4m3fnuz on CUDA (#2917) Daniël de Kok 2025-01-16 13:44:32 +0100
  • 0a48e5624c fix the crash of meta-llama/Llama-3.2-1B Wang, Yi A 2025-01-16 04:23:16 -0800
  • f951a8b48d Do not convert weight scale to e4m3fnuz on CUDA Daniël de Kok 2025-01-16 11:06:11 +0000
  • 9935a9ff1f Merge TRTLLM in standard CI Hugo Larcher 2025-01-16 00:59:28 +0100
  • 922cc38fbc Upgrading bitsandbytes. (#2910) Nicolas Patry 2025-01-15 20:07:21 +0100
  • 120bd3e3bb Removing the github runner. (#2912) Nicolas Patry 2025-01-15 19:20:44 +0100
  • 266377b328 simplify check Cyril Vallez 2025-01-15 18:05:55 +0000
  • ef23cfd013 Removing the github runner. Nicolas Patry 2025-01-15 18:54:53 +0100
  • ca6e3d981b Tighter lock. Nicolas Patry 2025-01-15 18:49:20 +0100
  • 1470aec9d9 Fix typo in TPU docs (#2911) Baptiste Colle 2025-01-15 18:32:07 +0100
  • f4434dee1d docs(tpu): fix typo Baptiste Colle 2025-01-15 17:44:11 +0100