Commit Graph

  • 67c0ff67ee
    Without pytest. Nicolas Patry 2024-10-02 17:42:07 +0200
  • bf4f08e06d
    No device requests Nicolas Patry 2024-10-02 17:38:01 +0200
  • 27d64fe7e1
    Device request, no container name Nicolas Patry 2024-10-02 17:36:35 +0200
  • d22b0c1fbe
    Unroll notify error into generate response (#2597) drbh 2024-10-02 11:34:57 -0400
  • cf92174925
    Tmp2 Nicolas Patry 2024-10-02 17:32:24 +0200
  • 356079e55d
    Tmp Nicolas Patry 2024-10-02 17:30:17 +0200
  • ddd44ce8c5
    At some point. Nicolas Patry 2024-10-02 17:27:54 +0200
  • d5fca1edc6
    Removing all variables ? Nicolas Patry 2024-10-02 17:25:11 +0200
  • df87a771d3
    No device. Nicolas Patry 2024-10-02 17:20:45 +0200
  • 525f3a61db
    Busybox in python Nicolas Patry 2024-10-02 17:17:40 +0200
  • 130b481b31
    Starting hello Nicolas Patry 2024-10-02 17:16:16 +0200
  • 8e4ba60e6b
    Launcher tgi Nicolas Patry 2024-10-02 17:09:45 +0200
  • 4e120c064b
    busybox Nicolas Patry 2024-10-02 17:07:15 +0200
  • 92189e821d
    Tmp Nicolas Patry 2024-10-02 17:04:12 +0200
  • 2335459556
    CI (2592): Allow LoRA adapter revision in server launcher (#2602) drbh 2024-10-02 10:51:04 -0400
  • 0204946d26
    Max token capacity metric (#2595) Nicolas Patry 2024-10-02 16:32:36 +0200
  • 8dd2006cd8
    Sleep ? Nicolas Patry 2024-10-02 16:29:42 +0200
  • 38c39b00e2
    Remove some stuff. Nicolas Patry 2024-10-02 16:07:33 +0200
  • 2c12e937ce
    Trying a few things. Nicolas Patry 2024-10-02 15:49:45 +0200
  • 151a2a8104
    fix: run formatter David Holtz 2024-10-02 13:15:06 +0000
  • fdfef067ad
    fix: adjust linting in test file David Holtz 2024-10-02 12:42:25 +0000
  • 50d239ba8f
    revert pytorch Mohit Sharma 2024-10-02 12:34:56 +0000
  • 6466be8365
    fix: improve docs and indicate change in expected response David Holtz 2024-10-02 12:07:25 +0000
  • d30266dc3c
    fix: add created_at field to results Hugo Larcher 2024-10-02 13:55:01 +0200
  • fe8a373831
    Enhancements to README (#226) Mohit Deopujari 2024-10-02 03:22:33 -0700
  • 337e2045bf
    lint Wauplin 2024-10-02 12:14:39 +0200
  • f9abe4cd40
    Remove tmate. Nicolas Patry 2024-10-02 12:05:00 +0200
  • aae6db9cd0
    Update ToolType input schema Wauplin 2024-10-02 12:01:10 +0200
  • ab7d83b4c1
    Good old tmate. Nicolas Patry 2024-10-02 11:42:27 +0200
  • d18ed5cfc5
    Mllama flash version (#2585) Nicolas Patry 2024-10-02 11:22:13 +0200
  • e164177ff6
    Upgrade to 0.5.0 Nicolas Patry 2024-10-02 10:55:20 +0200
  • 0437f8881a
    Move to hf tgi-nix Nicolas Patry 2024-10-02 10:48:02 +0200
  • 4e45e7d585
    Othername to avoid recursive directories. Nicolas Patry 2024-10-02 10:38:31 +0200
  • e64a07a2d0
    mnt2 ? Nicolas Patry 2024-10-02 10:37:56 +0200
  • 1ee36dc7f3
    Upgrade the flake to latest transformers/tokenizers Nicolas Patry 2024-10-02 10:29:50 +0200
  • 9e658fba92
    Removed dead code. Nicolas Patry 2024-10-01 15:37:01 +0200
  • 29813e2bd0
    Remove dead code. Nicolas Patry 2024-10-01 15:32:44 +0200
  • f58195d1bc
    Update integration test after switch to bf16. Nicolas Patry 2024-10-01 11:10:44 +0200
  • d735e46ef5
    Default dtype bfloat16. Nicolas Patry 2024-10-01 10:52:19 +0200
  • 7ede61bca6
    Force ignore all images but last. Nicolas Patry 2024-10-01 10:18:41 +0200
  • 265715a4f7
    remove log. Nicolas Patry 2024-09-30 15:53:11 +0200
  • e5476dc04c
    Fix vlm ? Nicolas Patry 2024-09-30 15:45:02 +0200
  • d9fecec000
    Earlier assert. Nicolas Patry 2024-09-30 15:04:37 +0200
  • 933060cc3f
    Updating model link. Nicolas Patry 2024-09-30 11:53:21 +0200
  • af677caf4f
    Integration tests for mllama (cutting to 10 tokens because there seems to be instability after that, meaning the size of the batch matters). Nicolas Patry 2024-09-28 22:41:07 +0200
  • 2ac607a215
    Working state. Nicolas Patry 2024-09-28 22:10:10 +0200
  • ef4fa3ea7c
    Starting to get there. Nicolas Patry 2024-09-28 00:38:23 +0200
  • 85771989d6
    Flashing mllama. Nicolas Patry 2024-09-27 17:17:14 +0200
  • 2441142c8b
    Upgrade transformers 4.45 Nicolas Patry 2024-09-25 20:46:44 +0200
  • fc26734472
    Mllama Nicolas Patry 2024-09-25 20:41:40 +0200
  • 1014a06866
    Updating config, removing TODO Nicolas Patry 2024-09-25 10:48:03 +0200
  • fa02c037d7
    Fix idefics. Nicolas Patry 2024-09-23 16:13:22 +0200
  • e55067a475
    Cleaner condition. Nicolas Patry 2024-09-23 15:50:43 +0200
  • 2ab02dc734
    Working state ? (Broke idefics1 temporarily). Nicolas Patry 2024-09-23 15:25:26 +0200
  • a9c278aea1
    Preprocessing. Nicolas Patry 2024-09-18 17:59:13 +0200
  • 79b9df8434
    Working loading state. Nicolas Patry 2024-09-18 17:01:36 +0200
  • 584b4d7a68
    nix: experimental support for building a Docker container (#2470) Daniël de Kok 2024-10-01 18:02:06 +0200
  • 2980720af4
    fix: Update runners group Hugo Larcher 2024-09-30 18:00:54 +0200
  • 04b409e7ae
    fix: improve test to avoid notify_error David Holtz 2024-10-01 15:30:59 +0000
  • 0a56344ac0
    More quotes. Nicolas Patry 2024-10-01 17:16:43 +0200
  • 5596981584
    Other dockerfile. Nicolas Patry 2024-09-30 09:53:16 +0200
  • 7cb6abdf2f
    Other dockerfile. nix/docker2 Nicolas Patry 2024-09-30 09:53:16 +0200
  • 41e674eea9
    Different approach, only listen on stdin when LOG_LEVEL=debug (which is where dropping to a debugger is important). Nicolas Patry 2024-10-01 17:02:33 +0200
  • 69e3bddf8c
    Remove tailscale. Nicolas Patry 2024-10-01 16:37:44 +0200
  • cabb654fbb
    Add description for the metrics Nicolas Patry 2024-10-01 16:36:43 +0200
  • 6e7189f2aa
    Adding max capacity metric. Nicolas Patry 2024-10-01 16:34:13 +0200
  • 3b0b66bedb
    added tgi to name of metric Edwinhr716 2024-07-22 20:38:07 +0000
  • 3e76ba6599
    adding max_token_capacity_metric Edwinhr716 2024-07-22 18:21:34 +0000
  • 9d08d3494d
    bash spaces. Nicolas Patry 2024-10-01 16:00:03 +0200
  • f2b1429b9b
    ? Nicolas Patry 2024-10-01 15:57:20 +0200
  • 290bd2ec4c
    Trying the tailscale action quickly. Nicolas Patry 2024-10-01 15:54:02 +0200
  • a97931f3d8
    Only run 1 valid test. ci_amd Nicolas Patry 2024-10-01 15:30:55 +0200
  • 91656ff7a1
    Fix group. ci_amd4 Nicolas Patry 2024-10-01 15:26:22 +0200
  • eba27cff0f
    Attempt #4 Nicolas Patry 2024-10-01 15:25:10 +0200
  • 098adea9c4
    Merge branch 'huggingface:main' into fix-revision teamclouday 2024-09-30 21:58:21 -0400
  • 98dcc9beac
    allow revision for lora adapters from launcher Sida 2024-09-30 21:57:07 -0400
  • 1c84a30fe6
    MoE Marlin: support desc_act for groupsize != -1 (#2590) Daniël de Kok 2024-09-30 19:40:25 +0200
  • fc7dcb0ba6
    feat: Add automatic nightly benchmarks Hugo Larcher 2024-09-30 17:53:05 +0200
  • 90968ce640
    MoE Marlin: support desc_act for groupsize != -1 Daniël de Kok 2024-09-30 15:42:23 +0000
  • 88121b3e4e
    fix: expect simple message when no tool is selected drbh 2024-09-30 15:38:49 +0200
  • dfc4559017
    feat: unroll notify_error if no tool is chosen drbh 2024-09-30 15:34:43 +0200
  • f7728565b1
    (feat) fp8 fnuz support for rocm Mohit Sharma 2024-09-30 12:56:00 +0000
  • 8cc2febdb6
    (fix) quantize=fp8 fp8_rocm Mohit Sharma 2024-09-30 12:07:38 +0000
  • 8ee9823d3b
    (feat) fp8 fnuz support for rocm Mohit Sharma 2024-09-30 11:43:45 +0000
  • d1f257ac56
    Move flake back to tgi-nix main (#2586) Daniël de Kok 2024-09-30 11:39:41 +0200
  • ae929afbdf
    Move flake back to tgi-nix main Daniël de Kok 2024-09-30 09:27:24 +0000
  • 93a7042d7e
    feat: support phi3.5 moe (#2479) drbh 2024-09-30 11:15:09 +0200
  • 90a1d04a2f
    Add support for GPTQ-quantized MoE models using MoE Marlin (#2557) Daniël de Kok 2024-09-30 11:14:32 +0200
  • f9e561eced
    Update ROCM libs and improvements (#2579) Mohit Sharma 2024-09-30 14:24:32 +0530
  • 016cf4ec1d
    Ruff. Nicolas Patry 2024-09-25 14:44:20 +0200
  • e3e483c901
    Add the usual model tests Daniël de Kok 2024-09-25 10:27:29 +0000
  • 245d6d8f7c
    Some type annotations Daniël de Kok 2024-09-25 09:31:19 +0000
  • 0cafbf3b54
    Add support for dense MoE Daniël de Kok 2024-09-25 08:47:52 +0000
  • c07c80fac9
    Use SparseMoELayer Daniël de Kok 2024-09-19 12:28:29 +0000
  • f4cadd7527
    Vendor configuration so that we don't have to trust_remote_code Daniël de Kok 2024-09-18 15:28:35 +0000
  • c2c3e72eb1
    fix: revert greedy by default and test changes drbh 2024-09-11 13:19:17 +0000
  • dad070b1fc
    fix: consolidate long rope paths drbh 2024-09-06 14:17:56 +0000
  • b1026a84cb
    fix: small typo adjustments drbh 2024-09-03 17:36:47 +0000
  • b4e7601fbe
    fix: prefer do_sample false unless temp is set by user, and update chat tests drbh 2024-09-03 17:32:11 +0000
  • d3565552af
    fix: rerun lint for openapi docs drbh 2024-09-02 19:40:50 +0000