Commit Graph

  • 5a82778e0f Add reference to TPU support (#1760) Brandon Royal 2024-04-30 05:39:52 -0400
  • 6ca39843b4 Small CI cleanup. (#1801) Nicolas Patry 2024-04-30 11:39:38 +0200
  • f66b8141c6 Add the missing tool_prompt parameter to Python client (#1825) Maziyar Panahi 2024-04-30 11:07:17 +0200
  • 56c65fad28 Prepare release. Nicolas Patry 2024-04-30 10:52:37 +0200
  • 3f3a1a6a66 Better graceful shutdown. (#1827) Nicolas Patry 2024-04-29 17:23:40 +0200
  • 88dc44994f Changing the waiting_served_ratio default (stack more aggressively by default). (#1820) Nicolas Patry 2024-04-28 17:54:19 +0200
  • 7641cda775 Dummy CI run. (#1817) Nicolas Patry 2024-04-26 19:19:55 +0200
  • a788888619 Fixing qwen2. (#1818) Nicolas Patry 2024-04-26 19:19:08 +0200
  • 388af49916 Blunder (#1815) Nicolas Patry 2024-04-26 15:51:09 +0200
  • 62a83fd800 add intel xpu support for TGI (#1475) Wang, Yi 2024-04-26 21:48:58 +0800
  • 432961324e Adding new env variables for TPU backends. (#1755) Nicolas Patry 2024-04-26 15:44:44 +0200
  • 85dfc39222 Add Phi-3 medium support (#2039) Daniël de Kok 2024-06-10 09:22:29 +0200
  • 9b3674d903 ROCm and sliding windows fixes (#2033) fxmarty 2024-06-10 09:09:50 +0200
  • 326038d0ef 2nd round of benchmark modifications (tiny adjustements to avoid overloading the host). (#1816) Nicolas Patry 2024-04-26 15:39:00 +0200
  • 2ed6242816 Use the generation config. (#1808) Nicolas Patry 2024-04-25 19:41:50 +0200
  • ddef3c98d3 Update guidance docs to reflect grammar support in API (#1775) dr3s 2024-04-25 13:11:26 -0400
  • 3bed10199d Updating the benchmarks so everyone uses openai compat layer. (#1800) Nicolas Patry 2024-04-25 15:42:17 +0200
  • ab59a5e346 feat: improve temperature logic in chat (#1749) drbh 2024-04-25 09:31:35 -0400
  • 5d36a5e368 Adding support for HF_HUB_OFFLINE support in the router. (#1789) Nicolas Patry 2024-04-23 23:38:30 +0200
  • ec85883703 fix: avoid frequency and repetition penalty on padding tokens (#1765) drbh 2024-04-23 17:19:16 -0400
  • 57b31f410d Idefics2. (#1756) Nicolas Patry 2024-04-23 23:04:44 +0200
  • 4f8ca6049e Phi3 support (#1797) Nicolas Patry 2024-04-23 18:40:05 +0200
  • 428d0e618a feat: allow null eos and bos tokens in config (#1791) drbh 2024-04-23 10:26:54 -0400
  • e36c4d22f3 Add attribute descriptions for GenerateParameters (#1798) Lucain 2024-04-23 16:22:12 +0200
  • 147584e35b fix typos in docs and add small clarifications (#1790) Moritz Laurer 2024-04-22 18:15:48 +0200
  • 5b162c7026 Make --cuda-graphs work as expected (bis) (#1768) fxmarty 2024-04-22 16:09:19 +0200
  • 41699e9bbf . ci_amd2 Nicolas Patry 2024-06-08 22:16:37 +0200
  • eec6c3241b . Nicolas Patry 2024-06-08 21:55:27 +0200
  • 0ced5fac2d Fix. Nicolas Patry 2024-06-08 08:58:05 +0200
  • 452d442ef2 We need tailscale. Nicolas Patry 2024-06-08 08:46:55 +0200
  • e62c51d140 Here we go again. Nicolas Patry 2024-06-08 08:41:40 +0200
  • 8be9c197e5 Is this it ? Nicolas Patry 2024-06-08 07:54:00 +0200
  • d9f704a1b3 Are we done ? Nicolas Patry 2024-06-08 07:53:21 +0200
  • 909e6569d1 . Nicolas Patry 2024-06-08 07:40:08 +0200
  • fa3e811672 No fromJSON. Nicolas Patry 2024-06-07 23:22:48 +0200
  • 98d383062a Extra spaces? Nicolas Patry 2024-06-07 23:15:58 +0200
  • 66e59831f2 . Nicolas Patry 2024-06-07 23:00:27 +0200
  • 741ab87fba fromJSON Nicolas Patry 2024-06-07 22:58:28 +0200
  • fc4404d9d2 . Nicolas Patry 2024-06-07 22:45:57 +0200
  • 65b2efc585 . Nicolas Patry 2024-06-07 22:38:06 +0200
  • eda299b84f . Nicolas Patry 2024-06-07 20:18:57 +0200
  • e79c83d7ba Attempt #727. Nicolas Patry 2024-06-07 20:11:17 +0200
  • c6fa9547a2 Test. Nicolas Patry 2024-06-07 19:58:56 +0200
  • 75ec1a2661 Add support for Phi-3-medium Daniël de Kok 2024-06-07 19:35:31 +0200
  • a045ead6eb . Nicolas Patry 2024-06-07 19:52:14 +0200
  • 5e769ce1e0 ? Nicolas Patry 2024-06-07 19:46:34 +0200
  • 387a767dca feat(ci): add trufflehog secrets detection Luc Georges 2024-06-07 17:52:17 +0200
  • 979b670366 is it flaky? fxmarty 2024-06-07 15:45:08 +0000
  • 87df3d5603 ? Nicolas Patry 2024-06-07 17:12:17 +0200
  • 19f6327bd2 esac. Great idea dev of the past. Nicolas Patry 2024-06-07 16:14:24 +0200
  • 2a314fa0dd Bash in bash. Nicolas Patry 2024-06-07 16:09:38 +0200
  • b10ba9205c ... Nicolas Patry 2024-06-07 16:05:11 +0200
  • 1f4248944c Come on GH, dash, underscore, who cares at this point. Nicolas Patry 2024-06-07 16:03:05 +0200
  • cc7c2fd90e runs on. Nicolas Patry 2024-06-07 16:01:59 +0200
  • 1e759f9da6 Wat? Nicolas Patry 2024-06-07 16:00:40 +0200
  • 078fb55109 Abbé Faria? Nicolas Patry 2024-06-07 15:58:23 +0200
  • 8205962950 Ahah, I see an exit. Nicolas Patry 2024-06-07 15:56:52 +0200
  • 043de74dcd **Feigns death** Nicolas Patry 2024-06-07 15:52:35 +0200
  • 81ddb9d173 Please let me out ! Nicolas Patry 2024-06-07 15:49:31 +0200
  • aea77a8ab3 Banana. Nicolas Patry 2024-06-07 15:44:51 +0200
  • e6a4dbe7f5 I'm an certainly not a monkey. Nicolas Patry 2024-06-07 15:43:58 +0200
  • a759e2e7c5 Not hitting myself against the wall. Nicolas Patry 2024-06-07 15:39:37 +0200
  • 8712a367dc Flying blind feels nice. Nicolas Patry 2024-06-07 15:36:13 +0200
  • 6f3117512c Give us sanitation tools already. Nicolas Patry 2024-06-07 15:25:43 +0200
  • 54e3340663 gh.. Nicolas Patry 2024-06-07 15:09:27 +0200
  • 11c75f3a14 I hate this. Nicolas Patry 2024-06-07 15:07:51 +0200
  • 3a8e9c221e Rename for everyone. Nicolas Patry 2024-06-07 15:03:01 +0200
  • f29371e587 Naming. Nicolas Patry 2024-06-07 14:49:48 +0200
  • b8ac9ba752 precise comment fxmarty 2024-06-07 12:43:33 +0000
  • fb5487d00c dead code fxmarty 2024-06-07 12:38:56 +0000
  • b884d2b9b8 address review fxmarty 2024-06-07 12:36:32 +0000
  • 3ee92eb614 ? Nicolas Patry 2024-06-07 14:15:45 +0200
  • 3684439a0e Trying new split of tasks. Nicolas Patry 2024-06-07 12:03:22 +0200
  • 4220423d76 fix sliding window fxmarty 2024-06-07 08:55:01 +0000
  • 9101b2ae4f Fix. Nicolas Patry 2024-06-07 10:05:51 +0200
  • c73355b99c Merge branch 'main' into ci_amd2 Nicolas Patry 2024-06-07 10:04:59 +0200
  • c8128c794d Let's iterate a bit faster. Nicolas Patry 2024-06-07 09:50:43 +0200
  • 97af55b7ef Inject slugs Nicolas Patry 2024-06-07 09:10:38 +0200
  • bf3c813782 server: use chunked inputs Daniël de Kok 2024-05-31 11:51:42 +0000
  • 724fa6fe0e AMD CI. Nicolas Patry 2024-06-07 06:40:04 +0200
  • 9376648c4f Checkout. Nicolas Patry 2024-06-07 06:38:13 +0200
  • 633e2bd4d8 server: use chunked inputs Daniël de Kok 2024-05-31 11:51:42 +0000
  • fa05db296a Fix integration-tests config for docker runt . Nicolas Patry 2024-06-06 19:44:22 +0200
  • 4dabddb7ea Xpu gqa (#2013) Wang, Yi 2024-06-07 01:12:57 +0800
  • 81704d28e8 Putting the fix for vllm for CIt Nicolas Patry 2024-06-06 19:11:51 +0200
  • 512ed5ca4c Enabling CI for AMD with new runner.. Nicolas Patry 2024-06-06 19:07:48 +0200
  • 9765658212 Revert "Enabling CI for AMD with new runner.." Nicolas Patry 2024-06-06 19:08:16 +0200
  • 101ac9a760 Enabling CI for AMD with new runner.. Nicolas Patry 2024-06-06 19:07:48 +0200
  • ed1cfde0d8 Internal runner ? (#2023) Nicolas Patry 2024-06-06 18:51:42 +0200
  • b3309abb87 Merge pull request #154 from kdamaszk/rebase-tgi-2-0-1 regisss 2024-06-06 17:29:02 +0200
  • 51621439a4 marlin: improve build Daniël de Kok 2024-06-06 11:25:56 +0000
  • 0d96468ebb marlin: support tp>1 when group_size==-1 Daniël de Kok 2024-06-06 11:51:52 +0000
  • 0d9b2f2541 enable tunableop by default fxmarty 2024-06-06 14:02:50 +0000
  • c36c7ec83b fix bug where tunableop is bound to cuda graph even when cuda graph are disabled fxmarty 2024-06-06 13:53:43 +0000
  • 35d1946e67 update commit fxmarty 2024-06-06 12:38:43 +0000
  • e9b9a96ba8 update fxmarty 2024-06-06 12:35:18 +0000
  • ff7ebf4aca Revert "Using CPU to build the images (caveat: Waiting on all 3 builds before" Nicolas Patry 2024-06-06 14:34:50 +0200
  • ab7578b9c0 Using CPU to build the images (caveat: Waiting on all 3 builds before integration tests). Nicolas Patry 2024-06-06 14:23:05 +0200
  • 5d16af6d35 Fixing the arg dependency on AMD/Intel. Nicolas Patry 2024-06-06 14:08:06 +0200
  • b9eb57f9cd marlin: support tp>1 when group_size==-1 Daniël de Kok 2024-06-06 11:51:52 +0000