Commit Graph

  • 1ce58375b9 use relaxed OlivierDehaene 2023-10-20 14:33:19 +0200
  • a8e6a49946 add logs OlivierDehaene 2023-10-20 12:55:50 +0200
  • df419a21e0 use relaxed OlivierDehaene 2023-10-20 12:41:21 +0200
  • 3756a5f1e2 fix: batching OlivierDehaene 2023-10-20 11:53:23 +0200
  • 12590fdcce feat: paged attention v2 (#1183) OlivierDehaene 2023-10-23 12:29:25 +0200
  • 63fa534612 Fix link to quantization page in preparing_model.md (#1187) Aastha Varma 2023-10-23 15:42:21 +0530
  • 5ce75df041 Fix link to quantization page in preparing_model.md Aastha Varma 2023-10-23 10:01:53 +0530
  • 8d5e977332 update flash attn OlivierDehaene 2023-10-20 10:55:43 +0200
  • 5e28f44a83 #1049 CI (#1178) OlivierDehaene 2023-10-20 10:28:45 +0200
  • 61ad935110 feat: paged attention v2 OlivierDehaene 2023-10-20 10:27:52 +0200
  • 47e0620ab6 fmt OlivierDehaene 2023-10-20 09:47:53 +0200
  • 72b8f88be8 fix: remove useless token (#1179) Remy 2023-10-19 14:04:44 +0200
  • 1a8c8e8f0a fix: remove useless token Remy 2023-10-19 13:51:57 +0200
  • 648ea06430 fix: EETQLinear with bias in layers.py (#1176) star 2023-10-19 18:15:05 +0800
  • 9179605e1e Fix: Replace view() with reshape() in neox_modeling.py to resolve RuntimeError (#1155) Mario928 2023-10-19 15:24:26 +0530
  • d20576ae0c pass max_total_tokens info through warmup, python could get max_total… (#1049) Wang, Yi 2023-10-19 17:35:31 +0800
  • cc0ec3c38a Update router/src/validation.rs Chirag Jain 2023-10-19 15:02:40 +0530
  • 7402a355dc Fix calling cuda() on load_in_8bit (#1153) momonga 2023-10-19 17:42:03 +0900
  • 4d0f5c5de6 fix: EETQLinear with bias in layers.py zhaosida 2023-10-19 11:20:11 +0800
  • d32007b8c2 Update test_types.py Piotr Mlocek 2023-10-18 12:58:17 -0700
  • f0d6619d80 Allow top_p == 1.0 Piotr Mlocek 2023-10-18 12:57:14 -0700
  • 7d793a5b61 build in docker Florian Zimmermeister 2023-10-17 22:33:06 +0200
  • cbb466b4f3 draft exllamav2, first commit Florian Zimmermeister 2023-10-17 22:14:36 +0200
  • 681253e9a5 Fix: Replace view() with reshape() in neox_modeling.py to resolve RuntimeError Mario928 2023-10-16 11:16:45 +0530
  • ac531c8d0a fix:error use loadin8bit model.cuda mmnga 2023-10-16 11:03:32 +0900
  • 3af1a11401 Fix link in preparing_model.md (#1140) Mishig 2023-10-13 09:48:35 +0200
  • 79b5a4068b comment out prints Bill Nell 2023-10-12 14:53:28 -0400
  • bf7becdce3 Fix link in preparing_model.md Mishig 2023-10-12 17:54:23 +0200
  • 5e02f40a83 hacks to support native continuous batching Bill Nell 2023-10-12 11:44:50 -0400
  • a110e2807d (UPDATE): remove internal references A.J 2023-10-11 17:29:25 +0200
  • 1455830909 (ADD): client benchmarking utility & omegaconf antonio 2023-10-11 15:09:37 +0000
  • 20ee71dcf5 fix: force one of max_new_tokens or truncate with slow tokenizer OlivierDehaene 2023-10-11 10:46:40 +0200
  • 8a7771a33c Use cuda devel instead feat/cuda_12 OlivierDehaene 2023-10-11 10:40:06 +0200
  • 57795685d1 feat: support cuda 12.1 OlivierDehaene 2023-10-10 15:23:52 +0200
  • d0463ce151 add NPU support zhangsibo1129 2023-10-09 15:56:29 +0800
  • dd304cf14c Remove some content from the README in favour of the documentation (#958) Omar Sanseviero 2023-10-09 11:59:06 +0200
  • 3a05bac225 add torch ccl to support TP for bfloat16, gloo does not support bfloat16 Wang, Yi A 2023-10-07 00:13:46 -0700
  • a1c9cc422a wip Bill Nell 2023-10-06 22:00:30 -0400
  • 9b22eb389f Update bitsandbytes Nikola Borisov 2023-10-06 17:15:23 -0700
  • 11e1e059db Revert "update rust tokenizer" Nikola Borisov 2023-09-28 18:02:31 -0700
  • 890fc9a884 update rust tokenizer Nikola Borisov 2023-09-28 17:57:33 -0700
  • 12684c902a Updating transformers to get new models Nikola Borisov 2023-09-28 17:38:44 -0700
  • 2d37ea0dfd Cargo.lock file should be there Nikola Borisov 2023-09-07 17:38:51 -0700
  • 017f3de464 Add ignore_eos_token to HTTP interface Chirag Jain 2023-10-06 18:22:12 +0530
  • 00b8f36fba Prepare for v1.1.1 (#1100) Nicolas Patry 2023-10-05 16:09:49 +0200
  • 56de96abe9 missing arg feat/attention_sinks OlivierDehaene 2023-10-05 15:14:17 +0200
  • 29341dcc6d update launcher doc OlivierDehaene 2023-10-05 14:44:52 +0200
  • cc36128cda feat: support attention sinks OlivierDehaene 2023-10-05 14:40:35 +0200
  • e9cdf6225f Hotfixing idefics base64 parsing. (#1103) Nicolas Patry 2023-10-05 13:35:26 +0200
  • 4930a9d8b7 Hotfixing idefics base64 parsing. Nicolas Patry 2023-10-05 11:04:42 +0000
  • 3c373dcc53 Adding yarn support. (#1099) Nicolas Patry 2023-10-05 10:11:50 +0200
  • 87f43814e3 Fixing GPTQ exllama kernel usage. (#1101) Nicolas Patry 2023-10-05 10:11:27 +0200
  • 0e4ee4f107 fix: type hint typo in tokens.py (#1102) Martin Vejvar 2023-10-05 16:33:04 +0900
  • a27de21afa fix: tokens.py Martin Vejvar 2023-10-05 15:04:24 +0900
  • 2d4ae09074 Fixing GPTQ exllama kernel usage. Nicolas Patry 2023-10-04 15:50:56 +0000
  • 6df43da0a4 Modify the default for max_new_tokens. (#1097) Nicolas Patry 2023-10-04 17:38:42 +0200
  • 664c3d8ba2 Prepare for v1.1.1 Nicolas Patry 2023-10-04 15:37:17 +0000
  • 66ce2fa7c1 Receive base64 encoded images for idefics. (#1096) Nicolas Patry 2023-10-04 17:35:29 +0200
  • 04323d27dd Adding yarn support. Nicolas Patry 2023-10-04 15:17:02 +0000
  • d19030d330 Update openapi.json Nicolas Patry 2023-10-04 14:36:11 +0000
  • c432471546 Fmt Nicolas Patry 2023-10-04 16:25:46 +0200
  • b7a81ae6d4 Update router/src/lib.rs Nicolas Patry 2023-10-04 16:14:27 +0200
  • c05cabc730 Modify the default for max_new_tokens. Nicolas Patry 2023-10-04 16:03:01 +0200
  • 1710777ff9 Receive base64 encoded images. Nicolas Patry 2023-10-04 15:40:30 +0200
  • 8ec1b87f16 Adding titles to CLI doc. (#1094) Nicolas Patry 2023-10-04 12:57:21 +0200
  • 2469deedcc Normalize a bit. Nicolas Patry 2023-10-04 12:39:39 +0200
  • fcf0f890d2 Fix titles ? Nicolas Patry 2023-10-04 12:30:17 +0200
  • 6b5537be86 Updated doc Nicolas Patry 2023-10-04 12:25:14 +0200
  • a82f90173b Adding titles to CLI doc. Nicolas Patry 2023-10-04 12:20:57 +0200
  • b4f68c3cf4 fixed command line arguments in docs (#1092) Fluder-Paradyne 2023-10-03 15:55:45 +0530
  • 1bebb9e76b Update idefics_image_processing.py (#1091) Nicolas Patry 2023-10-03 12:25:06 +0200
  • ea941a4fcf remove -- in docs Fluder-Paradyne 2023-10-03 10:03:06 +0000
  • a31359d052 Update idefics_image_processing.py Nicolas Patry 2023-10-03 11:55:56 +0200
  • 85acb11ba0 Handling bloom prefix. (#1090) Nicolas Patry 2023-10-03 11:55:10 +0200
  • 702d269729 [Doc page] Fix launcher page highlighting (#1080) Mishig 2023-10-03 11:11:10 +0200
  • f092404830 Handling bloom prefix. Nicolas Patry 2023-10-03 09:08:41 +0000
  • b8fefa6b55 raise exception on invalid images (#999) Leo Tronchon 2023-10-03 10:26:10 +0200
  • d569a521a8 Update server/text_generation_server/models/custom_modeling/idefics_image_processing.py Nicolas Patry 2023-10-03 10:24:19 +0200
  • bd998d8797 Fix window_size_left for flash attention v1 (#1089) Peter Lowrance 2023-10-02 14:53:14 -0400
  • 891b18a0c3 Fix window_size_left for flash attention v1 Peter Lowrance 2023-10-02 11:46:04 -0400
  • 5e68cf1260 Update README.md Omar Sanseviero 2023-09-29 12:20:26 +0200
  • 195008d621 Merge branch 'main' into remove_readme Omar Sanseviero 2023-09-29 12:19:17 +0200
  • 5ba53d44a1 Fixing eetq dockerfile. (#1081) Nicolas Patry 2023-09-29 11:19:06 +0200
  • 24a8785d3d Fix. Nicolas Patry 2023-09-29 07:52:58 +0000
  • c8e049263b Fix dockerfile Nicolas Patry 2023-09-29 07:10:38 +0000
  • 59d77e5ea8 Fixing eetq dockerfile. Nicolas Patry 2023-09-29 06:47:35 +0000
  • 29e1576af2 [Doc page] Fix launcher page highlighting Mishig Davaadorj 2023-09-28 22:40:11 +0200
  • 00e0d2d7b4 Merge branch 'main' into remove_readme_fix_conflicts Pedro Cuenca 2023-09-28 18:49:05 +0200
  • 724199aaf1 Update launcher.md to wrap code blocks (#1076) Mishig 2023-09-28 17:30:36 +0200
  • 270dd328d5 Update launcher.md to wrap code blocks Mishig 2023-09-28 16:47:32 +0200
  • a7808ff853 Fix launcher.md (#1075) Mishig 2023-09-28 15:37:50 +0200
  • 3835d278dd fix doc egeneration newline Mishig 2023-09-28 15:28:23 +0200
  • d6c2d9570d chore Mishig 2023-09-28 15:24:11 +0200
  • 35b7c1b10a Fix launcher.md Mishig 2023-09-28 15:15:47 +0200
  • 7a6fad6aac update readme v1.1.0 OlivierDehaene 2023-09-28 10:18:18 +0200
  • 3b56d7669b feat: add mistral model (#1071) OlivierDehaene 2023-09-28 09:55:47 +0200
  • 8a3449def9 fix default window size OlivierDehaene 2023-09-28 09:24:32 +0200
  • f938825272 fix tests OlivierDehaene 2023-09-28 09:16:03 +0200
  • dd31afd3a8 fix tests OlivierDehaene 2023-09-27 19:44:30 +0200
  • 8fdac4ef2f update doc OlivierDehaene 2023-09-27 19:41:42 +0200