Commit Graph

  • 21f037939b updated rsnm2 2023-09-01 18:22:50 +0000
  • 819e1514f8 docs: typo in streaming.js Julien Bouquillon 2023-09-01 18:48:16 +0200
  • 2bc287bfcd small fix on idefics (#954) Victor SANH 2023-09-01 12:44:34 -0400
  • d6d2b9426b feat: support for loading peft model Subaandh Krishnakumar 2023-08-29 11:06:57 +0200
  • 4f5d93ecd0 Fixing top_k tokens when k ends up < 0 (#966) Nicolas Patry 2023-09-01 00:22:03 +0200
  • 8a5f564942 Fix Falcon weight mapping for H2O.ai checkpoints (#953) Vincent Brouwers 2023-08-31 21:15:14 +0200
  • e74a68ee70 Fix top_k when k < 0 Nicolas Patry 2023-08-31 20:51:17 +0200
  • 7d8e5fb284 Update version in docs (#957) Omar Sanseviero 2023-08-31 20:00:12 +0200
  • 68a3ded45c fix: about cargo lock version fredbjer 2023-08-31 22:23:35 +0800
  • 1a800db0a8 add transformers gptq support Florian Zimmermeister 2023-08-31 15:30:02 +0200
  • ec2acc2f91 Restructure osanseviero 2023-08-31 10:46:41 +0200
  • 268f9b62cf Update version in docs osanseviero 2023-08-31 09:35:50 +0200
  • 57e57e6fee Return num input tokens (#3) Nikola Borisov 2023-08-30 15:20:47 -0700
  • f03cfe6493 small fix on idefics VictorSanh 2023-08-30 17:45:56 +0000
  • e864b95656 Fix Falcon weight mapping for H2O.ai checkpoints Vincent Brouwers 2023-08-30 09:50:49 +0000
  • 7c2e0af2a6 Fix f180 (#951) Nicolas Patry 2023-08-30 11:09:46 +0200
  • 7b88baddf7 Fix f180 Nicolas Patry 2023-08-30 08:36:09 +0000
  • f6042b4955 sync text-generation version to 0.6.0 with pyproject.toml youdaoyzbx 2023-08-30 16:05:18 +0800
  • 0c7559b7fb Update idefics_causal_lm.py icyboy™ 2023-08-30 10:39:45 +0800
  • 9826cd1dad update accelerate Yessen Kanapin 2023-08-29 22:22:56 +0000
  • 6a017f5208 bugfix Yessen Kanapin 2023-08-29 22:22:40 +0000
  • 5485c142e8 New release. (#941) v1.0.3 Nicolas Patry 2023-08-29 14:28:22 +0200
  • 18c849a738 .. Nicolas Patry 2023-08-29 11:58:39 +0000
  • 0fd065bf58 Moving to poetry? Nicolas Patry 2023-08-29 11:46:47 +0000
  • 29b8a8f6ab Correct location for syrupy. Nicolas Patry 2023-08-29 11:31:15 +0000
  • 38cbf5cea2 syrupy latest breaks. Nicolas Patry 2023-08-29 10:41:54 +0000
  • 012a26ac26 Update idefics. Nicolas Patry 2023-08-29 09:58:23 +0000
  • 5cdc851993 New release. Nicolas Patry 2023-08-29 10:43:08 +0200
  • d4e2ce7e7b Merge branch 'main' into main Marcus Dunn 2023-08-28 14:12:39 -0700
  • bd5fcf6f13 Update server.py Robert Shaw 2023-08-28 11:03:54 -0400
  • 0b8f0ae068 Merge pull request #6 from rsnm2/stopping-next-token-chooser Robert Shaw 2023-08-28 08:17:29 -0600
  • 77ac1648c5 added files rsnm2 2023-08-28 14:14:45 +0000
  • 211b54ac41 Rebased #617 (#868) Nicolas Patry 2023-08-28 11:43:47 +0200
  • 4486f78cf9 Fixing the lora adaptation on docker. (#935) Nicolas Patry 2023-08-28 11:13:24 +0200
  • 9011a3dd40 Fixing the lora adaptation on docker. Nicolas Patry 2023-08-28 09:11:50 +0000
  • adc43c6f8f Fix idefics. Nicolas Patry 2023-08-28 09:16:39 +0200
  • 59d3688ce6 Merge pull request #5 from rsnm2/stopping-next-token-chooser Robert Shaw 2023-08-27 20:14:34 -0600
  • d02993a5c3 stash rsnm2 2023-08-28 02:13:09 +0000
  • b55270320c Merge pull request #4 from rsnm2/stopping-next-token-chooser Robert Shaw 2023-08-27 20:12:10 -0600
  • bb124e4029 successfully implemented and integrated nexttokenchooser with support for repetition penalty, do_sample, temperature, top_k, and top_p rsnm2 2023-08-28 02:05:43 +0000
  • 6b5609b413 add peft param to launcher Chris 2023-08-28 00:05:45 +0200
  • 8175a305aa adding missing folder for custom peft docker builds Chris 2023-08-27 21:24:16 +0200
  • d23cc5857f adding peft loading instructions and dockerfile Chris 2023-08-27 21:17:58 +0200
  • 1659b871b6 load peft from cli Chris 2023-08-27 20:56:53 +0200
  • 9d98557772 add POC news chris-aeviator 2023-08-27 17:39:10 +0200
  • aba56c1343 loading adapter model ontop Chris 2023-08-27 17:32:28 +0200
  • 694a535033 loading quantization config the proper way Chris 2023-08-27 16:48:30 +0200
  • da1cfea208 Merge pull request #2 from ohmytofu-ai/chris-aeviator-patch-1 chris-aeviator 2023-08-27 14:55:54 +0200
  • d628f5fd29 Update README.md chris-aeviator 2023-08-27 14:55:41 +0200
  • db8937c209 Update README.md chris-aeviator 2023-08-27 14:43:26 +0200
  • 4d8e47e0e9 Merge pull request #1 from ohmytofu-ai/impl/4bit-demo chris-aeviator 2023-08-27 14:42:52 +0200
  • cf178a278a loading models in 4 bit this enables pascal GPUS to load LLama 2 when used with --quantize bitsandbytes Chris 2023-08-27 14:38:25 +0200
  • 3062fa035d Update README.md chris-aeviator 2023-08-27 12:52:48 +0200
  • f4932aec89 Give name to the final docker image Nikola Borisov 2023-08-25 19:58:33 +0000
  • 6f6775fb23 Update transformers Nikola Borisov 2023-08-25 19:58:17 +0000
  • de7748ff41 implement dyno rope Yessen Kanapin 2023-08-25 13:20:31 -0700
  • a875c05ccd implemented temperature, repetition penalty rsnm2 2023-08-25 16:11:09 +0000
  • 96f8365996 refactored stopping criteria; started concept of GenerationParameters to control generation --- currently enabling passing max_new_tokens; next step --- expand next token chooser rsnm2 2023-08-25 15:26:16 +0000
  • 02952c511f Merge pull request #3 from rsnm2/dev-ux Robert Shaw 2023-08-25 07:41:11 -0600
  • 5f4dcd5a4b Update quantization.md Merve Noyan 2023-08-25 12:32:50 +0300
  • 7c2db76b89 Removed details and wrote caveats Merve Noyan 2023-08-25 12:31:58 +0300
  • 764d946607 Update docs/source/conceptual/quantization.md Merve Noyan 2023-08-25 11:57:14 +0300
  • 937e4269e1 Added bnb Merve Noyan 2023-08-25 11:48:36 +0300
  • 09f4fc40a5 built fastapi wrapper around router rsnm2 2023-08-24 20:37:38 +0000
  • ff3275c442 added HTTP interface around the router rsnm2 2023-08-24 19:10:51 +0000
  • a973cf4922 moved files rsnm2 2023-08-24 18:37:03 +0000
  • cd3349f53b readded changed file names rsnm2 2023-08-24 17:50:01 +0000
  • 06fc85f93c refactored router code to use queues (which can be treated like a stream for IPC rsnm2 2023-08-24 17:49:28 +0000
  • 3db59bfd00 Supporting code llama. Nicolas Patry 2023-08-24 18:52:20 +0200
  • 5347eb4055 Update quantization.md Merve Noyan 2023-08-24 14:45:44 +0300
  • 8f251c7c3a Update quantization.md Merve Noyan 2023-08-24 13:57:00 +0300
  • 11db3cd3ea Update docs/source/basic_tutorials/non_core_models.md Merve Noyan 2023-08-24 13:32:18 +0300
  • 33d9bae612 Update tensor_parallelism.md Merve Noyan 2023-08-24 12:46:27 +0300
  • 2363e9a482 Desperate attempt to fix latex Merve Noyan 2023-08-24 12:20:40 +0300
  • 60e4ee2f11 Update docs/source/conceptual/tensor_parallelism.md Merve Noyan 2023-08-24 11:39:13 +0300
  • 4213eb57da Update non_core_models.md Merve Noyan 2023-08-24 11:38:06 +0300
  • 29754ced5a Update docs/source/basic_tutorials/non_core_models.md Merve Noyan 2023-08-24 11:32:39 +0300
  • ed49379c6a Merge pull request #2 from rsnm2/development Robert Shaw 2023-08-23 14:00:31 -0600
  • 83f6461bb9 Merge pull request #1 from rsnm2/dev-router Robert Shaw 2023-08-23 13:56:01 -0600
  • 1f87c7762f implemented a basic naive router rsnm2 2023-08-23 19:54:31 +0000
  • b581eb7151 Initial commit Merve Noyan 2023-08-23 19:27:20 +0300
  • 2ec5436f9c Removed internal implementation details and clarified Merve Noyan 2023-08-23 17:00:40 +0300
  • e16ecaf0c9 Added trust-remote-code Merve Noyan 2023-08-23 16:27:55 +0300
  • d7a0c348b6 Addressed Omar's comments Merve Noyan 2023-08-23 16:01:55 +0300
  • 0af0315b78 Update docs/source/conceptual/tensor_parallelism.md Merve Noyan 2023-08-23 15:45:24 +0300
  • 1e828f33c0 Update docs/source/conceptual/tensor_parallelism.md Merve Noyan 2023-08-23 15:45:12 +0300
  • a84d70d97d Update consuming_tgi.md Merve Noyan 2023-08-23 15:28:42 +0300
  • 403db3e7c4 Update consuming_tgi.md Merve Noyan 2023-08-23 15:16:13 +0300
  • ac47237516 Update consuming_tgi.md Merve Noyan 2023-08-23 15:15:12 +0300
  • 9ce13d5b32 Increase height Merve Noyan 2023-08-23 15:10:02 +0300
  • e3e0b587ac Upgrade version number in docs. Nicolas Patry 2023-08-23 13:45:28 +0200
  • b940f4ce64 Update streaming.md Omar Sanseviero 2023-08-22 23:41:47 +0200
  • cf0453182e Restructure Merve Noyan 2023-08-22 23:45:56 +0300
  • bdf36659a6 Update non_core_models.md Merve Noyan 2023-08-22 23:44:53 +0300
  • 98a5e6f26a Update docs/source/basic_tutorials/non_core_models.md Merve Noyan 2023-08-22 23:44:18 +0300
  • abde90c493 Update docs/source/basic_tutorials/non_core_models.md Merve Noyan 2023-08-22 23:44:13 +0300
  • ef5a99ffc9 Update consuming_tgi.md Merve Noyan 2023-08-22 23:38:00 +0300
  • b4b52c6f32 Update _toctree.yml Merve Noyan 2023-08-22 23:35:48 +0300
  • b33a66148c Update and rename custom_models.md to non_core_models.md Merve Noyan 2023-08-22 23:35:16 +0300
  • 5b995926f8 Update docs/source/basic_tutorials/consuming_tgi.md Merve Noyan 2023-08-22 23:32:40 +0300