Commit Graph

  • 2774b0ab44
    Fixing watermark. (#851) Nicolas Patry 2023-08-16 07:17:26 +0200
  • 737d5781e4
    Update README.md (#848) Adarsh Shirawalmath 2023-08-15 22:43:52 +0530
  • 05dd14fdb9
    Fix tokenizers==0.13.4 . (#838) Nicolas Patry 2023-08-14 19:26:19 +0200
  • d8f1337e7e
    README edit -- running the service with no GPU or CUDA support (#773) Pasquale Minervini 2023-08-14 15:41:13 +0200
  • a072660bf5
    fix: LlamaTokenizerFast to AutoTokenizer at flash_llama.py (#619) Dong Shin 2023-08-14 21:20:18 +0900
  • b5087c4f4e
    Fix rope dynamic + factor (#822) Nicolas Patry 2023-08-14 14:09:51 +0200
  • 3ffcd9d311
    Added two more features in readme.md file (#831) sawan Rawat 2023-08-14 17:39:20 +0530
  • d71237fc8b
    Have snippets in Python/JavaScript in quicktour (#809) Omar Sanseviero 2023-08-14 13:47:32 +0200
  • 09eca64227
    Version 1.0.1 (#836) (tag: v1.0.1) Nicolas Patry 2023-08-14 11:23:11 +0200
  • a2a913eec5
    Added streaming for InferenceClient (#821) Merve Noyan 2023-08-11 18:05:19 +0300
  • cc7bb5084d
    Upgrade transformers (fix protobuf==3.20 issue) (#795) Nicolas Patry 2023-08-11 16:46:08 +0200
  • d0e30771c2
    Added ChatUI Screenshot to Docs (#823) Merve Noyan 2023-08-11 17:42:43 +0300
  • 5df4c7c0d7
    [docs] Build docs only when doc files change (#812) Mishig 2023-08-11 07:07:53 +0200
  • e58ad6dd66
    Added CLI docs (#799) Merve Noyan 2023-08-10 16:00:30 +0300
  • 7dbaef3f5b
    Minor docs style fixes (#806) Omar Sanseviero 2023-08-10 14:32:51 +0200
  • 04f7c2d86b
    Fix gated docs (#805) Omar Sanseviero 2023-08-10 14:32:07 +0200
  • 8bdb16ee9a
    Use destructuring in router arguments to avoid '.0' (#798) ivarflakstad 2023-08-10 10:52:50 +0200
  • 647ae7a7d3
    Setup for doc-builder and docs for TGI (#740) Merve Noyan 2023-08-10 11:24:52 +0300
  • 9a17f30042
    Edit ToC tree to change launching locally section Merve Noyan 2023-08-09 17:05:36 +0300
  • 464f135a8f
    Changed title of launching locally Merve Noyan 2023-08-09 17:04:12 +0300
  • e14611c6cb
    Added image to index Merve Noyan 2023-08-09 17:03:16 +0300
  • 197fbfde7c
    Fix Tip formatting osanseviero 2023-08-09 15:38:57 +0200
  • 21701d4b44
    Added note on TP sharding Merve Noyan 2023-08-09 16:04:37 +0300
  • 21ecdbc50f
    Update preparing_model.md Merve Noyan 2023-08-09 15:55:10 +0300
  • ea78c5c26a
    Fix typo in batch concatination Vincent Brouwers 2023-08-09 08:39:18 +0000
  • 36a47be98d
    Merge querying and consuming_tgi osanseviero 2023-08-09 00:11:02 +0200
  • 18c5861016
    Fix code snippet in Tip osanseviero 2023-08-08 18:50:26 +0200
  • 00c697e587
    Add some intro text + docker installation link before code snippet osanseviero 2023-08-08 18:47:03 +0200
  • 15e929fe45
    Upgrade bitsandbytes. Nicolas Patry 2023-08-08 16:45:45 +0000
  • 83c93f352a
    Format shell code correctly osanseviero 2023-08-08 18:42:34 +0200
  • 2a0be538b5
    Fix Tip snippet osanseviero 2023-08-08 18:42:04 +0200
  • 125a618449
    Dummy trigger. Nicolas Patry 2023-08-08 14:28:43 +0000
  • cf318455ef
    Upgrade acclereate. Nicolas Patry 2023-08-08 12:49:31 +0000
  • 304524b914
    After rebase redo upgrade. Nicolas Patry 2023-08-08 11:46:20 +0000
  • 0e8b47811e
    Llama change. (#793) Nicolas Patry 2023-08-08 13:43:40 +0200
  • fc7221369e
    Fix. Nicolas Patry 2023-08-08 10:43:34 +0000
  • c4dac9f3dc
    Update __init__.py (#794) Nicolas Patry 2023-08-08 12:09:51 +0200
  • 71b8d349a5
    Update __init__.py Nicolas Patry 2023-08-08 12:09:18 +0200
  • be4d0be8c8
    Llama change. Nicolas Patry 2023-08-08 11:59:34 +0200
  • b5c7e3f4ca
    Update .github/workflows/build_documentation.yml Omar Sanseviero 2023-08-08 10:29:34 +0200
  • f9c228b521
    Update .github/workflows/delete_doc_comment.yml Omar Sanseviero 2023-08-08 10:29:16 +0200
  • c1109bf99b
    Add usage examples in index osanseviero 2023-08-08 08:15:03 +0200
  • 81029b9896
    Pass through index osanseviero 2023-08-08 08:05:50 +0200
  • 4ddb6681ac
    Add workflow to upload documentation (osanseviero-patch-1) Omar Sanseviero 2023-08-08 07:49:45 +0200
  • 4cfa8a8a6e
    Add missing GH Actions workflow osanseviero 2023-08-08 07:41:04 +0200
  • b8b9ac3c28
    Fix ToC Tree osanseviero 2023-08-07 21:55:21 +0200
  • 3de777c645
    merged Blake Mallory 2023-08-07 15:08:08 -0400
  • a160ce5623
    added docker-compose example Blake Mallory 2023-08-07 15:07:39 -0400
  • 1fdc88ee90
    Fixing non 4bits quantization. (#785) Nicolas Patry 2023-08-07 13:02:00 +0200
  • 891e19cc51
    Fix dynamic rope. (#783) Nicolas Patry 2023-08-07 12:28:19 +0200
  • a217b4df5a
    Fixing non 4bits quantization. Nicolas Patry 2023-08-07 12:19:02 +0200
  • 9b62094748
    Fix dynamic rope. Nicolas Patry 2023-08-07 11:46:22 +0200
  • adcbdfa8cd
    refactored toc Merve Noyan 2023-08-07 11:21:20 +0300
  • d235d3b432
    Update docker_launch.md Merve Noyan 2023-08-07 11:17:45 +0300
  • c12547972d
    nit Merve Noyan 2023-08-07 11:17:35 +0300
  • 982d6709fe
    Update consuming_tgi.md Merve Noyan 2023-08-07 11:15:45 +0300
  • f7c49f612b
    Added tip Merve Noyan 2023-08-04 22:59:45 +0300
  • 5dcb27e9e3
    Addressed comments Merve Noyan 2023-08-04 22:57:40 +0300
  • 4042b700c1
    Update docs/source/basic_tutorials/docker_launch.md Merve Noyan 2023-08-04 22:42:10 +0300
  • c015a2feaf
    Update docs/source/basic_tutorials/local_launch.md Merve Noyan 2023-08-04 22:41:54 +0300
  • b5155248d5
    Update docs/source/basic_tutorials/local_launch.md Merve Noyan 2023-08-04 22:41:31 +0300
  • 8255dda916
    Update docs/source/basic_tutorials/consuming_tgi.md Merve Noyan 2023-08-04 22:41:06 +0300
  • c35830f068
    Update docs/source/basic_tutorials/consuming_tgi.md Merve Noyan 2023-08-04 22:40:55 +0300
  • 5bc0da0201
    Update docs/source/basic_tutorials/local_launch.md Merve Noyan 2023-08-04 22:40:47 +0300
  • 2975decaa4
    Update docs/source/basic_tutorials/consuming_tgi.md Merve Noyan 2023-08-04 22:40:34 +0300
  • bae7b1cc68
    Update docs/source/basic_tutorials/querying.md Merve Noyan 2023-08-04 22:40:23 +0300
  • 9c2bf4008f
    Update docs/source/basic_tutorials/local_launch.md Merve Noyan 2023-08-04 22:40:17 +0300
  • 801c7df18b
    Update docs/source/supported_models.md Merve Noyan 2023-08-04 22:40:08 +0300
  • bf5e32417e
    Merge pull request #1 from pminervini/pminervini-patch-1 Pasquale Minervini 2023-08-04 12:35:53 +0200
  • c92b4e557f
    Update README.md Pasquale Minervini 2023-08-04 12:34:36 +0200
  • 16fadcec57
    Merge BNB 4bit. (#770) Nicolas Patry 2023-08-03 23:00:59 +0200
  • f91e9d282d
    fix build tokenizer in quantize and remove duplicate import (#768) zspo 2023-08-04 04:21:33 +0800
  • 8853645e9f
    ??? What the heck. Nicolas Patry 2023-08-03 20:07:03 +0000
  • 7eafbbd621
    Cargo fmt. Nicolas Patry 2023-08-03 20:02:47 +0000
  • 6df90175d6
    add documentation for 4bit quantization options krzim 2023-07-19 22:10:34 +0000
  • c9a78bbe0f
    add 4bit bnb quantization krzim 2023-07-17 20:23:23 +0000
  • 794767a98d
    add AutoModel error message for 4bit quantization krzim 2023-07-17 19:31:39 +0000
  • bccebb027a
    add bnb 4bit to quantization enums krzim 2023-07-17 19:31:11 +0000
  • 3f2c5c31a8
    update bnb requirements krzim 2023-07-17 19:29:05 +0000
  • 6ec5288ab7
    This should prevent the PyTorch overriding. (#767) Nicolas Patry 2023-08-03 21:54:39 +0200
  • 38bca9301d
    Explicitly pass the index of PyTorch to poetry (to guarantee cuda version). Nicolas Patry 2023-08-03 14:38:54 +0000
  • 209ab7d013
    This should prevent the PyTorch overriding. Nicolas Patry 2023-08-03 14:19:12 +0000
  • ac736fd89c
    feat(server): Add native support for PEFT Lora models (#762) Nicolas Patry 2023-08-03 17:22:45 +0200
  • e911ea5483
    fix build tokenizer in quantize and remove duplicate import zspo 2023-08-03 22:42:45 +0800
  • e515f8da63
    Missing catch. Nicolas Patry 2023-08-03 14:40:32 +0000
  • 0c3f3cdb08
    Added safetensors Merve Noyan 2023-08-03 17:09:19 +0300
  • 76366a560e
    Without hashes. Nicolas Patry 2023-08-03 13:57:59 +0000
  • e20a5aeac5
    Added RoPE and quantization Merve Noyan 2023-08-03 15:56:13 +0300
  • 9d5a018fac
    Cleaner peft code. Nicolas Patry 2023-08-03 12:41:13 +0000
  • 1569558750
    feat(server): Add native support for PEFT Lora models Nicolas Patry 2023-08-02 22:09:10 +0000
  • 5bfc5a8f60
    Addressed comments Merve Noyan 2023-08-03 00:03:01 +0300
  • 4f1657418d
    Update docs/source/supported_models.md Merve Noyan 2023-08-02 23:45:09 +0300
  • 0efe4384c0
    Update docs/source/supported_models.md Merve Noyan 2023-08-02 23:44:51 +0300
  • 97f5d7dd47
    Update docs/source/supported_models.md Merve Noyan 2023-08-02 23:44:40 +0300
  • 904703d561
    Update docs/source/index.md Merve Noyan 2023-08-02 23:44:33 +0300
  • 8b0d608f1f
    Automatically map deduplicated safetensors weights to their original values (#501) (#761) Nicolas Patry 2023-08-02 20:24:37 +0200
  • bd3088748e
    add FastLinear import (#750) zspo 2023-08-03 02:04:46 +0800
  • 9bcac46d00
    Automatically map deduplicated safetensors weights to their original values (#501) Vincent Brouwers 2023-08-02 19:55:03 +0200
  • e994ad1172
    Added InferenceClient model_compat_log Merve Noyan 2023-08-02 17:57:01 +0300
  • bb83f333b7
    Added consuming TGI with ChatUI Merve Noyan 2023-08-02 17:40:56 +0300