text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-10-11 07:55:24 +00:00

Author	SHA1	Message	Date
OlivierDehaene	f1d8da3ba6	feat(server): add frequency penalty (#1541 )	2024-04-24 08:43:50 +00:00
Jason Stillerman	cec954e391	Update to peft 0.8.2 (#1537 )	2024-04-23 11:50:44 +03:00
drbh	51a4e62ed4	Impl simple mamba model (#1480 ) This draft PR is a work in progress implementation of the mamba model. This PR currently loads weights, and produces correct logits after a single pass. This PR still needs to correctly integrate this model so it produces tokens as expected, and apply optimization to avoid all copies during runtime/unnecessary operations. [Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Albert Gu and Tri Dao)](https://arxiv.org/abs/2312.00752) https://github.com/johnma2006/mamba-minimal https://github.com/huggingface/candle/blob/main/candle-examples/examples/mamba-minimal/model.rs https://github.com/huggingface/transformers/pull/28094 Notes: this dev work is currently targeting `state-spaces/mamba-130m`, so if you want to test please use that model. Additionally when starting the router the prefill needs to be limited: `cargo run -- --max-batch-prefill-tokens 768 --max-input-length 768` Integration tests have been added and basic functionality such as model loading is supported. ```bash cd integration-tests pytest -vv models/test_fused_kernel_mamba.py ``` - [x] add tests - [x] load model - [x] make simple request - [ ] resolve warmup issue - [ ] resolve output issues fetching models tested during dev ```bash text-generation-server download-weights state-spaces/mamba-130m text-generation-server download-weights state-spaces/mamba-1.4b text-generation-server download-weights state-spaces/mamba-2.8b ``` The server can be run ```bash cd server MASTER_ADDR=127.0.0.1 MASTER_PORT=5555 python text_generation_server/cli.py serve state-spaces/mamba-2.8b ``` router ```bash cargo run ``` make a request ```bash curl -s localhost:3000/generate \ -X POST \ -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \ -H 'Content-Type: application/json' | jq ``` response ```json { "generated_text": "\n\nDeep learning is a machine learning technique that uses a deep neural network to learn from data." } ``` --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2024-04-23 11:45:11 +03:00
Dean Wyatte	27daa511ec	GPTNeoX: Use static rotary embedding (#1498 ) `transformers` 4.35 removed rotary embeddings from GPTNeoX's weights ([link to line diff](`253f9a3f97 (diff-0e2a05d86c82e96f516db8c14070ceb36f53ca44c6bc21a9cd92ad2e777b9cf1R298)`)). This applies the same fix as https://github.com/huggingface/text-generation-inference/pull/793, which generates them on-the-fly using the appropriate value from the config file. Fixes https://github.com/huggingface/text-generation-inference/issues/1460	2024-04-23 09:21:21 +03:00
dtlzhuangz	bf72c03d0e	feat: eetq gemv optimization when batch_size <= 4 (#1502 ) Add TensorRT-LLM weight-only GEMV kernel support. We extract the GEMV kernel from [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM/tree/main/cpp/tensorrt_llm/kernels/weightOnlyBatchedGemv) to accelerate the decode speed of EETQ when batch_size is less than or equal to 4. - Features 1. There is almost no loss of quantization accuracy. 2. Decoding is 13% - 27% faster than the original EETQ, which uses a GEMM kernel. - Test Below is our test on a 3090. Environment: torch=2.0.1, cuda=11.8, nvidia driver: 525.78.01 prompt=1024, max_new_tokens=50 ![image](https://github.com/huggingface/text-generation-inference/assets/139844877/98e63b23-23cd-452f-91bd-55ccdc9b7021) ![image](https://github.com/huggingface/text-generation-inference/assets/139844877/5c3132ff-fc1c-4b20-a83f-59b3d5f586b7)	2024-04-23 09:20:14 +03:00
Nicolas Patry	433934519c	Fixing top_n_tokens. (#1497 ) Supersedes #1459 The fix works as follows. We updated next_token_chooser to return all logprobs; batch_top_n_tokens now also gets accepted_ids + speculated_length (so it knows how to interpret the flat logprobs). We then update the code to return the lists of `Tokens` that it expects.	2024-04-23 08:49:24 +03:00
OlivierDehaene	efd4b97d15	v1.4.0 (#1494 )	2024-04-22 15:47:42 +03:00
fxmarty	4b376b30f1	GPTQ support on ROCm (#1489 ) Tested with ``` CUDA_VISIBLE_DEVICES=0 text-generation-launcher --model-id TheBloke/Llama-2-7B-Chat-GPTQ --quantize gptq EXLLAMA_VERSION=1 CUDA_VISIBLE_DEVICES=0 text-generation-launcher --model-id TheBloke/Llama-2-7B-Chat-GPTQ --quantize gptq CUDA_VISIBLE_DEVICES="0,1" text-generation-launcher --model-id TheBloke/Llama-2-7B-Chat-GPTQ --quantize gptq ``` all with good and identical results on MI210. --------- Co-authored-by: Felix Marty <felix@hf.co> Co-authored-by: OlivierDehaene <olivier@huggingface.co> Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>	2024-04-22 15:38:50 +03:00
Nicolas Patry	b064b33e8b	Add sealion mpt support (#1477 ) --------- Co-authored-by: Choon Meng Tan <choonmeng@aisingapore.org> Co-authored-by: David Ong Tat-Wee <13075447+ongtw@users.noreply.github.com>	2024-04-22 15:37:05 +03:00
Nicolas Patry	ea2aa53805	Reinstate exl2 with tp (#1490 )	2024-04-22 15:36:57 +03:00
drbh	b2fc097b2b	feat: adds phi model (#1442 ) This PR adds basic modeling for phi-2 run ```bash text-generation-server \ serve \ microsoft/phi-2 \ --revision 834565c23f9b28b96ccbeabe614dd906b6db551a ``` test ```bash curl -s localhost:3000/generate \ -X POST \ -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \ -H 'Content-Type: application/json' | jq . ``` notes - recently (~1 day ago) the Phi weights and model were updated to accommodate adding [GQA/MQA attention to the model.](https://github.com/huggingface/transformers/pull/28163) This impl expects the original model format so a fixed revision is required at the moment. - this PR only includes a basic implementation of the model and can later be extended to support Flash and Sharded versions as well as to make use of better optimization	2024-04-22 13:06:38 +03:00
Nicolas Patry	2a3a9c526b	Fixing non divisible embeddings. (#1476 )	2024-04-22 12:48:59 +03:00
PYNing	e930ad9cec	Fix local load for Medusa (#1420 ) Close #1418 Close #1415	2024-04-22 09:30:41 +03:00
R. P. Ruiz	92ddb41d95	Fix missing make target platform for local install: 'install-flash-attention-v2' (#1414 )	2024-04-22 09:18:00 +03:00
OlivierDehaene	118344b99d	fix: fix local loading for .bin models (#1419 )	2024-04-22 09:17:52 +03:00
OlivierDehaene	62646c2a54	v1.3.4	2024-04-22 09:08:34 +03:00
Nicolas Patry	8cc4306f72	Fix local load for peft (#1373 ) local directory overloaded still needs the directory to locate the weights files correctly.	2024-04-22 09:03:34 +03:00
OlivierDehaene	7eeabb9cda	feat: update exllamav2 kernels (#1370 ) Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2024-04-22 09:02:53 +03:00
Nicolas Patry	be05972911	Peft safetensors. (#1364 ) Works by removing adapter_model.safetensors from being detected as the core model file (which skips the real peft detection).	2024-04-22 09:02:31 +03:00
OlivierDehaene	b7299e1b7f	fix: fix gpt-q with groupsize = -1 (#1358 )	2024-04-19 15:05:50 +03:00
OlivierDehaene	5ff9e81952	fix: fix offline (#1341 ) (#1347 ) @oOraph --------- Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> Co-authored-by: Raphael Glon <oOraph@users.noreply.github.com>	2024-04-19 14:56:25 +03:00
OlivierDehaene	ecb0db45af	fix: fix logic if sliding window key is not present in config (#1352 )	2024-04-19 14:56:10 +03:00
OlivierDehaene	a95e6d603d	feat: relax mistral requirements (#1351 ) Close #1253 Close #1279	2024-04-19 14:50:24 +03:00
OlivierDehaene	3600fc9dbe	v1.3.3	2024-04-19 14:18:39 +03:00
OlivierDehaene	bb6200503c	fix: max_past default value must be -1, not 0 (#1348 )	2024-04-19 14:18:05 +03:00
OlivierDehaene	214ec0eb49	fix: only keep stop sequence buffer if we have some	2024-04-19 14:18:00 +03:00
OlivierDehaene	04dbf7a506	fix: slice stopping criteria buffer	2024-04-19 14:17:52 +03:00
OlivierDehaene	b3c2d7291e	fix: fix quant linear autotune	2024-04-19 14:17:39 +03:00
OlivierDehaene	28fcdcca6d	fix: fix triton OutOfResources import	2024-04-19 14:17:32 +03:00
OlivierDehaene	5c9ef069ed	feat: add more latency metrics in forward (#1346 )	2024-04-19 13:41:34 +03:00
OlivierDehaene	c974437ba7	fix: fix gpt-q params loading	2024-04-19 12:12:50 +03:00
OlivierDehaene	05f8c85a8b	v1.3.2	2024-04-18 16:33:05 +03:00
OlivierDehaene	f9b58ac7a1	feat: add quant to mixtral (#1337 )	2024-04-18 16:32:50 +03:00
OlivierDehaene	09c556dbd7	v1.3.1	2024-04-18 16:32:07 +03:00
OlivierDehaene	db5053fc86	v1.3.0	2024-04-18 16:31:53 +03:00
OlivierDehaene	79f268f95a	chore: formatting	2024-04-18 16:26:00 +03:00
OlivierDehaene	9aef902982	feat: mixtral (#1328 )	2024-04-18 12:39:52 +00:00
Nicolas Patry	a7f52f3812	Speculative (#1308 )	2024-04-18 12:39:39 +00:00
Jacek Czaja	ae6215fcea	Enable server UT: test_causal_lm.py::test_batch_from_pb (#121 ) Co-authored-by: Jacek Czaja <jczaja@habana.ai>	2024-04-10 16:33:56 +02:00
Karol Damaszke	30cc78773e	Skip server tests of not enabled models (#125 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-04-09 14:15:41 +02:00
Karol Damaszke	c6739526c6	Fix test_watermark (#124 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-04-09 11:29:21 +02:00
Sylwester Fraczek	757c12dbac	Fix test_pass_through_tokenizer (#117 ) Co-authored-by: Sylwester Fraczek <sfraczek@habana.ai>	2024-04-09 09:30:47 +02:00
Karol Damaszke	d957e32601	Add Habana copyright header (#122 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-04-08 18:06:21 +02:00
Karol Damaszke	b0de25a285	Don't set rope_scaling for unsupported models (#115 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-04-02 12:12:02 +02:00
Karol Damaszke	7342baa2eb	Add support for rope_scaling and remove is_optimized_for_gaudi (#112 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-03-29 15:07:32 +01:00
Karol Damaszke	bf5263b88b	Disable watermark with FP8 quantization (#114 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-03-27 13:32:20 +01:00
jkaniecki	56f00a552b	Adjust warmup to all possible bucket sizes and decode batch size = 1 (#113 )	2024-03-27 11:59:51 +01:00
Karol Damaszke	b45f648483	Add warmup for logits processors (#107 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-03-18 15:17:47 +01:00
yuanwu2017	a4d5c3f40f	Fix the generate_stream crash in concurrent query (#105 ) Signed-off-by: yuanwu <yuan.wu@intel.com>	2024-03-15 10:54:56 +01:00
Yao Matrix	7149ac30e6	Fix the issue of out of range (#98 ) Signed-off-by: yuanwu <yuan.wu@intel.com> Co-authored-by: yuanwu <yuan.wu@intel.com>	2024-03-13 10:09:53 +01:00
Karol Damaszke	80ae9ead28	Set MAX_TOTAL_TOKENS automatically (#91 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-03-01 11:25:15 +01:00
Karol Damaszke	a5c788cfe4	Remove redundant fill op (#83 ) (#90 ) Co-authored-by: mswiniarsk <156412439+mswiniarsk@users.noreply.github.com>	2024-03-01 01:32:02 +01:00
Karol Damaszke	03c2123244	Use batched index_copy (#73 ) (#89 ) Co-authored-by: madamczykhabana <110973826+madamczykhabana@users.noreply.github.com>	2024-02-29 15:45:16 +01:00
Karol Damaszke	7dbf4bf7a4	Improve tensor slicing performance (#66 ) (#87 ) Co-authored-by: mswiniarsk <156412439+mswiniarsk@users.noreply.github.com>	2024-02-29 10:48:54 +01:00
Karol Damaszke	3831f1bed5	Add warmup for shift operation (#59 ) (#86 )	2024-02-29 09:19:28 +01:00
Karol Damaszke	022ce1eaaf	Overhead reduction (#58 ) (#85 ) Co-authored-by: mrs303 <54661797+mrs303@users.noreply.github.com>	2024-02-29 09:17:45 +01:00
Karol Damaszke	212136dff8	Log exceptions to debug.log (#52 ) (#84 ) Co-authored-by: madamczykhabana <110973826+madamczykhabana@users.noreply.github.com>	2024-02-29 09:14:42 +01:00
Karol Damaszke	c7ccfb87ff	Grouped pad/shift/move operations (#57 ) (#82 ) Co-authored-by: madamczykhabana <110973826+madamczykhabana@users.noreply.github.com>	2024-02-29 04:16:44 +01:00
Karol Damaszke	2122acc60f	Add warmup for all possible shapes for prefill #49 (#81 )	2024-02-28 10:40:13 +01:00
Karol Damaszke	31bed905d4	Update habana profiler (#50 ) (#80 ) Co-authored-by: mswiniarsk <156412439+mswiniarsk@users.noreply.github.com>	2024-02-28 09:57:40 +01:00
Karol Damaszke	d31fb62576	Add more info to high-level profiler events (#46 ) (#79 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>	2024-02-28 09:55:50 +01:00
Karol Damaszke	941d36f3fd	Enable deferred token generation (#44 ) (#75 ) Co-authored-by: Krzysztof Laskowski <klaskowski@habana.ai>	2024-02-27 15:46:40 +01:00
jkaniecki	83b059bd27	Bulk shifting (#40 ) (#70 ) Co-authored-by: madamczykhabana <110973826+madamczykhabana@users.noreply.github.com>	2024-02-26 17:29:56 +01:00
regisss	8f4aba6ad3	Update dependencies (#69 )	2024-02-25 13:07:47 +01:00
jkaniecki	c3bd8ef445	Add Fp8 support (#42 ) (#71 ) Co-authored-by: mrs303 <54661797+mrs303@users.noreply.github.com> Co-authored-by: Adam Stachowicz <105052242+astachowiczhabana@users.noreply.github.com> Co-authored-by: Grzegorz Morys <gmorys@habana.ai>	2024-02-23 11:52:28 +01:00
jkaniecki	a490847702	Sequence bucketing for prefill (#39 ) (#67 ) Co-authored-by: mswiniarsk <156412439+mswiniarsk@users.noreply.github.com>	2024-02-23 01:52:14 +01:00
jkaniecki	9ad6086250	Improve habana profile dev experience (#36 ) (#65 ) Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>	2024-02-22 13:57:45 +01:00
jkaniecki	f7ef414e38	Remove unused pad_token_id for filter (#35 ) (#64 ) Co-authored-by: mswiniarsk <156412439+mswiniarsk@users.noreply.github.com>	2024-02-22 11:24:09 +01:00
jkaniecki	8f590759e3	Prefill optimization by allocating space only for the first output token (#34 ) (#62 ) Co-authored-by: mswiniarsk <156412439+mswiniarsk@users.noreply.github.com> Co-authored-by: Karol Damaszke <karol.damaszke@intel.com>	2024-02-22 04:55:43 +01:00
jkaniecki	80303b469c	Do not limit hpu graphs by default (#32 ) (#61 ) Co-authored-by: mswiniarsk <156412439+mswiniarsk@users.noreply.github.com>	2024-02-21 15:38:00 +01:00
jkaniecki	6b6dec9ea1	Transparent tokenizer uses explicit int32 (#31 ) (#60 ) Co-authored-by: Adam Stachowicz <105052242+astachowiczhabana@users.noreply.github.com>	2024-02-21 14:24:41 +01:00
regisss	a4d3a00d98	Fix dependencies (#56 )	2024-02-19 10:19:23 +01:00
regisss	dca9ac6508	Revert "Solve dependency issue" This reverts commit `ea2b93dd75`.	2024-02-19 07:28:04 +00:00
regisss	ea2b93dd75	Solve dependency issue	2024-02-19 07:26:37 +00:00
regisss	2060bb58bf	Fix trust remote code (#55 )	2024-02-19 07:53:24 +01:00
Karol Damaszke	2a7a967de3	Revert prefill optimization and fix accuracy issue in shift operation (#29 ) Co-authored-by: Karol Damaszke <kdamaszke@habana.ai> Co-authored-by: madamczykhabana <110973826+madamczykhabana@users.noreply.github.com> Co-authored-by: jkaniecki <153085639+jkaniecki@users.noreply.github.com>	2024-01-23 15:19:07 +01:00
jkaniecki	ac3bc0e95e	Removed kv_cache from HPU graph output (#19 )	2024-01-19 15:34:13 +01:00
Karol Damaszke	60f63262db	Prefill optimization by allocating space only for the first token (#17 )	2024-01-19 15:18:35 +01:00
Adam Stachowicz	0b96da89aa	Make tokenizer optional (#12 )	2024-01-19 15:12:04 +01:00
madamczykhabana	381ec38cad	Batch bucketing improvements (#15 )	2024-01-17 10:09:27 +01:00
mrs303	8523f7ef64	Deepspeed terminate (#11 )	2024-01-17 09:57:03 +01:00
Krzysztof Laskowski	c459c86f88	High-level server profiler (#13 )	2024-01-16 09:57:29 +01:00
madamczykhabana	41c4f4fa41	Debugging utils (#14 )	2024-01-15 21:05:27 +01:00
Karol Damaszke	a8c5b69e2c	Set default value of LIMIT_HPU_GRAPH to True (#7 )	2024-01-11 14:51:49 +01:00
Karol Damaszke	252ccde104	Control prefill and decode batch size separately (#6 )	2024-01-02 18:21:01 +01:00
Karol Damaszke	1be2d9a8ec	Batch size bucketing (#5 )	2023-12-22 21:53:01 +01:00
jkaniecki	e3dcd7f2c2	Disable tensor caching in HPU Graph execution (#4 )	2023-12-22 13:51:16 +01:00
Karol Damaszke	6436ae86a1	Fix for continuous batching (#1 )	2023-12-11 09:24:09 +01:00
regisss	e5f124b077	Merge tag 'v1.2.0' into v1.2-release	2023-12-06 18:46:16 +01:00
regisss	c09066aeb1	Merge tag 'v1.1.1' into v1.1-release	2023-12-06 09:50:58 +01:00
regisss	cc744ba426	Add changes from Optimum Habana's TGI folder	2023-12-05 11:12:16 +01:00
OlivierDehaene	ccd5725a0c	v1.2.0	2023-11-30 15:18:15 +01:00
Nicolas Patry	ba552e1a82	Let each model resolve their own default dtype. (#1287 )	2023-11-28 17:54:26 +01:00
Nicolas Patry	3c71c656c7	`make install-flash-attn-v2-cuda` should work like `make install-flash-attn-v2` used to work. (#1294 )	2023-11-28 16:28:40 +01:00
fxmarty	b2b5df0e94	Add RoCm support (#1243 ) This PR adds support for AMD Instinct MI210 & MI250 GPUs, with paged attention and FAv2 support. Remaining items to discuss, on top of possible others: * Should we have a `ghcr.io/huggingface/text-generation-inference:1.1.0+rocm` hosted image, or is it too early? * Should we set up a CI on MI210/MI250? I don't have access to the runners of TGI though. * Are we comfortable with those changes being directly in TGI, or do we need a fork? --------- Co-authored-by: Felix Marty <felix@hf.co> Co-authored-by: OlivierDehaene <olivier@huggingface.co> Co-authored-by: Your Name <you@example.com>	2023-11-27 14:08:12 +01:00
Nicolas Patry	ed2a3f617e	Exllama v2 (#1211 ) # What does this PR do? See #1165 --------- Co-authored-by: Florian Zimmermeister <flozi00.fz@gmail.com> Co-authored-by: Ubuntu <ubuntu@ip-172-31-24-153.ec2.internal>	2023-11-25 22:38:38 +01:00
Vince Jankovics	c6bb76703f	Fix IDEFICS dtype (#1214 )

# What does this PR do?

This forces the use of `bfloat16` for IDEFICS. The issue is that with `float16` the 80b model gives garbage output.

Let me know if this solution is not appropriate and I'll adjust accordingly. For the details see below.

The current behaviour:

```sh
$ curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' -H 'Content-Type: application/json'
{"generated_text":""}
```

On closer inspection with:

```python
import requests

headers = {"Content-Type": "application/json"}

query = "What is Deep Learning?"
data = {
    "inputs": query,
    "parameters": {
        "max_new_tokens": 10,
        "return_full_text": True,
        "decoder_input_details": True,
        "do_sample": False,
    },
}

api_url = "http://127.0.0.1:8080"
response = requests.post(api_url + "/generate", headers=headers, json=data).json()

for i in ['prefill', 'tokens']:
    print(f'### {i}')
    print(repr(''.join([t['text'] for t in response['details'][i]])))
```

Prints:

```
### prefill
'<s>WhatisDeepLearning?'
### tokens
'<unk><unk><unk><unk><unk><unk><unk><unk><unk><unk>'
########
```

With the change in this PR it prints:

```
### prefill
'<s>WhatisDeepLearning?'
### tokens
'\n\nDeep Learning is a subset of machine'
```

Note, using the Transformers implementation (with `IdeficsForVisionText2Text.from_pretrained`) produces the latter (correct) output as well. This only happens with the 80b model, the 9b model is not as sensitive to the dtype (as also mentioned in the code).

The reason for "forcing" this in the IDEFICS init method, is because if quantization is used, then the dtype cannot be set explicitly. And since it's left as `None`, it's set to `float16` by default [here](`96a982ad8f/server/text_generation_server/models/__init__.py (L90)`). I.e. there's no other way to manually change the dtype if someone is using quantization:

```sh
$ docker run .... ghcr.io/huggingface/text-generation-inference:latest --model-id HuggingFaceM4/idefics-80b-instruct --dtype bfloat16 --quantize bitsandbytes-nf4
.....
2023-10-31T12:42:26.710401Z INFO shard-manager: text_generation_launcher: Starting shard rank=0
2023-10-31T12:42:30.315734Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:

Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 80, in serve
    raise RuntimeError(
RuntimeError: Only 1 can be set between `dtype` and `quantize`, as they both decide how goes the final model.
 rank=0
Error: ShardCannotStart
2023-10-31T12:42:30.414010Z ERROR text_generation_launcher: Shard 0 failed to start
2023-10-31T12:42:30.414044Z INFO text_generation_launcher: Shutting down shards
```

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the [contributor guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests), Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the [forum](https://discuss.huggingface.co/)? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the [documentation guidelines](https://github.com/huggingface/transformers/tree/main/docs), and [here are tips on formatting docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?

## Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@Narsil what do you think?

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-11-23 15:00:09 +01:00
OlivierDehaene	35509ff5de	chore: update to torch 2.1.0 (#1182 ) Close #1142	2023-11-23 13:38:50 +01:00
Traun Leyden	e12c34bd25	Load PEFT weights from local directory (#1260 ) # What does this PR do? Enables PEFT weights to be loaded from a local directory, as opposed to a hf hub repository. It is a continuation of the work in PR https://github.com/huggingface/text-generation-inference/pull/762 Fixes #1259 ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [x] Did you read the [contributor guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests), Pull Request section? Yes but I don't know how to run the tests for this repo, and it doesn't look like this code is covered anyway - [x] Was this discussed/approved via a Github issue or the [forum](https://discuss.huggingface.co/)? Please add a link to it if that's the case. Yes, @Narsil asked for a PR in [this comment](https://github.com/huggingface/text-generation-inference/pull/762#issuecomment-1728089505) - [x] Did you make sure to update the documentation with your changes? 
Here are the [documentation guidelines](https://github.com/huggingface/transformers/tree/main/docs), and [here are tips on formatting docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation). I didn't see any documentation added to the [original PR](https://github.com/huggingface/text-generation-inference/pull/762), and am not sure where this belongs. Let me know and I can add some - [x] Did you write any new necessary tests? I didn't see any existing test coverage for this python module ## Who can review? Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR. @Narsil --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-11-23 12:56:17 +01:00
Diwank Singh Tomer	91111a0dc2	Fix missing `trust_remote_code` flag for AutoTokenizer in utils.peft (#1270 ) Peft loading function was missing the `trust_remote_code=trust_remote_code` argument causing the custom tokenizer code to be not found. ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). - [x] Did you read the [contributor guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests), Pull Request section? - [ ] Was this discussed/approved via a Github issue or the [forum](https://discuss.huggingface.co/)? Please add a link to it if that's the case. - [ ] Did you make sure to update the documentation with your changes? Here are the [documentation guidelines](https://github.com/huggingface/transformers/tree/main/docs), and [here are tips on formatting docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation). - [ ] Did you write any new necessary tests? ## Who can review? Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR. @Narsil	2023-11-23 12:41:05 +01:00

436 Commits