Commit Graph

796 Commits

Author SHA1 Message Date
Nicolas Patry
fc4404d9d2
. 2024-06-07 22:45:57 +02:00
Nicolas Patry
65b2efc585
. 2024-06-07 22:38:06 +02:00
Nicolas Patry
eda299b84f
. 2024-06-07 20:18:57 +02:00
Nicolas Patry
e79c83d7ba
Attempt #727. 2024-06-07 20:11:17 +02:00
Nicolas Patry
c6fa9547a2
Test. 2024-06-07 19:58:56 +02:00
Nicolas Patry
a045ead6eb
. 2024-06-07 19:52:14 +02:00
Nicolas Patry
5e769ce1e0
? 2024-06-07 19:46:34 +02:00
Nicolas Patry
87df3d5603
? 2024-06-07 17:12:17 +02:00
Nicolas Patry
19f6327bd2
esac. Great idea dev of the past. 2024-06-07 16:14:24 +02:00
Nicolas Patry
2a314fa0dd
Bash in bash. 2024-06-07 16:09:38 +02:00
Nicolas Patry
b10ba9205c
... 2024-06-07 16:05:11 +02:00
Nicolas Patry
1f4248944c
Come on GH, dash, underscore, who cares at this point. 2024-06-07 16:03:05 +02:00
Nicolas Patry
cc7c2fd90e
runs on. 2024-06-07 16:01:59 +02:00
Nicolas Patry
1e759f9da6
Wat? 2024-06-07 16:00:40 +02:00
Nicolas Patry
078fb55109
Abbé Faria? 2024-06-07 15:58:23 +02:00
Nicolas Patry
8205962950
Ahah, I see an exit. 2024-06-07 15:56:52 +02:00
Nicolas Patry
043de74dcd
**Feigns death** 2024-06-07 15:52:35 +02:00
Nicolas Patry
81ddb9d173
Please let me out ! 2024-06-07 15:49:31 +02:00
Nicolas Patry
aea77a8ab3
Banana. 2024-06-07 15:44:51 +02:00
Nicolas Patry
e6a4dbe7f5
I'm an certainly not a monkey. 2024-06-07 15:43:58 +02:00
Nicolas Patry
a759e2e7c5
Not hitting myself against the wall. 2024-06-07 15:39:37 +02:00
Nicolas Patry
8712a367dc
Flying blind feels nice. 2024-06-07 15:36:13 +02:00
Nicolas Patry
6f3117512c
Give us sanitation tools already. 2024-06-07 15:25:43 +02:00
Nicolas Patry
54e3340663 gh.. 2024-06-07 15:09:27 +02:00
Nicolas Patry
11c75f3a14 I hate this. 2024-06-07 15:07:51 +02:00
Nicolas Patry
3a8e9c221e Rename for everyone. 2024-06-07 15:03:01 +02:00
Nicolas Patry
f29371e587 Naming. 2024-06-07 14:49:48 +02:00
Nicolas Patry
3ee92eb614 ? 2024-06-07 14:15:45 +02:00
Nicolas Patry
3684439a0e Trying new split of tasks. 2024-06-07 12:03:22 +02:00
Nicolas Patry
9101b2ae4f Fix. 2024-06-07 10:05:51 +02:00
Nicolas Patry
c73355b99c
Merge branch 'main' into ci_amd2 2024-06-07 10:04:59 +02:00
Nicolas Patry
c8128c794d Let's iterate a bit faster. 2024-06-07 09:50:43 +02:00
Nicolas Patry
97af55b7ef Inject slugs 2024-06-07 09:10:38 +02:00
Daniël de Kok
bf3c813782 server: use chunked inputs
The router will now send the input as chunks besides as a single
string. This change modifies the server to process chunked input
rather than strings. This also allows us to remove the image
extraction code from the server.
2024-06-07 08:09:04 +02:00
Nicolas Patry
724fa6fe0e AMD CI. 2024-06-07 06:40:04 +02:00
Nicolas Patry
9376648c4f Checkout. 2024-06-07 06:38:13 +02:00
Nicolas Patry
fa05db296a Fix integration-tests config for docker runt . 2024-06-06 19:44:22 +02:00
Wang, Yi
4dabddb7ea
Xpu gqa (#2013)
# What does this PR do?

<!--
Congratulations! You've made it this far! You're not quite done yet
though.

Once merged, your PR is going to appear in the release notes with the
title you set, so make sure it's a great title that fully reflects the
extent of your awesome contribution.

Then, please replace this with a description of the change and which
issue is fixed (if applicable). Please also include relevant motivation
and context. List any dependencies (if any) that are required for this
change.

Once you're done, someone will review your PR shortly (see the section
"Who can review?" below to tag some potential reviewers). They may
suggest changes to make the code even better. If no one reviewed your PR
after a week has passed, don't hesitate to post a new comment
@-mentioning the same persons---sometimes notifications get lost.
-->

<!-- Remove if not applicable -->

Fixes # (issue)


## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Did you read the [contributor
guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),
      Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the
[forum](https://discuss.huggingface.co/)? Please add a link
      to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes?
Here are the
[documentation
guidelines](https://github.com/huggingface/transformers/tree/main/docs),
and
[here are tips on formatting
docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?


## Who can review?

Anyone in the community is free to review the PR once the tests have
passed. Feel free to tag
members/contributors who may be interested in your PR.

<!-- Your PR will be replied to more quickly if you can figure out the
right person to tag with @


@OlivierDehaene OR @Narsil

 -->

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2024-06-06 19:12:57 +02:00
Nicolas Patry
81704d28e8 Putting the fix for vllm for CIt 2024-06-06 19:11:51 +02:00
Nicolas Patry
512ed5ca4c Enabling CI for AMD with new runner.. 2024-06-06 19:09:15 +02:00
Nicolas Patry
9765658212 Revert "Enabling CI for AMD with new runner.."
This reverts commit 101ac9a760.
2024-06-06 19:08:16 +02:00
Nicolas Patry
101ac9a760 Enabling CI for AMD with new runner.. 2024-06-06 19:07:48 +02:00
Nicolas Patry
ed1cfde0d8
Internal runner ? (#2023)
2024-06-06 18:51:42 +02:00
Daniël de Kok
51621439a4 marlin: improve build 2024-06-06 17:19:46 +02:00
Daniël de Kok
0d96468ebb marlin: support tp>1 when group_size==-1 2024-06-06 17:19:28 +02:00
Daniël de Kok
4594e6faba Add support for Marlin-quantized models
This change adds support for Marlin-quantized models. Marlin is an
FP16xINT4 matmul kernel, which provides good speedups decoding batches
of 16-32 tokens. It supports quantized models with symmetric
quantization, groupsize -1 or 128, and 4-bit.

Tested with:

- Llama 2
- Llama 3
- Phi 3
2024-06-06 13:16:52 +02:00
Nicolas Patry
cf0d459aaf Revert "Less cache misses on cargo build."
This reverts commit 5aec4154c2.
2024-06-06 10:33:55 +02:00
Nicolas Patry
5aec4154c2 Less cache misses on cargo build. 2024-06-06 10:33:01 +02:00
Andrés Marafioti
2a48a10043
Update __version__ on __init__.py to 0.7.0 (#2017)
There was a new release of the python client with version upped to 0.7.0
on pip and on the pyproject.toml, but it wasn't changed on the
__init__.py so when one does:

```python
import text_generation
print(text_generation.__version__)
```

It still outputs "0.6.0"

## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
2024-06-05 14:51:07 +02:00
Daniël de Kok
3f4bcf978c
Fix GPTQWeight import (#2020)
# What does this PR do?

Fix stray import.

2024-06-05 14:49:15 +02:00