Daniël de Kok 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ed96ba6503 
							
						 
					 
					
						
						
							
							flashinfer 0.2.0.post1 -> post2 ( #3040 )  
						
						
						
						
						* flashinfer 0.2.0.post1 -> post2
* Fix ruff stuff.
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> 
						
					 
					
						2025-02-20 12:34:20 +01:00 
						 
				 
			
				
					
						
							
							
								Nicolas Patry 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9c89d0070e 
							
						 
					 
					
						
						
							
							Having less logs in case of failure for checking CI more easily. ( #3037 )  
						
						
						
						
						* Having less logs in case of failure for checking CI more easily.
* Cleaning up the versions to uv for the client.
* Ignore entirely the API. 
						
					 
					
						2025-02-19 17:01:33 +01:00 
						 
				 
			
				
					
						
							
							
								drbh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e4d31a40db 
							
						 
					 
					
						
						
							
							fix: bump clients test base url to llama ( #1751 )  
						
						
						
						
This PR bumps the client tests from `google/flan-t5-xxl` to
`meta-llama/Llama-2-7b-chat-hf` to resolve failures when calling the
endpoint while `google/flan-t5-xxl` is not available.
Run with:
```bash
make python-client-tests
clients/python/tests/test_client.py ..............     [ 43%]
clients/python/tests/test_errors.py ..........         [ 75%]
clients/python/tests/test_inference_api.py ......      [ 93%]
clients/python/tests/test_types.py ..                  [100%]
```
**Note:** the `google/flan-t5-xxl` function is currently unused but still
included in `conftest.py`.
						
					 
					
						2024-04-16 16:56:47 -04:00 
						 
				 
			
				
					
						
							
							
								Nicolas Patry 
							
						 
					 
					
						
						
						
						
							
						
						
							1e03b61b5c 
							
						 
					 
					
						
						
							
							Revert "Modify default for max_new_tokens in python client ( #1336 )"  
						
						
						
						
						This reverts commit 2d56f106a6 
						
					 
					
						2024-02-01 14:36:10 +00:00 
						 
				 
			
				
					
						
							
							
								freitng 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2d56f106a6 
							
						 
					 
					
						
						
							
							Modify default for max_new_tokens in python client ( #1336 )  
						
						
						
						
						# What does this PR do?
Since [#1097](https://github.com/huggingface/text-generation-inference/pull/1097),
clients no longer need to specify a max_length. However, the
Python client in this repo had not yet been adapted to this change.
This PR makes it possible to use the Python client without providing
`max_new_tokens`.
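A minimal sketch of the idea, not the code from this PR: the client treats `max_new_tokens` as optional and leaves it out of the request when the caller does not set it, so the server can apply its own default. The helper name `build_generate_payload` and the choice of `None` as the "unset" marker are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def build_generate_payload(
    prompt: str,
    max_new_tokens: Optional[int] = None,  # assumption: None means "not provided"
) -> Dict[str, Any]:
    """Build a request body, omitting parameters the caller did not set."""
    parameters: Dict[str, Any] = {}
    if max_new_tokens is not None:
        # Only forward the limit when the caller asked for one explicitly.
        parameters["max_new_tokens"] = max_new_tokens
    return {"inputs": prompt, "parameters": parameters}


# Without a limit the parameter is simply absent from the payload.
print(build_generate_payload("test"))
# With a limit it is forwarded unchanged.
print(build_generate_payload("test", max_new_tokens=20))
```

Omitting the key, rather than sending a client-side default, keeps the decision about generation length on the server.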
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the [contributor guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests), Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the [forum](https://discuss.huggingface.co/)? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the [documentation guidelines](https://github.com/huggingface/transformers/tree/main/docs), and [here are tips on formatting docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [x] Did you write any new necessary tests?
## Who can review?
Anyone in the community is free to review the PR once the tests have
passed. Feel free to tag
members/contributors who may be interested in your PR. 
						
					 
					
						2024-01-29 11:02:57 -05:00 
						 
				 
			
				
					
						
							
							
								OlivierDehaene 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							895c5f1562 
							
						 
					 
					
						
						
							
							feat(server): only compute prefill logprobs when asked ( #406 )  
						
						
						
						
						Close  #288  
					
						2023-06-02 17:12:30 +02:00 
						 
				 
			
				
					
						
							
							
								OlivierDehaene 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							dbdc587ddd 
							
						 
					 
					
						
						
							
							feat(integration-tests): improve comparison and health checks ( #336 )  
						
						
						
					 
					
						2023-05-16 20:22:11 +02:00 
						 
				 
			
				
					
						
							
							
								Ehsan M. Kermani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f092ba9b22 
							
						 
					 
					
						
						
							
							feat(server): add watermarking tests ( #248 )  
						
						
						
					 
					
						2023-04-27 19:16:35 +02:00 
						 
				 
			
				
					
						
							
							
								OlivierDehaene 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b927244eb5 
							
						 
					 
					
						
						
							
							feat(python-client): get list of currently deployed tgi models using the inference API ( #191 )  
						
						
						
					 
					
						2023-04-17 18:43:24 +02:00 
						 
				 
			
				
					
						
							
							
								OlivierDehaene 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							53ee09c0b0 
							
						 
					 
					
						
						
							
							fea(dockerfile): better layer caching ( #159 )  
						
						
						
					 
					
						2023-04-14 10:12:21 +02:00 
						 
				 
			
				
					
						
							
							
								OlivierDehaene 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d6a93fe992 
							
						 
					 
					
						
						
							
							fix(server): fix flash-neox scores warping ( #137 )  
						
						
						
					 
					
						2023-03-24 18:21:41 +01:00 
						 
				 
			
				
					
						
							
							
								OlivierDehaene 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d8dc8f1b0c 
							
						 
					 
					
						
						
							
							feat(python-client): add new parameters ( #118 )  
						
						
						
					 
					
						2023-03-09 16:05:33 +01:00 
						 
				 
			
				
					
						
							
							
								OlivierDehaene 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3fef90d50f 
							
						 
					 
					
						
						
							
							feat(clients): Python client ( #103 )  
						
						
						
					 
					
						2023-03-07 18:52:22 +01:00