mirror of https://github.com/huggingface/text-generation-inference.git
synced 2025-09-10 20:04:52 +00:00

Update consuming_tgi.md

parent 3a2a13ecd5
commit 8f1d266e69
@@ -143,13 +143,9 @@ You can try the demo directly here 👇
 
 You can disable streaming mode using `return` instead of `yield` in your inference function, like below.
 
-```diff
+```python
 def inference(message, history):
-    partial_message = ""
-    for token in client.text_generation(message, max_new_tokens=20, stream=True):
-        partial_message += token
--       yield partial_message
-+       return partial_message
+    return client.text_generation(message, max_new_tokens=20)
 ```
 
 You can read more about how to customize a `ChatInterface` [here](https://www.gradio.app/guides/creating-a-chatbot-fast).
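The distinction this commit documents comes down to whether the inference function is a generator: Gradio streams partial updates from a function that uses `yield`, and renders a single final response from one that uses `return`. A minimal sketch of both variants, using a hypothetical `FakeClient` stand-in (so it runs without a TGI server; a real setup would use `huggingface_hub.InferenceClient` instead):

```python
import inspect

# Hypothetical stand-in for huggingface_hub.InferenceClient, so this
# sketch runs without a live TGI endpoint.
class FakeClient:
    def text_generation(self, prompt, max_new_tokens=20, stream=False):
        tokens = ["Hello", ",", " world"]
        if stream:
            return iter(tokens)   # streaming: an iterator of token strings
        return "".join(tokens)    # non-streaming: the full generated string

client = FakeClient()

def inference_streaming(message, history):
    # Streaming variant: yields a growing partial message per token.
    partial_message = ""
    for token in client.text_generation(message, max_new_tokens=20, stream=True):
        partial_message += token
        yield partial_message

def inference(message, history):
    # Non-streaming variant from the commit: return the full response once.
    return client.text_generation(message, max_new_tokens=20)

# A generator function streams; a plain function returns once.
print(inspect.isgeneratorfunction(inference_streaming))   # True
print(inspect.isgeneratorfunction(inference))             # False
print(list(inference_streaming("Hi", []))[-1])            # Hello, world
print(inference("Hi", []))                                # Hello, world
```

Either function can be passed directly to `gr.ChatInterface`; Gradio inspects whether it is a generator to decide between streaming and single-shot rendering.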