Mirror of https://github.com/huggingface/text-generation-inference.git
Synced 2025-09-10 20:04:52 +00:00
Switch async client example to use stream
parent 3dfa7d33eb
commit f3266b8a4a
@@ -38,7 +38,7 @@ To stream tokens with `InferenceClient`, simply pass `stream=True` and iterate o
 ```python
 from huggingface_hub import InferenceClient
 
-client = InferenceClient(model="http://127.0.0.1:8080")
+client = InferenceClient("http://127.0.0.1:8080")
 for token in client.text_generation("How do you make cheese?", max_new_tokens=12, stream=True):
     print(token)
 
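The unchanged synchronous example above streams plain token strings. If per-token metadata is wanted as well, `text_generation` in recent versions of `huggingface_hub` also accepts `details=True` alongside `stream=True`, in which case each item is a stream response object rather than a bare string. A minimal sketch, assuming the same local TGI endpoint at http://127.0.0.1:8080; the response field names follow `huggingface_hub`'s stream output and are worth verifying against your installed version:

```python
from huggingface_hub import InferenceClient

client = InferenceClient("http://127.0.0.1:8080")

# With details=True and stream=True, each yielded item carries the token
# text plus metadata (id, logprob, special flag) instead of a bare string.
for chunk in client.text_generation(
    "How do you make cheese?",
    max_new_tokens=12,
    stream=True,
    details=True,
):
    print(chunk.token.text, chunk.token.logprob)
```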
@@ -73,9 +73,21 @@ The `huggingface_hub` library also comes with an `AsyncInferenceClient` in case
 ```python
 from huggingface_hub import AsyncInferenceClient
 
-client = AsyncInferenceClient(URL_TO_ENDPOINT_SERVING_TGI)
-await client.text_generation("How do you make cheese?")
-# \nTo make cheese, you need to start with milk.
+client = AsyncInferenceClient("http://127.0.0.1:8080")
+async for token in await client.text_generation("How do you make cheese?", stream=True):
+    print(token)
+
+# To
+# make
+# cheese
+#,
+# you
+# need
+# to
+# start
+# with
+# milk
+#.
 ```
 
 ### Streaming with cURL
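Note that the new async snippet uses top-level `await`, which only works in a notebook or an async REPL. In a plain Python script you would wrap it in a coroutine and drive it with `asyncio.run`; a minimal sketch, assuming the same local endpoint as the diff:

```python
import asyncio

from huggingface_hub import AsyncInferenceClient


async def main() -> None:
    client = AsyncInferenceClient("http://127.0.0.1:8080")
    # Awaiting text_generation(..., stream=True) returns an async iterator
    # that yields tokens as the server produces them.
    async for token in await client.text_generation(
        "How do you make cheese?", stream=True
    ):
        print(token)


# asyncio.run supplies the event loop that a notebook provides implicitly.
asyncio.run(main())
```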