Mirror of https://github.com/huggingface/text-generation-inference.git, synced 2025-09-10 20:04:52 +00:00
Cleanup
This commit is contained in:
parent
16a679390e
commit
03975ccb3e
@@ -16,8 +16,7 @@ curl 127.0.0.1:8080/generate \

## Inference Client

-[`huggingface-hub`](https://huggingface.co/docs/huggingface_hub/main/en/index) is a Python library to interact with the Hugging Face Hub, including its endpoints. It provides a nice high-level class, [`~huggingface_hub.InferenceClient`], which makes it easy to make calls to a TGI endpoint. `InferenceClient` also takes care of parameter validation and provides a simple-to-use interface. At the moment, `InferenceClient` only works for models hosted with the Inference API or Inference Endpoints.
+[`huggingface-hub`](https://huggingface.co/docs/huggingface_hub/main/en/index) is a Python library to interact with the Hugging Face Hub, including its endpoints. It provides a nice high-level class, [`~huggingface_hub.InferenceClient`], which makes it easy to make calls to a TGI endpoint. `InferenceClient` also takes care of parameter validation and provides a simple-to-use interface.

You can simply install the `huggingface-hub` package with pip.

```bash
pip install huggingface-hub
```
@@ -29,7 +28,7 @@ Once you start the TGI server, instantiate `InferenceClient()` with the URL to the endpoint

```python
from huggingface_hub import InferenceClient

-client = InferenceClient(model=URL_TO_ENDPOINT_SERVING_TGI)
+client = InferenceClient(model="http://127.0.0.1:8080")
client.text_generation(prompt="Write a code for snake game")
```

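`text_generation` also accepts sampling parameters such as `max_new_tokens` and `temperature`. A minimal sketch of a small helper for validating such parameters before making the call; the helper is our own (not part of `huggingface-hub`), and the final call assumes a live endpoint, so it is left as a comment:

```python
# Hypothetical helper (not part of huggingface-hub): validate common
# text_generation sampling parameters before sending them to the server.
def sampling_params(max_new_tokens: int = 20, temperature: float = 1.0) -> dict:
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    if temperature <= 0:
        raise ValueError("temperature must be strictly positive")
    return {"max_new_tokens": max_new_tokens, "temperature": temperature}

params = sampling_params(max_new_tokens=50, temperature=0.7)
# With a running TGI endpoint you could then call:
# client.text_generation(prompt="Write a code for snake game", **params)
print(params)
```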
@@ -52,7 +51,7 @@ print(output)

You can see how to stream below.

```python
-output = client.text_generation(prompt="Meaning of life is", model="http://localhost:3000/", stream=True, details=True)
+output = client.text_generation(prompt="Meaning of life is", stream=True, details=True)
print(next(iter(output)))

# TextGenerationStreamResponse(token=Token(id=267, text=' a', logprob=-2.0723474, special=False), generated_text=None, details=None)
```
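Each streamed item carries the next token in `.token.text`, so the full generation can be rebuilt by concatenating the pieces. A minimal sketch using stand-in dataclasses and a stubbed stream instead of a live endpoint (the `collect_stream` helper is our own, not part of `huggingface-hub`):

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Stand-ins mirroring the stream response shape shown above (illustration only).
@dataclass
class Token:
    id: int
    text: str
    logprob: float
    special: bool

@dataclass
class TextGenerationStreamResponse:
    token: Token
    generated_text: Optional[str] = None
    details: Optional[object] = None

def collect_stream(stream: Iterable[TextGenerationStreamResponse]) -> str:
    """Concatenate the text of every non-special token in a stream."""
    return "".join(r.token.text for r in stream if not r.token.special)

# With a live server, pass the iterator returned by
# client.text_generation(..., stream=True, details=True) instead.
fake_stream = [
    TextGenerationStreamResponse(Token(267, " a", -2.0723474, False)),
    TextGenerationStreamResponse(Token(1168, " journey", -1.42, False)),
]
print(collect_stream(fake_stream))  # " a journey"
```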