From 7dbaef3f5b854babb656141596e548227f6ec41f Mon Sep 17 00:00:00 2001
From: Omar Sanseviero
Date: Thu, 10 Aug 2023 14:32:51 +0200
Subject: [PATCH] Minor docs style fixes (#806)

---
 docs/source/basic_tutorials/consuming_tgi.md |  5 ++---
 docs/source/installation.md                  | 15 ++++++++-------
 docs/source/quicktour.md                     |  4 ++--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/docs/source/basic_tutorials/consuming_tgi.md b/docs/source/basic_tutorials/consuming_tgi.md
index 619e0a31f..7fb74719f 100644
--- a/docs/source/basic_tutorials/consuming_tgi.md
+++ b/docs/source/basic_tutorials/consuming_tgi.md
@@ -6,7 +6,7 @@ There are many ways you can consume Text Generation Inference server in your app
 
 After the launch, you can query the model using either the `/generate` or `/generate_stream` routes:
 
-```shell
+```bash
 curl 127.0.0.1:8080/generate \
     -X POST \
     -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
@@ -20,14 +20,13 @@ curl 127.0.0.1:8080/generate \
 
 You can simply install `huggingface-hub` package with pip.
 
-```python
+```bash
 pip install huggingface-hub
 ```
 
 Once you start the TGI server, instantiate `InferenceClient()` with the URL to the endpoint serving the model. You can then call `text_generation()` to hit the endpoint through Python.
 
 ```python
-
 from huggingface_hub import InferenceClient
 
 client = InferenceClient(model=URL_TO_ENDPOINT_SERVING_TGI)
diff --git a/docs/source/installation.md b/docs/source/installation.md
index 4105acf43..a8e2e7516 100644
--- a/docs/source/installation.md
+++ b/docs/source/installation.md
@@ -16,7 +16,7 @@ Text Generation Inference is available on pypi, conda and GitHub.
 
 To install and launch locally, first [install Rust](https://rustup.rs/) and create a Python virtual environment with at least Python 3.9, e.g. using conda:
 
-```shell
+```bash
 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
 
 conda create -n text-generation-inference python=3.9
@@ -27,7 +27,7 @@ conda activate text-generation-inference
 You may also need to install Protoc.
 
 On Linux:
-```shell
+```bash
 PROTOC_ZIP=protoc-21.12-linux-x86_64.zip
 curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v21.12/$PROTOC_ZIP
 sudo unzip -o $PROTOC_ZIP -d /usr/local bin/protoc
@@ -37,13 +37,13 @@ rm -f $PROTOC_ZIP
 
 On MacOS, using Homebrew:
 
-```shell
+```bash
 brew install protobuf
 ```
 
 Then run to install Text Generation Inference:
 
-```shell
+```bash
 BUILD_EXTENSIONS=True make install # Install repository and HF/transformer fork with CUDA kernels
 ```
@@ -51,7 +51,7 @@ BUILD_EXTENSIONS=True make install # Install repository and HF/transformer fork
 
 On some machines, you may also need the OpenSSL libraries and gcc. On Linux machines, run:
 
-```shell
+```bash
 sudo apt-get install libssl-dev gcc -y
 ```
 
@@ -59,13 +59,14 @@ sudo apt-get install libssl-dev gcc -y
 
 Once installation is done, simply run:
 
-```shell
+```bash
 make run-falcon-7b-instruct
 ```
 
 This will serve Falcon 7B Instruct model from the port 8080, which we can query. To see all options to serve your models, check in the [codebase](https://github.com/huggingface/text-generation-inference/blob/main/launcher/src/main.rs) or the CLI:
-```
+
+```bash
 text-generation-launcher --help
 ```
diff --git a/docs/source/quicktour.md b/docs/source/quicktour.md
index 31185a2d3..77f0a9c52 100644
--- a/docs/source/quicktour.md
+++ b/docs/source/quicktour.md
@@ -19,7 +19,7 @@ To use GPUs, you need to install the [NVIDIA Container Toolkit](https://docs.nvi
 
 Once TGI is running, you can use the `generate` endpoint by doing requests. To learn more about how to query the endpoints, check the [Consuming TGI](./basic_tutorials/consuming_tgi) section.
 
-```shell
+```bash
 curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' -H 'Content-Type: application/json'
 ```
@@ -27,7 +27,7 @@ curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","par
 
 To see all possible flags and options, you can use the `--help` flag. It's possible to configure the number of shards, quantization, generation parameters, and more.
 
-```shell
+```bash
 docker run ghcr.io/huggingface/text-generation-inference:1.0.0 --help
 ```
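
Note: the `consuming_tgi.md` context above stops right after the client is constructed, so the `text_generation()` call the surrounding prose describes does not appear in the hunk. A minimal sketch of that call, assuming a TGI server listening on `127.0.0.1:8080` as in the quicktour (the URL and parameters here just mirror the curl example and are not part of this patch):

```python
from huggingface_hub import InferenceClient

# Stand-in for URL_TO_ENDPOINT_SERVING_TGI in the snippet above;
# point this at wherever your TGI server is actually listening.
client = InferenceClient(model="http://127.0.0.1:8080")

# Mirrors the curl request to /generate: same prompt, same token budget.
generated = client.text_generation("What is Deep Learning?", max_new_tokens=20)
print(generated)
```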
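
`consuming_tgi.md` also mentions a `/generate_stream` route alongside `/generate`. With `huggingface_hub`, the same `text_generation()` method exposes streaming through its `stream=True` flag; a sketch under the same local-server assumption:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(model="http://127.0.0.1:8080")

# With stream=True the client yields tokens as they are generated,
# rather than returning one final string.
for token in client.text_generation("What is Deep Learning?", max_new_tokens=20, stream=True):
    print(token, end="", flush=True)
print()
```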