From 3fb7803ed926bbe65fbfe1ebece6d01775d1ec0a Mon Sep 17 00:00:00 2001
From: Merve Noyan
Date: Thu, 10 Aug 2023 15:28:28 +0300
Subject: [PATCH] adjust toctree and shell to bash

---
 docs/source/_toctree.yml                 |  2 +-
 docs/source/basic_tutorials/using_cli.md |  2 +-
 docs/source/installation.md              | 16 ++++++++--------
 docs/source/quicktour.md                 |  6 +++---
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index 60834860..a161dc28 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -16,5 +16,5 @@
   - local: basic_tutorials/gated_model_access
     title: Serving Private & Gated Models
   - local: basic_tutorials/using_cli
-    title: Using TGI through CLI
+    title: Using TGI CLI
   title: Tutorials
diff --git a/docs/source/basic_tutorials/using_cli.md b/docs/source/basic_tutorials/using_cli.md
index 7f261978..072925b0 100644
--- a/docs/source/basic_tutorials/using_cli.md
+++ b/docs/source/basic_tutorials/using_cli.md
@@ -30,6 +30,6 @@ You can also find it hosted in this [Swagger UI](https://huggingface.github.io/t
 
 The same documentation can be found for `text-generation-server`.
 
-```shell
+```bash
 text-generation-server --help
 ```
diff --git a/docs/source/installation.md b/docs/source/installation.md
index 0310cf7f..1301b930 100644
--- a/docs/source/installation.md
+++ b/docs/source/installation.md
@@ -8,14 +8,14 @@ You can use TGI command-line interface (CLI) to download weights, serve and quan
 
 To install the CLI, you need to first clone the TGI repository and then run `make`.
 
-```shell
+```bash
 git clone https://github.com/huggingface/text-generation-inference.git && cd text-generation-inference
 make install
 ```
 
 If you would like to serve models with custom kernels, run
 
-```shell
+```bash
 BUILD_EXTENSIONS=True make install
 ```
 
@@ -28,7 +28,7 @@ Text Generation Inference is available on pypi, conda and GitHub.
 
 To install and launch locally, first [install Rust](https://rustup.rs/) and create a Python virtual environment with at least Python 3.9, e.g. using conda:
 
-```shell
+```bash
 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
 
 conda create -n text-generation-inference python=3.9
@@ -39,7 +39,7 @@ You may also need to install Protoc.
 
 On Linux:
 
-```shell
+```bash
 PROTOC_ZIP=protoc-21.12-linux-x86_64.zip
 curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v21.12/$PROTOC_ZIP
 sudo unzip -o $PROTOC_ZIP -d /usr/local bin/protoc
@@ -49,13 +49,13 @@ rm -f $PROTOC_ZIP
 
 On macOS, using Homebrew:
 
-```shell
+```bash
 brew install protobuf
 ```
 
 Then run the following to install Text Generation Inference:
 
-```shell
+```bash
 git clone https://github.com/huggingface/text-generation-inference.git && cd text-generation-inference
 BUILD_EXTENSIONS=True make install
 ```
 
@@ -64,7 +64,7 @@ BUILD_EXTENSIONS=True make install
 
 On some machines, you may also need the OpenSSL libraries and gcc. On Linux machines, run:
 
-```shell
+```bash
 sudo apt-get install libssl-dev gcc -y
 ```
 
@@ -72,7 +72,7 @@ sudo apt-get install libssl-dev gcc -y
 
 Once installation is done, simply run:
 
-```shell
+```bash
 make run-falcon-7b-instruct
 ```
diff --git a/docs/source/quicktour.md b/docs/source/quicktour.md
index 6abf7a82..f447bc19 100644
--- a/docs/source/quicktour.md
+++ b/docs/source/quicktour.md
@@ -4,7 +4,7 @@ The easiest way of getting started is using the official Docker container. Insta
 
 Let's say you want to deploy the [Falcon-7B Instruct](https://huggingface.co/tiiuae/falcon-7b-instruct) model with TGI. Here is an example of how to do that:
 
-```shell
+```bash
 model=tiiuae/falcon-7b-instruct
 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
 
@@ -19,7 +19,7 @@ To use GPUs, you need to install the [NVIDIA Container Toolkit](https://docs.nvi
 
 Once TGI is running, you can use the `generate` endpoint by sending requests to it. To learn more about how to query the endpoints, check the [Consuming TGI](./basic_tutorials/consuming_tgi) section.
 
-```shell
+```bash
 curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' -H 'Content-Type: application/json'
 ```
 
@@ -27,7 +27,7 @@ curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","par
 
 To see all possible flags and options, you can use the `--help` flag. It's possible to configure the number of shards, quantization, generation parameters, and more.
 
-```shell
+```bash
 docker run ghcr.io/huggingface/text-generation-inference:1.0.0 --help
 ```
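
A note for readers of this diff: the `docker run` command that actually launches the server falls between the two quicktour hunks and is not visible in the patch. Below is a minimal sketch of the full quicktour flow for context; the launch flags (`--gpus all`, `--shm-size 1g`, `-p 8080:80`, `-v $volume:/data`) and the `1.0.0` image tag are assumptions based on common TGI usage, not text taken from this patch.

```bash
# Sketch of the quicktour flow; the launch flags below are assumed,
# not part of this patch (the patch only changes code-fence languages).
model=tiiuae/falcon-7b-instruct
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

# Launch TGI (assumed flags: GPU access, shared memory size, port mapping, weights volume)
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:1.0.0 --model-id $model

# In another shell, query the generate endpoint once the server is ready
curl 127.0.0.1:8080/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
    -H 'Content-Type: application/json'
```

With the container listening on port 8080, the `curl` request shown in the patch returns a JSON response containing the generated text.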