Text-generation-inference - Neuron backend for AWS Trainium and Inferentia2
Description
This is the TGI backend for the AWS Trainium and Inferentia family of chips.
This backend is composed of:
- the AWS Neuron SDK,
- the legacy v2 TGI launcher and router,
- a neuron specific inference server for text-generation.
Usage
Please refer to the official documentation.
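As a rough illustration of what usage looks like, a TGI Neuron image is typically started as a container with the Neuron devices passed through to it. The image tag, device path, and model id below are illustrative assumptions; the authoritative invocation and flags are in the official documentation.

```shell
# Sketch only: tag, device name and model id are assumptions,
# not taken from this README. Consult the official docs for the
# exact image tag and required options.
docker run -p 8080:80 \
    --device=/dev/neuron0 \
    -v $HOME/data:/data \
    ghcr.io/huggingface/text-generation-inference:latest-neuron \
    --model-id <model-id>
```

The `--device` flag exposes a Neuron device to the container; instances with multiple NeuronCores may need several such flags.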
Build your own image
The simplest way to build TGI with the neuron backend is to use the provided Makefile:
$ make -C backends/neuron image
Alternatively, you can build the image directly from the top directory using a command similar to the one defined
in the Makefile under the image target.
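A manual build would then look something like the sketch below. The Dockerfile path and image tag are assumptions inferred from the repository layout; check the `image` target in `backends/neuron/Makefile` for the exact command and any build arguments it passes.

```shell
# Sketch of a manual build from the repository root, assuming the
# Dockerfile lives under backends/neuron/ and mirroring what the
# Makefile's `image` target does. Verify against the Makefile.
docker build --rm \
    -f backends/neuron/Dockerfile \
    -t text-generation-inference:latest-neuron \
    .
```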