# syntax=docker/dockerfile:1
# Rust builder
#
# Chef stage: common base with cargo-chef preinstalled, tagged for the
# Rust 1.78 toolchain so dependency cooking is reproducible across builds.
FROM lukemathwalker/cargo-chef:latest-rust-1.78 AS chef

WORKDIR /usr/src
|
|
|
|
|
|
|
|
# Planner stage: copy only the workspace manifests and sources needed to
# compute the cargo-chef dependency recipe. The builder stage consumes
# recipe.json so compiled dependencies are cached independently of source edits.
FROM chef AS planner

COPY Cargo.toml Cargo.toml
COPY rust-toolchain.toml rust-toolchain.toml
COPY proto proto
COPY benchmark benchmark
COPY router router
COPY launcher launcher

RUN cargo chef prepare --recipe-path recipe.json
|
|
|
|
|
|
|
|
# Builder stage: needs protoc for the prost/tonic build scripts that compile
# the files under proto/.
FROM chef AS builder

# Install a pinned protoc release (v21.12). -f makes curl fail on an HTTP
# error instead of saving the error page as the zip; the archive is removed
# in the same layer so it never persists in the image.
RUN PROTOC_ZIP=protoc-21.12-linux-x86_64.zip && \
    curl -fOL https://github.com/protocolbuffers/protobuf/releases/download/v21.12/$PROTOC_ZIP && \
    unzip -o $PROTOC_ZIP -d /usr/local bin/protoc && \
    unzip -o $PROTOC_ZIP -d /usr/local 'include/*' && \
    rm -f $PROTOC_ZIP
|
2022-10-14 13:56:21 +00:00
|
|
|
|
2023-03-03 14:07:27 +00:00
|
|
|
# Cook (pre-build) all workspace dependencies from the planner's recipe.
# This layer stays cached until the recipe or lockfile changes.
COPY --from=planner /usr/src/recipe.json recipe.json
COPY Cargo.lock Cargo.lock
RUN cargo chef cook --release --recipe-path recipe.json
|
2022-10-14 13:56:21 +00:00
|
|
|
|
2023-03-03 14:07:27 +00:00
|
|
|
# Copy the real sources and build the workspace binaries. The cooked
# dependency layer above is reused across source-only changes.
COPY Cargo.toml Cargo.toml
COPY rust-toolchain.toml rust-toolchain.toml
COPY proto proto
COPY benchmark benchmark
COPY router router
COPY launcher launcher
RUN cargo build --release
|
2022-10-18 13:19:03 +00:00
|
|
|
|
2023-04-16 22:26:47 +00:00
|
|
|
# Text Generation Inference base image
# NOTE(review): the Habana image is version-pinned via its path (1.17.0 /
# pytorch 2.3.1) but still uses the :latest tag — consider pinning the tag
# or a digest for fully reproducible builds.
FROM vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:latest AS base

# Text Generation Inference base env
ENV HUGGINGFACE_HUB_CACHE=/data \
    HF_HUB_ENABLE_HF_TRANSFER=1 \
    PORT=80
|
2022-10-14 13:56:21 +00:00
|
|
|
|
2023-12-05 10:12:16 +00:00
|
|
|
# libssl.so.1.1 is not installed on Ubuntu 22.04 by default, install it.
# The downloaded .deb is deleted in the same RUN so it does not persist in
# the layer (removal in a later layer would not shrink the image).
# NOTE(review): fetched over plain http from a mirror — consider https and a
# checksum check for supply-chain safety.
RUN wget http://nz2.archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2_amd64.deb && \
    dpkg -i ./libssl1.1_1.1.1f-1ubuntu2_amd64.deb && \
    rm -f ./libssl1.1_1.1.1f-1ubuntu2_amd64.deb
|
|
|
|
|
2023-04-14 17:30:30 +00:00
|
|
|
WORKDIR /usr/src

# OS build/runtime deps. DEBIAN_FRONTEND is set inline (not via ENV) so it
# does not leak into the runtime environment; apt lists are removed in the
# same layer; packages are sorted alphabetically for diffability.
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
        git \
        libssl-dev \
        make \
    && rm -rf /var/lib/apt/lists/*
|
2022-10-14 13:56:21 +00:00
|
|
|
|
|
|
|
# Install server
COPY proto proto
COPY server server
# NOTE(review): server/Makefile is already included by `COPY server server`
# above — presumably kept for cache-granularity reasons; confirm and drop if not.
COPY server/Makefile server/Makefile
# gen-server regenerates the gRPC stubs from proto/ before installing.
# --no-cache-dir on every pip invocation keeps pip's download cache out of
# the image layer (hadolint DL3042); previously only the final install had it.
RUN cd server && \
    make gen-server && \
    pip install --no-cache-dir -r requirements.txt && \
    bash ./dill-0.3.8-patch.sh && \
    pip install --no-cache-dir git+https://github.com/HabanaAI/DeepSpeed.git@1.17.0 && \
    pip install . --no-cache-dir
|
2022-10-14 13:56:21 +00:00
|
|
|
|
2023-05-09 11:19:31 +00:00
|
|
|
# Install benchmarker
COPY --from=builder /usr/src/target/release/text-generation-benchmark /usr/local/bin/text-generation-benchmark
# Install router
COPY --from=builder /usr/src/target/release/text-generation-router /usr/local/bin/text-generation-router
# Install launcher
COPY --from=builder /usr/src/target/release/text-generation-launcher /usr/local/bin/text-generation-launcher
|
2022-10-14 13:56:21 +00:00
|
|
|
|
2023-04-14 08:12:21 +00:00
|
|
|
# Final image
FROM base

# NOTE(review): no USER directive — the container runs as root. Confirm
# whether the Habana runtime requires root; otherwise add a non-root user.
# Exec-form ENTRYPOINT/CMD: launcher is PID 1 and receives SIGTERM directly;
# CMD supplies default arguments that `docker run` can override.
ENTRYPOINT ["text-generation-launcher"]
CMD ["--json-output"]
|