drbh 62b2a8b67b Pali gemma modeling (#1895)
This PR adds PaliGemma modeling code.

Blog post: https://huggingface.co/blog/paligemma
Transformers PR: https://github.com/huggingface/transformers/pull/30814

Install the latest changes and run with:
```bash
# get the weights
# text-generation-server download-weights gv-hf/PaliGemma-base-224px-hf

# run TGI
text-generation-launcher --model-id gv-hf/PaliGemma-base-224px-hf
```
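Once the launcher is serving on port 3000, you can also query TGI's standard `/generate` REST route directly, without the Python client. A minimal sketch using only the standard library (the payload shape mirrors the `inputs`/`parameters` body used by the client below; the route and field names follow TGI's standard API):

```python
import json
import urllib.error
import urllib.request

# Sketch: POST a multimodal request to a locally running TGI server.
url = "http://127.0.0.1:3000/generate"
payload = {
    # The image is inlined as markdown in front of the text prompt.
    "inputs": (
        "![](https://huggingface.co/datasets/hf-internal-testing/"
        "fixtures-captioning/resolve/main/cow_beach_1.png)"
        "What animal is in this image?\n"
    ),
    "parameters": {"max_new_tokens": 30, "do_sample": False},
}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
try:
    with urllib.request.urlopen(req, timeout=60) as resp:
        print(json.loads(resp.read())["generated_text"])
except urllib.error.URLError:
    # Server not started yet (or not on this port)
    print("TGI server not reachable on 127.0.0.1:3000")
```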

A basic example sending various requests:
```python
from huggingface_hub import InferenceClient

client = InferenceClient("http://127.0.0.1:3000")

images = [
    "https://huggingface.co/datasets/hf-internal-testing/fixtures-captioning/resolve/main/cow_beach_1.png",
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png",
]

prompts = [
    "What animal is in this image?",
    "Name three colors in this image.",
    "What are 10 colors in this image?",
    "Where is the cow standing?",
    "answer en Where is the cow standing?",
    "Is there a bird in the image?",
    "Is there a cow in the image?",
    "Is there a rabbit in the image?",
    "how many birds are in the image?",
    "how many rabbits are in the image?",
]

for img in images:
    print(f"\nImage: {img.split('/')[-1]}")
    for prompt in prompts:
        # Inline the image as markdown ahead of the text prompt
        inputs = f"![]({img}){prompt}\n"
        generated_output = client.text_generation(
            inputs, max_new_tokens=30, do_sample=False, stream=False
        )
        print(f"{prompt}\n{generated_output}")

```
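The prompt list above mixes plain questions with PaliGemma's task-prefix form ("answer en Where is the cow standing?"). A small illustrative helper for building both prompt styles (`build_inputs` is a hypothetical name, not part of the TGI or `huggingface_hub` API):

```python
def build_inputs(image_url, question, language=None):
    """Build a TGI multimodal prompt string.

    The image is inlined as markdown before the text; when `language` is
    given, the question is wrapped in PaliGemma's task-prefix form,
    e.g. "answer en <question>".
    """
    prefix = f"answer {language} " if language else ""
    return f"![]({image_url}){prefix}{question}\n"


print(build_inputs("https://example.com/cow.png", "Where is the cow standing?"))
print(build_inputs("https://example.com/cow.png", "Where is the cow standing?", language="en"))
```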

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2024-07-17 05:36:58 +00:00
| File | Last commit | Date |
|------|-------------|------|
| `gptq` | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00 |
| `__init__.py` | Pad next token chooser parameters with empty logits processors (#151) | 2024-05-29 22:43:56 +02:00 |
| `convert.py` | Force weights_only (before fully breaking pickle files anyway). (#1710) | 2024-04-25 15:10:53 +03:00 |
| `debug.py` | Add Habana copyright header (#122) | 2024-04-08 18:06:21 +02:00 |
| `dist.py` | add intel xpu support for TGI (#1475) | 2024-06-10 13:16:45 +03:00 |
| `flash_attn.py` | Pali gemma modeling (#1895) | 2024-07-17 05:36:58 +00:00 |
| `hub.py` | Revamp medusa implementation so that every model can benefit. (#1588) | 2024-04-25 09:13:03 +03:00 |
| `import_utils.py` | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00 |
| `log.py` | v1.3.4 | 2024-04-22 09:08:34 +03:00 |
| `logits_process.py` | Fix dtype mismatch in HeterogeneousFrequencyPenaltyLogitsProcessor (#163) | 2024-07-03 10:57:41 +02:00 |
| `paged_attention.py` | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00 |
| `peft.py` | fix: fix local loading for .bin models (#1419) | 2024-04-22 09:17:52 +03:00 |
| `speculate.py` | chore: formatting | 2024-04-18 16:26:00 +03:00 |
| `tokens.py` | Use the generation config. (#1808) | 2024-06-10 09:53:00 +03:00 |
| `watermark.py` | Add changes from Optimum Habana's TGI folder | 2023-12-05 11:12:16 +01:00 |
| `weights.py` | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00 |