feat: fix typo and add more diagrams

drbh 2024-04-30 14:54:11 -04:00
parent d48846351d
commit a2e48ec3a2
3 changed files with 15 additions and 4 deletions


@@ -26,7 +26,7 @@
   - local: basic_tutorials/safety
     title: Safety
   - local: basic_tutorials/using_guidance
-    title: Using Guidance, JSON, tools (via outlines)
+    title: Using Guidance, JSON, tools
   - local: basic_tutorials/visual_language_models
     title: Visual Language Models
   title: Tutorials
@@ -46,6 +46,6 @@
   - local: conceptual/speculation
     title: Speculation (Medusa, ngram)
   - local: conceptual/guidance
-    title: How Guidance Works
+    title: How Guidance Works (via outlines)
   title: Conceptual Guides


@@ -122,7 +122,7 @@ print(response.json())
 
 ### JSON Schema Integration
 
-If Pydantic's not your style, go raw with direct JSON Schema integration. This is simliar to the first example but with programmatic control.
+If Pydantic's not your style, go raw with direct JSON Schema integration. This is similar to the first example but with programmatic control.
 
 ```python
 import requests
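The hunk above cuts off at the start of the doc's Python snippet. For readers following along, a minimal sketch of what a direct JSON Schema request to TGI might look like (the URL, prompt, and schema here are illustrative placeholders; the `grammar` parameter shape follows TGI's guidance API):

```python
# Sketch of a raw JSON Schema grammar request to a TGI /generate endpoint.
# Assumes a TGI server is running locally; adjust the URL for your deployment.
# import requests  # uncomment to actually send the request

# A plain JSON Schema, with no Pydantic involved.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# The schema is passed as a grammar of type "json" in the request parameters.
payload = {
    "inputs": "Extract the person: John is 30 years old.",
    "parameters": {
        "grammar": {"type": "json", "value": schema},
        "max_new_tokens": 64,
    },
}

# response = requests.post(
#     "http://localhost:8080/generate",
#     json=payload,
#     headers={"Content-Type": "application/json"},
# )
# print(response.json())
```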


@@ -23,7 +23,6 @@ However these use cases can span a wide range of applications, such as:
 - provide reliable and consistent output for downstream tasks
 - extract data from multimodal inputs
 
 ## How it works?
 
 Diving into the details, guidance is enabled by including a grammar with a generation request that is compiled, and used to modify the chosen tokens.
@@ -31,6 +30,18 @@ Diving into the details, guidance is enabled by including a grammar with a generation request that is compiled, and used to modify the chosen tokens.
 This process can be broken down into the following steps:
 
 1. A request is sent to the backend, it is processed and placed in batch. Processing includes compiling the grammar into a finite state machine and a grammar state.
+
+<div class="flex justify-center">
+    <img
+        class="block dark:hidden"
+        src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/tgi/request-to-batch.gif"
+    />
+    <img
+        class="hidden dark:block"
+        src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/tgi/request-to-batch-dark.gif"
+    />
+</div>
+
 2. The model does a forward pass over the batch. This returns probabilities for each token in the vocabulary for each request in the batch.
 3. The process of choosing one of those tokens is called `sampling`. The model samples from the distribution of probabilities to choose the next token. In TGI all of the steps before sampling are called `processor`. Grammars are applied as a processor that masks out tokens that are not allowed by the grammar.
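
The grammar-as-processor idea in the steps above can be sketched roughly as follows. This is a toy illustration, not TGI's actual internals: the FSM table, vocabulary, and function names are invented for the example.

```python
# Toy sketch of grammar-guided sampling. A compiled grammar acts as a finite
# state machine mapping (state, token) -> next state; the processor masks out
# every token that has no transition from the current grammar state.
import math

# Toy vocabulary of 4 tokens: 0='{', 1='"a"', 2=':', 3='}'.
# This FSM accepts only the sequence { "a" : } (state 4 = accept).
fsm = {
    (0, 0): 1,
    (1, 1): 2,
    (2, 2): 3,
    (3, 3): 4,
}

def grammar_mask(logits, state):
    """Set the logits of grammar-disallowed tokens to -inf before sampling."""
    return [
        logit if (state, token) in fsm else -math.inf
        for token, logit in enumerate(logits)
    ]

def advance(state, token):
    """Step the grammar state after a token has been sampled."""
    return fsm[(state, token)]

# After the forward pass, the processor masks the logits; sampling then only
# ever picks tokens the grammar allows. From the start state, only '{' survives:
logits = [0.5, 1.2, -0.3, 2.0]
masked = grammar_mask(logits, state=0)
# masked -> [0.5, -inf, -inf, -inf]
```

With the mask applied, even greedy or temperature sampling cannot leave the grammar, which is why the output is guaranteed to match the requested structure.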