mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-09-11 12:24:53 +00:00
update doc
parent faaa9dfe0a
commit 9e042bd117
@@ -197,6 +197,14 @@ Options:
           [env: MAX_WAITING_TOKENS=]
           [default: 20]
 
+```
+## MAX_BATCH_SIZE
+```shell
+      --max-batch-size <MAX_BATCH_SIZE>
+          Enforce a maximum number of requests per batch Specific flag for hardware targets that do not support unpadded inference
+
+          [env: MAX_BATCH_SIZE=]
+
 ```
 ## HOSTNAME
 ```shell