mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-09-11 12:24:53 +00:00
Soft deprecation with clear text explaining the rationale.
parent ac118a5ad0
commit 96846f633a
@@ -23,6 +23,8 @@
     title: All TGI CLI options
   - local: basic_tutorials/non_core_models
     title: Non-core Model Serving
+  - local: basic_tutorials/safety
+    title: Safety
   title: Tutorials
 - sections:
   - local: conceptual/streaming
31
docs/source/basic_tutorials/safety.md
Normal file
@@ -0,0 +1,31 @@
# Model safety

[PyTorch uses pickle](https://pytorch.org/docs/master/generated/torch.load.html) by default, meaning that for quite a long while
*every* model using that format has potentially been executing unintended code merely by being loaded.
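
To make the risk concrete, here is a minimal sketch using only the standard library. The `Payload` class and the `PICKLE_DEMO_RAN` variable are made up for illustration, and the payload is deliberately harmless; a real attack would put any expression it likes in the string.

```python
import os
import pickle


class Payload:
    """Hypothetical stand-in for a malicious object hidden in a pickled checkpoint."""

    def __reduce__(self):
        # pickle stores "call eval(<string>)" instead of the object's own data.
        expr = "__import__('os').environ.setdefault('PICKLE_DEMO_RAN', '1')"
        return (eval, (expr,))


blob = pickle.dumps(Payload())

# Loading alone triggers the call; we never invoke anything on Payload ourselves.
os.environ.pop("PICKLE_DEMO_RAN", None)
result = pickle.loads(blob)
side_effect_happened = os.environ.get("PICKLE_DEMO_RAN") == "1"
```

After `pickle.loads`, the environment variable is set even though the caller only asked to deserialize some bytes. The same mechanism applies to any file that is unpickled, model weights included.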

There is a big red warning on [Python's page for pickle](https://docs.python.org/3/library/pickle.html), but for quite a while
this was ignored by the community. Now that AI/ML is used far more widely, we need to switch away from this format.

Hugging Face is leading the effort here by creating a new format that contains pure data ([safetensors](https://github.com/huggingface/safetensors))
and moving all the libraries, slowly but surely, to use it by default.
The move is intentionally slow so that breaking changes have as little impact as possible on users.
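
What "pure data" means can be sketched with the standard library alone. The snippet below is a simplified illustration of the layout described in the safetensors README (an 8-byte little-endian header length, a JSON header, then the raw tensor bytes); it is not the library's implementation, and the helper names `write_tensor_file` and `read_tensor_file` are hypothetical. The real library also handles validation, metadata and alignment.

```python
import json
import os
import struct
import tempfile


def write_tensor_file(path, tensors):
    """Write {name: (dtype, shape, raw_bytes)} in a safetensors-like layout."""
    header, payload, offset = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            # Offsets are relative to the start of the byte buffer.
            "data_offsets": [offset, offset + len(raw)],
        }
        payload += raw
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte little-endian length
        f.write(header_bytes)  # JSON description of the tensors
        f.write(payload)  # raw tensor bytes, nothing executable


def read_tensor_file(path):
    """Parse the file back using only json.loads and byte slicing."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        data = f.read()
    return {
        name: (meta["dtype"], meta["shape"], data[slice(*meta["data_offsets"])])
        for name, meta in header.items()
    }


weights = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
write_tensor_file(path, {"layer.weight": ("F32", [2, 2], weights)})
loaded = read_tensor_file(path)
```

Nothing in the read path can call back into arbitrary code: the header is JSON and the tensors are byte ranges, which is the property that pickle lacks.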


# TGI 2.0

With the release of TGI 2.0, we are taking the opportunity of this major version increase to break backward compatibility for these PyTorch
models (since they are a huge security risk for anyone deploying them).


From now on, TGI will not automatically convert pickle files unless the `--trust-remote-code` flag is passed or `TRUST_REMOTE_CODE=true` is set in the environment.
This flag is already used for community-defined inference code, and is therefore quite representative of the level of confidence you are placing in the model providers.


If you want to use a model that uses pickle but do not want to trust its authors entirely, we recommend converting it in our Space made for that purpose:

https://huggingface.co/spaces/safetensors/convert

This Space will create a PR on the original model, which you can use directly regardless of whether the original authors have merged it. Just use:
```
docker run .... --revision refs/pr/#ID # Or use REVISION=refs/pr/#ID in the environment
```
@@ -250,10 +250,13 @@ def download_weights(

         if auto_convert:
             if not trust_remote_code:
-                raise RuntimeError(
-                    f"Safetensors conversion is disabled without `--trust-remote-code` because "
+                import warnings
+
+                warnings.warn(
+                    f"🚨🚨BREAKING CHANGE in 2.0🚨🚨: Safetensors conversion is disabled without `--trust-remote-code` because "
                     f"Pickle files are unsafe and can essentially contain remote code execution."
                     f"Please check the safety checks on the hub and ideally make the conversion in a sandbox or in a space."
+                    f"Please check for more information here: https://huggingface.co/docs/text-generation-inference/basic_tutorials/safety",
+                    UserWarning,
                 )

             logger.warning(
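
The practical difference between the old `raise RuntimeError(...)` and the new `warnings.warn(..., UserWarning)` can be sketched as follows. `convert_pickle_weights` is a hypothetical stand-in written for this illustration, not TGI's actual function, and the message text is shortened.

```python
import warnings


def convert_pickle_weights(trust_remote_code: bool) -> str:
    """Hypothetical stand-in for the conversion step; not TGI's real function."""
    if not trust_remote_code:
        # Soft deprecation: report the problem but keep going.
        warnings.warn(
            "Safetensors conversion without --trust-remote-code is unsafe.",
            UserWarning,
        )
    return "converted"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    outcome = convert_pickle_weights(trust_remote_code=False)

categories = [w.category for w in caught]
```

The call still returns normally and the warning is only recorded, which is what makes the deprecation "soft". Anyone who prefers the strict behaviour can turn such warnings back into exceptions with Python's standard filter, for example `python -W error::UserWarning`.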