diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index ad4f29f6..1598c248 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -23,6 +23,8 @@
     title: All TGI CLI options
   - local: basic_tutorials/non_core_models
     title: Non-core Model Serving
+  - local: basic_tutorials/safety
+    title: Safety
   title: Tutorials
 - sections:
   - local: conceptual/streaming
diff --git a/docs/source/basic_tutorials/safety.md b/docs/source/basic_tutorials/safety.md
new file mode 100644
index 00000000..0b865db4
--- /dev/null
+++ b/docs/source/basic_tutorials/safety.md
@@ -0,0 +1,31 @@
+# Model safety
+
+[PyTorch uses pickle](https://pytorch.org/docs/master/generated/torch.load.html) by default, meaning that for quite a long while
+*every* model using that format has potentially been executing unintended code simply by being loaded.
+
+There is a big red warning on [Python's page for pickle](https://docs.python.org/3/library/pickle.html), but for quite a while
+this was ignored by the community. Now that AI/ML is used much more widely, we need to switch away from this format.
+
+Hugging Face is leading the effort here by creating a new format which contains pure data ([safetensors](https://github.com/huggingface/safetensors))
+and slowly but surely moving all the libraries to use it by default.
+The move is intentionally slow so that breaking changes have as little impact as possible on users.
+
+
+# TGI 2.0
+
+With the release of TGI 2.0, we take the opportunity of this major version increase to break backward compatibility for these PyTorch
+models (since they are a huge security risk for anyone deploying them).
+
+
+From now on, TGI will not automatically convert pickle files without the `--trust-remote-code` flag or `TRUST_REMOTE_CODE=true` in the environment variables.
+This flag is already used for community-defined inference code, and is therefore quite representative of the level of confidence you are giving the model providers.
+
+
+If you want to use a model that uses pickle, but you still do not want to trust the authors entirely, we recommend converting it on our Space made for that purpose.
+
+https://huggingface.co/spaces/safetensors/convert
+
+This Space will create a PR on the original model, which you can use directly regardless of merge status from the original authors. Just use
+```
+docker run .... --revision refs/pr/#ID # Or use REVISION=refs/pr/#ID in the environment
+```
diff --git a/server/text_generation_server/cli.py b/server/text_generation_server/cli.py
index 8b7839e6..4a308e08 100644
--- a/server/text_generation_server/cli.py
+++ b/server/text_generation_server/cli.py
@@ -293,6 +293,13 @@ def download_weights(
         local_pt_files = utils.download_weights(pt_filenames, model_id, revision)
 
     if auto_convert:
+        if not trust_remote_code:
+            logger.warning(
+                f"🚨🚨BREAKING CHANGE in 2.0🚨🚨: Safetensors conversion is disabled without `--trust-remote-code` because "
+                f"Pickle files are unsafe and can essentially contain remote code execution! "
+                f"Please check for more information here: https://huggingface.co/docs/text-generation-inference/basic_tutorials/safety",
+            )
+
         logger.warning(
             f"No safetensors weights found for model {model_id} at revision {revision}. "
             f"Converting PyTorch weights to safetensors."
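
Note for reviewers: the claim in `safety.md` that merely loading a pickle file can run code may be easier to judge with a concrete case. The following is a minimal sketch using only the standard library, not code from this PR; the `Payload` class and the harmless expression `"6 * 7"` are hypothetical stand-ins for what a malicious checkpoint could embed.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form is a call, not plain data."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of any object state.
        # A real attack would put an arbitrary callable and arguments here.
        return (eval, ("6 * 7",))


# Serializing records the callable and its arguments in the byte stream.
blob = pickle.dumps(Payload())

# Deserializing runs the recorded call; no Payload instance comes back.
result = pickle.loads(blob)
print(result)  # 42
```

Nothing on the loading side opts in to running `eval`: the call is triggered by `pickle.loads` alone, which is why the new warning ties conversion of pickle weights to `--trust-remote-code`.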