Commit Graph

2 Commits

Author SHA1 Message Date
Antoni Baum
2a101207d4
fix(server): Handle loading from local files for MPT (#534)
This PR allows the MPT model to be loaded from local files. Without this
change, an exception will be thrown by `hf_hub_download` function if
`model_id` is a local path.
2023-07-04 18:37:25 +02:00
Nicolas Patry
1da07e85aa
feat(server): Add Non flash MPT. (#514)
# What does this PR do?


This adds a non flash version of MPT.
Flash is harder because we need to create a bias ready cuda kernel of
flash attention.

Fixes
https://github.com/huggingface/text-generation-inference/issues/361
Fixes
https://github.com/huggingface/text-generation-inference/issues/491
Fixes
https://github.com/huggingface/text-generation-inference/issues/290
2023-07-03 13:01:46 +02:00