text-generation-inference/server/text_generation_server/models
Vincent Brouwers 8a5f564942
Fix Falcon weight mapping for H2O.ai checkpoints (#953)
# What does this PR do?
During the safetensor conversion, duplicate weights are removed.
However, which of the duplicates gets removed, differs per checkpoint.
In some, like `h2oai/h2ogpt-oig-oasst1-falcon-40b`, the weight
`transformer.word_embeddings.weight` gets removed. In others,
`lm_head.weight` gets removed. Long story long, we need to support both.
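To illustrate why only one of the two names survives conversion: with tied embeddings, both names point at the same underlying storage, and a dedup pass keeps a single name per storage. A minimal sketch (the helper name is hypothetical, not the converter's actual code):

```python
import torch

# Two names aliasing the same storage, as with tied embeddings:
# `transformer.word_embeddings.weight` and `lm_head.weight`.
shared = torch.zeros(4, 4)
state_dict = {
    "transformer.word_embeddings.weight": shared,
    "lm_head.weight": shared,
}

def drop_shared_duplicates(tensors):
    """Keep one name per underlying storage. Which name survives
    depends on iteration order, i.e. on the checkpoint."""
    seen = set()
    out = {}
    for name, tensor in tensors.items():
        key = tensor.data_ptr()
        if key not in seen:
            seen.add(key)
            out[name] = tensor
    return out

deduped = drop_shared_duplicates(state_dict)
# Exactly one of the two aliases remains.
```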

Originally, f018143 mapped `lm_head` to `word_embeddings`. Then ac736fd
switched this around. This commit merges them and allows for both.
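The merged behavior can be sketched as a fallback lookup over both names (illustrative only; `load_lm_head` and the `get_tensor` callable are assumptions, not the actual TGI weight-loading API):

```python
def load_lm_head(get_tensor):
    """Try both names for the tied LM head / embedding weight,
    since safetensor conversion may have dropped either one."""
    for name in ("lm_head.weight", "transformer.word_embeddings.weight"):
        try:
            return get_tensor(name)
        except KeyError:
            continue
    raise KeyError(
        "Neither `lm_head.weight` nor "
        "`transformer.word_embeddings.weight` found in checkpoint"
    )

# Usage with a dict-backed checkpoint where only one alias survived:
weights = {"transformer.word_embeddings.weight": [[0.0]]}
tensor = load_lm_head(weights.__getitem__)
```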

## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Did you read the [contributor
guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),
      Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the
[forum](https://discuss.huggingface.co/)? Please add a link
      to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes?
Here are the
[documentation
guidelines](https://github.com/huggingface/transformers/tree/main/docs),
and
[here are tips on formatting
docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?


## Who can review?

@Narsil, you wrote both commits I referenced in this PR. I think you'll
understand this change :)
2023-08-31 21:15:14 +02:00
custom_modeling Fix f180 (#951) 2023-08-30 11:09:46 +02:00
__init__.py Adding Idefics multi modal model. (#842) 2023-08-17 14:38:49 +02:00
bloom.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
flash_causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
flash_llama.py fix: LlamaTokenizerFast to AutoTokenizer at flash_llama.py (#619) 2023-08-14 14:20:18 +02:00
flash_neox.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
flash_rw.py Fix Falcon weight mapping for H2O.ai checkpoints (#953) 2023-08-31 21:15:14 +02:00
flash_santacoder.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
galactica.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
gpt_neox.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
idefics_causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
idefics.py Adding Idefics multi modal model. (#842) 2023-08-17 14:38:49 +02:00
model.py Fix typing in Model.generate_token (#733) 2023-07-31 14:35:14 +02:00
mpt.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
opt.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
rw.py "Fix" for rw-1b. (#860) 2023-08-17 09:05:41 +02:00
santacoder.py Directly load GPTBigCode to specified device (#618) 2023-07-21 11:27:31 +02:00
seq2seq_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
t5.py fix(server): T5 weights names. (#582) 2023-07-12 10:01:42 +02:00
types.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00