text-generation-inference/server/text_generation_server/models
Vincent Brouwers 8a5f564942
Fix Falcon weight mapping for H2O.ai checkpoints (#953)
# What does this PR do?
During the safetensor conversion, duplicate weights are removed.
However, which of the duplicates gets removed, differs per checkpoint.
In some, like `h2oai/h2ogpt-oig-oasst1-falcon-40b`, the weight
`transformer.word_embeddings.weight` gets removed. In others,
`lm_head.weight` gets removed. Long story long, we need to support both.
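To illustrate why only one of the two names survives conversion: with tied embeddings, both names point at the same underlying storage, and a dedup pass keeps a single name per storage. A minimal sketch (the helper name is hypothetical, not the converter's actual code):

```python
import torch

# Two names aliasing the same storage, as with tied embeddings:
# `transformer.word_embeddings.weight` and `lm_head.weight`.
shared = torch.zeros(4, 4)
state_dict = {
    "transformer.word_embeddings.weight": shared,
    "lm_head.weight": shared,
}

def drop_shared_duplicates(tensors):
    """Keep one name per underlying storage. Which name survives
    depends on iteration order, i.e. on the checkpoint."""
    seen = set()
    out = {}
    for name, tensor in tensors.items():
        key = tensor.data_ptr()
        if key not in seen:
            seen.add(key)
            out[name] = tensor
    return out

deduped = drop_shared_duplicates(state_dict)
# Exactly one of the two aliases remains.
```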

Originally, f018143 mapped `lm_head` to `word_embeddings`. Then ac736fd
switched this around. This commit merges them and allows for both.
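The merged behavior can be sketched as a fallback lookup over both names (illustrative only; `load_lm_head` and the `get_tensor` callable are assumptions, not the actual TGI weight-loading API):

```python
def load_lm_head(get_tensor):
    """Try both names for the tied LM head / embedding weight,
    since safetensor conversion may have dropped either one."""
    for name in ("lm_head.weight", "transformer.word_embeddings.weight"):
        try:
            return get_tensor(name)
        except KeyError:
            continue
    raise KeyError(
        "Neither `lm_head.weight` nor "
        "`transformer.word_embeddings.weight` found in checkpoint"
    )

# Usage with a dict-backed checkpoint where only one alias survived:
weights = {"transformer.word_embeddings.weight": [[0.0]]}
tensor = load_lm_head(weights.__getitem__)
```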

## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Did you read the [contributor
guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),
      Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the
[forum](https://discuss.huggingface.co/)? Please add a link
      to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes?
Here are the
[documentation
guidelines](https://github.com/huggingface/transformers/tree/main/docs),
and
[here are tips on formatting
docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?


## Who can review?

@Narsil, you wrote both commits I referenced in this PR. I think you'll
understand this change :)
2023-08-31 21:15:14 +02:00
custom_modeling Fix f180 (#951) 2023-08-30 11:09:46 +02:00
__init__.py Adding Idefics multi modal model. (#842) 2023-08-17 14:38:49 +02:00
bloom.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
flash_causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
flash_llama.py fix: LlamaTokenizerFast to AutoTokenizer at flash_llama.py (#619) 2023-08-14 14:20:18 +02:00
flash_neox.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
flash_rw.py Fix Falcon weight mapping for H2O.ai checkpoints (#953) 2023-08-31 21:15:14 +02:00
flash_santacoder.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
galactica.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
gpt_neox.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
idefics_causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
idefics.py Adding Idefics multi modal model. (#842) 2023-08-17 14:38:49 +02:00
model.py Fix typing in Model.generate_token (#733) 2023-07-31 14:35:14 +02:00
mpt.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
opt.py feat(server): Using quantize_config.json instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
rw.py "Fix" for rw-1b. (#860) 2023-08-17 09:05:41 +02:00
santacoder.py Directly load GPTBigCode to specified device (#618) 2023-07-21 11:27:31 +02:00
seq2seq_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
t5.py fix(server): T5 weights names. (#582) 2023-07-12 10:01:42 +02:00
types.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00