text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-04-20 14:22:08 +00:00

Author	SHA1	Message	Date
Nicolas Patry	08bbfa16a1	Preparing for release. (#3060) * Preparing for release. * Upgrade doc. * Fix docs auto-generated. * Fix update doc along.	2025-03-04 16:47:10 +01:00
Nicolas Patry	c9d68945cc	Prepare for release 3.1.0 (#2972) * Prepare for release 3.1.0 * Back on main flake. * Fixing stuff. * Upgrade to moe-kernels 0.8.2 for Hip support. * Deactivating the flaky test.	2025-01-31 14:19:01 +01:00
Nicolas Patry	29a0893b67	Tmp tp transformers (#2942) * Upgrade the version number. * Remove modifications in Lock. * Tmp branch to test transformers backend with 2.5.1 and TP>1 * Fixing the transformers backend. inference_mode forces the use of `aten.matmul` instead of `aten.mm` the former doesn't have sharding support crashing the transformers TP support. `lm_head.forward` also crashes because it skips the hook that cast/decast the DTensor. Torch 2.5.1 is required for sharding support. * Put back the attention impl. * Revert the flashinfer (this will fails). * Building AOT. * Using 2.5 kernels. * Remove the archlist, it's defined in the docker anyway.	2025-01-23 18:07:30 +01:00
Nicolas Patry	07b01293c5	Prepare patch release. (#2829)	2024-12-11 21:03:50 +01:00
Nicolas Patry	042791fbd5	Prep new version (#2810) * New version. * Link fixup. * Update docs. * FIxup.	2024-12-09 20:42:42 +01:00
OlivierDehaene	780531ec77	chore: prepare 2.4.1 release (#2773) * chore: prepare 2.4.1 release * fix tests * fmt	2024-11-22 17:26:15 +00:00
OlivierDehaene	a6b02da971	chore: prepare 2.4.0 release (#2695)	2024-10-25 21:10:49 +00:00
Nicolas Patry	f6e2f05b16	New release 2.3.1 (#2604) * New release 2.3.1 * Update doc number	2024-10-03 14:43:49 +02:00
Nicolas Patry	169178b937	Preparing for release. (#2540) * Preparing for release. * Upgrade version in docs.	2024-09-20 17:42:04 +02:00
Wang, Yi	9883f3b40e	update doc with intel cpu part (#2420) * update doc with intel cpu part Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Apply suggestions from code review we do not use latest ever in documentation, it causes too many issues for users. Release number get update on every release. --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2024-08-29 17:42:02 +02:00
Nicolas Patry	5d121a9705	Preparing for release. (#2285) * Preparing for release. * Updating docs. * Fixing token within the docker image for the launcher.	2024-07-23 16:20:17 +02:00
Wang, Yi	07e240ca37	add doc for intel gpus (#2181) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2024-07-08 15:57:06 +02:00

12 Commits