Commit Graph

  • 1cae3197c4
    Improve tool call message processing (#3036) drbh 2025-02-21 04:30:29 -0500
  • 7e60666711
    ?? zstd Nicolas Patry 2025-02-21 10:18:56 +0100
  • 3498f6085e
    Update Gradio ChatInterface configuration in consuming_tgi.md (#3042) Adrien Gallouët 2025-02-21 10:11:28 +0100
  • 532e72d5c5
    Proper consistent naming. Nicolas Patry 2025-02-21 10:08:10 +0100
  • 142a49a80d
    Simplify logs2. (#3045) Nicolas Patry 2025-02-21 10:03:40 +0100
  • 821025e60c
    Wrong output. Nicolas Patry 2025-02-21 10:00:14 +0100
  • cfadfe5a1e
    Use rotary kernel from the Hub Daniël de Kok 2025-02-20 10:19:17 +0000
  • 45787f15bb
    Fixing the condition ? Nicolas Patry 2025-02-21 09:47:11 +0100
  • 262abb8223
    ci: doing a precompilation step (with a different token). Nicolas Patry 2025-02-20 15:57:33 +0100
  • 06dfe9abfe
    fix qwen2 vl crash in continuous batching (#3004) Wang, Yi 2025-02-21 07:36:45 +0800
  • 3770344529
    fix: adjust message types in tests drbh 2025-02-20 19:38:45 +0000
  • d4e434c52a
    Invalid id. Nicolas Patry 2025-02-20 19:21:47 +0100
  • ce5ec9fe47
    Always push to docker.io even during releases. Nicolas Patry 2025-02-20 19:13:55 +0100
  • f3c4d19a56
    Use zstd Nicolas Patry 2025-02-20 19:12:47 +0100
  • 39eeb34d77
    Working around the github runner thing. Nicolas Patry 2025-02-20 18:48:03 +0100
  • debf032ca3
    test(neuron): no error anymore when requesting too many tokens David Corvoysier 2025-02-20 17:26:56 +0000
  • 19490afb25
    Changing the scope from module to session to fix the event_loop issue. Nicolas Patry 2025-02-20 18:07:46 +0100
  • 9de97f99c8
    Simplify logs2. Nicolas Patry 2025-02-20 17:50:58 +0100
  • b4fefdcc45
    Fighting docker in docker. Nicolas Patry 2025-02-20 16:55:50 +0100
  • 215f39fb9d
    Pull before running the container python docker doesn't handle zstd correctly it seems. Nicolas Patry 2025-02-20 16:41:15 +0100
  • d39e002fa5
    feat(neuron): avoid installing CUDA in image David Corvoysier 2025-02-20 10:13:49 +0000
  • 7d6ff64c13
    test(neuron): use smaller llama model David Corvoysier 2025-02-19 14:51:43 +0000
  • e89043901d
    fix(neuron): avoid using Levenshtein David Corvoysier 2025-02-19 14:05:29 +0000
  • 393753bc0b
    refactor: remove sagemaker entry-point David Corvoysier 2025-02-19 09:00:54 +0000
  • 49dfdc3f8a
    fix(neuron): export models from container in test fixtures David Corvoysier 2025-02-18 17:47:54 +0000
  • 0e002962da
    feat: add neuron case to build ci drbh 2025-02-17 15:22:02 +0000
  • 8b04be3e04
    review: --privileged should be the exception David Corvoysier 2025-02-18 14:15:08 +0000
  • c7f49d83ff
    review: remove ureq pinned version David Corvoysier 2025-02-18 13:54:31 +0000
  • bc95ef2e8b
    review: do not use latest tag David Corvoysier 2025-02-18 13:50:25 +0000
  • 4a16e8eec2
    test: add --neuron option David Corvoysier 2025-02-18 12:30:34 +0000
  • 2c37e8acbe
    test(neuron): merge integration tests and fixtures David Corvoysier 2025-02-18 10:32:10 +0000
  • 542eee6ca7
    fix(neuron): increase ulimit when building image David Corvoysier 2025-02-17 15:22:36 +0000
  • 0b7c7c3d18
    feat(neuron): add server and integration tests David Corvoysier 2025-02-12 09:10:47 +0000
  • f085204c5e
    feat(neuron): add server standalone installation David Corvoysier 2025-02-11 15:51:09 +0000
  • 13caf6d087
    feat: add neuron backend David Corvoysier 2025-02-11 09:53:16 +0000
  • 21493042ef
    re-enable recompressed Corentin REGAL 2025-02-20 09:42:46 +0100
  • 65c3b3bc21
    re-enable zstd Corentin REGAL 2025-02-14 13:50:56 +0100
  • 3953b8aaf7
    revert zstd test Corentin REGAL 2025-02-14 09:50:28 +0100
  • 87c5e19072
    test fix ci Corentin REGAL 2025-02-06 15:55:20 +0100
  • 51e7d98b6b
    Compress Docker layers with zstd instead of gzip Corentin REGAL 2025-02-03 12:11:52 +0100
  • ed96ba6503
    flashinfer 0.2.0.post1 -> post2 (#3040) Daniël de Kok 2025-02-20 12:34:20 +0100
  • 515c302f1d
    update ipex and torch to 2.6 for cpu ipex cpu 2.6 support topk_group in moe fusion ops Wang, Yi A 2025-02-20 11:32:22 +0000
  • c808411531
    Update Gradio ChatInterface configuration in consuming_tgi.md Adrien Gallouët 2025-02-20 11:20:47 +0100
  • 166cae4da5
    Fix ruff stuff. Nicolas Patry 2025-02-20 09:17:20 +0100
  • feaa2477b7
    update ipex and torch to 2.6 for cpu (#3039) Wang, Yi 2025-02-20 16:12:28 +0800
  • a02ab4b6ae
    flashinfer 0.2.0.post1 -> post2 Daniël de Kok 2025-02-20 07:56:00 +0000
  • 4fa8512d99
    fix: ruff lint remove unused import drbh 2025-02-19 19:41:17 -0500
  • bcc44890a8
    fix: support tool call id in template and remove unnecessary changes drbh 2025-02-20 00:14:20 +0000
  • 56f2d66828
    fix: rerun update docs drbh 2025-02-19 03:12:08 +0000
  • bddcf9be6c
    fix: bump utopia, openapi doc version and improve test drbh 2025-02-19 03:10:01 +0000
  • ac50b14afb
    feat: add test and serialize tool messages drbh 2025-02-19 00:47:53 +0000
  • f5e1a16582
    add tool_calls field to Message struct sailesh duddupudi 2025-02-13 20:24:08 +0000
  • e14617d88d
    make content field optional in chat request sailesh duddupudi 2025-02-13 19:33:38 +0000
  • 230aa25641
    feat: Add the parsing of HF_HUB_USER_AGENT_ORIGIN environment variable for telemetry (#3027) Hugo Larcher 2025-02-19 21:09:12 +0100
  • 9c89d0070e
    Having less logs in case of failure for checking CI more easily. (#3037) Nicolas Patry 2025-02-19 17:01:33 +0100
  • fde3234cbc
    Using public external registry (to use external runners for CI). (#3031) Nicolas Patry 2025-02-19 14:53:14 +0100
  • 9c1f2574cb
    Ignore entirely the API. Nicolas Patry 2025-02-19 14:52:16 +0100
  • 41d2c559a4
    Cleaning up the versions to uv for the client. Nicolas Patry 2025-02-19 13:07:43 +0100
  • d6a0c67e2f
    feat: add initial qwen2.5-vl model and test (#2971) drbh 2025-02-19 06:38:20 -0500
  • 4070bf2290
    Fixing trtllm tests. Nicolas Patry 2025-02-19 11:58:25 +0100
  • 230b25165d
    Having less logs in case of failure for checking CI more easily. Nicolas Patry 2025-02-19 11:56:34 +0100
  • 05333b7cbe
    fix: refactor/simplify conditionals drbh 2025-02-18 23:36:02 +0000
  • 3eef08b7a1
    make content field optional in Message for role=assistant Andrew Reed 2025-02-18 22:31:56 +0000
  • a7448661f7
    Improve Transformers support (#2970) Cyril Vallez 2025-02-18 19:04:34 +0100
  • fcdb18d1af
    Support xccl distributed backend Dmitry Rogozhkin 2025-02-14 17:08:27 +0000
  • ff921d22ad
    Fixing the external registry. Nicolas Patry 2025-02-18 15:01:40 +0100
  • 407492aa7b
    Fix build. Nicolas Patry 2025-02-18 13:14:28 +0100
  • 0711b545e5
    bump version Cyril Vallez 2025-02-18 12:22:07 +0100
  • 5543fdc765
    It's find in some machine. using hf_hub::api::sync::Api to download c… (#3030) Nicolas Patry 2025-02-18 12:19:51 +0100
  • 194fa637a0
    bump transformers version Cyril Vallez 2025-02-18 12:04:35 +0100
  • b8a4928d0e
    Pinning trufflehog. (#3032) Nicolas Patry 2025-02-18 12:03:41 +0100
  • aeb6429bd1
    add gpt neox Cyril Vallez 2025-02-18 11:41:51 +0100
  • 188b150b57
    Much better support Cyril Vallez 2025-01-30 15:54:05 +0100
  • 01f688b717
    Pinning trufflehog. Nicolas Patry 2025-02-18 12:02:15 +0100
  • cc9565e805
    Using public external registry (to use external runners for CI). Nicolas Patry 2025-02-18 12:00:13 +0100
  • 09ca20e781
    It's find in some machine. using hf_hub::api::sync::Api to download config is not successful which will make warmup fail since attribute like max_position_embeddings could not be got. update hf-hub to the latest version could fix it Wang, Yi A 2025-02-08 13:56:58 +0000
  • 8a1cfd6122
    Add loop_controls feature to minijinja to handle {% break %} (#2998) Alvaro Bartolome 2025-02-18 10:33:22 +0100
  • 794ec58b75
    Update README.md (#3024) celsowm 2025-02-18 06:08:28 -0300
  • f0ed76583c
    Use eetq kernel from the hub (#3029) Daniël de Kok 2025-02-18 10:03:53 +0100
  • 9052ca9f80
    Fixing the CI. Nicolas Patry 2025-02-18 10:00:14 +0100
  • e4e6ea2598
    fix: vendor processor and config from transformers drbh 2025-02-17 18:46:34 +0000
  • 3b096626e8
    fix: bump requirements file too drbh 2025-02-17 14:54:49 +0000
  • ef0305dfe4
    fix: bump integration test deps for openai drbh 2025-02-17 14:40:31 +0000
  • bba0329f8f
    fix: ruff lint drbh 2025-02-17 09:26:29 -0500
  • 81786840d7
    fix: bump client tests for api changes drbh 2025-02-17 14:19:52 +0000
  • 3f035fd8f2
    fix: bump client test expected prefill drbh 2025-02-17 13:56:59 +0000
  • 5f16e79e6a
    fix: remove snap with incorrect naming drbh 2025-02-17 13:41:06 +0000
  • a840d1d3b8
    fix: adjust stream, improve tests and add openai client test drbh 2025-02-17 13:38:49 +0000
  • 2729077d9c
    fix: ensure wrapping curly is not included drbh 2025-02-11 15:05:14 +0000
  • cf70182447
    fix: adjust streaming tool response drbh 2025-02-11 14:22:03 +0000
  • f5626a0576
    fix: bump openapi spec drbh 2025-02-10 15:14:00 +0000
  • aad1901aa5
    feat: serialize function definition with serialize_as_string drbh 2025-02-07 22:27:24 +0000
  • 7d852cde78
    fix: Functioncall is actually a bit different than the deprecated function definition type Nicolas Casademont 2025-02-04 11:09:55 +0100
  • af40829827
    fix: Allow back arguments in function definition and the corresponding test Nicolas Casademont 2025-02-04 11:07:42 +0100
  • 431deec9d6
    feat: Make streaming for tool calling behave the same as the open ai api Nicolas Casademont 2025-01-24 14:42:25 +0100
  • 39de46129e
    fix: Adapt function call response to return a json string for arguments Nicolas Casademont 2025-01-24 11:47:01 +0100
  • 38a1987475
    Use eetq kernel from the hub Daniël de Kok 2025-02-17 13:07:09 +0000
  • f866e9853c
    fix: trufflehog Hugo Larcher 2025-02-17 16:29:28 +0100
  • 95d1172347
    fix: bump ci build yaml pr-3018-ci-branch drbh 2025-02-17 15:24:25 +0000
  • 9501956383
    fix(neuron): increase ulimit when building image David Corvoysier 2025-02-17 15:22:36 +0000