5 Commits

Author SHA1 Message Date
drbh
68a9685f1b
fix: adjust default tool choice (#2244)
* fix: adjust default tool choice

* feat: improve tool choice syntax and response parsing/errors

* fix: remove dev tests

* feat: add ToolChoice to docs
2024-07-19 11:12:02 -04:00
Wang, Yi
58effe78b5
update to metrics 0.23.0 or could work with metrics-exporter-promethe… (#2190)
update to metrics 0.23.0 or could work with metrics-exporter-prometheus 0.15.1

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2024-07-08 16:03:59 +02:00
drbh
9eefb2f672
fix: prefer serde structs over custom functions (#2127)
* fix: prefer enum for chat object

* fix: adjust typo

* fix: enum CompletionType not ObjectType

* fix: adjust typo

* feat: leverage serde for conditional deser

* fix: adjust HubTokenizerConfig after rebase

* fix: update create_post_processor logic for token type

* fix: adjust unwrap syntax in template

* Fixing the post processor.

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2024-07-01 15:08:05 +02:00
drbh
42aa8ee1bb
PR #2049 CI run (#2054)
* Use minijinja's pycompat mode for python methods

* fix: cargo fmt lint for pre commit

---------

Co-authored-by: Armin Ronacher <armin.ronacher@active-4.com>
2024-06-13 11:53:49 -04:00
OlivierDehaene
757223b352
feat: add SchedulerV3 (#1996)
- Refactor code to allow supporting multiple versions of the
generate.proto at the same time
- Add v3/generate.proto (identical to generate.proto for now, but allows for
future changes without impacting v2 backends)
- Add Schedule trait to abstract queuing and batching mechanisms that
will be different in the future
- Add SchedulerV2/V3 impl
2024-06-04 15:56:56 +02:00