drbh
68a9685f1b
fix: adjust default tool choice ( #2244 )
* fix: adjust default tool choice
* feat: improve tool choice syntax and response parsing/errors
* fix: remove dev tests
* feat: add ToolChoice to docs
2024-07-19 11:12:02 -04:00
Wang, Yi
58effe78b5
update to metrics 0.23.0 or could work with metrics-exporter-promethe… ( #2190 )
Update to metrics 0.23.0 for compatibility with metrics-exporter-prometheus 0.15.1
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2024-07-08 16:03:59 +02:00
drbh
9eefb2f672
fix: prefer serde structs over custom functions ( #2127 )
* fix: prefer enum for chat object
* fix: adjust typo
* fix: enum CompletionType not ObjectType
* fix: adjust typo
* feat: leverage serde for conditional deser
* fix: adjust HubTokenizerConfig after rebase
* fix: update create_post_processor logic for token type
* fix: adjust unwrap syntax in template
* Fixing the post processor.
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2024-07-01 15:08:05 +02:00
drbh
42aa8ee1bb
PR #2049 CI run ( #2054 )
* Use minijinja's pycompat mode for python methods
* fix: cargo fmt lint for pre commit
---------
Co-authored-by: Armin Ronacher <armin.ronacher@active-4.com>
2024-06-13 11:53:49 -04:00
OlivierDehaene
757223b352
feat: add SchedulerV3 ( #1996 )
- Refactor code to allow supporting multiple versions of the
generate.proto at the same time
- Add v3/generate.proto (identical to generate.proto for now, but allows
future changes without impacting v2 backends)
- Add Schedule trait to abstract queuing and batching mechanisms that
will be different in the future
- Add SchedulerV2/V3 impl
2024-06-04 15:56:56 +02:00