Daniël de Kok
622c9c367a
nix: build Torch against MKL and various other improvements ( #2469 )
Updates tgi-nix input:
- Move Torch closer to upstream by building against MKL.
- Remove compute capability 8.7 from Torch (Jetson).
- Sync nixpkgs compute capabilities with Torch (avoids
compiling too many capabilities for MAGMA).
- Use nixpkgs configuration passed through by `tgi-nix`.
2024-09-25 06:11:21 +00:00
Daniël de Kok
b7d1adc3e9
nix: add awq-inference-engine as server dependency ( #2442 )
2024-09-25 06:10:59 +00:00
Nicolas Patry
6654c2d11b
Adding eetq to flake. ( #2438 )
2024-09-25 06:10:59 +00:00
Daniël de Kok
516392d790
nix: add pure server to flake, add both pure and impure devshells ( #2430 )
* nix: pure server and support both pure and impure devShells
* nix: remove unused poetry2nix input
It is not wired up and we now have a pure server.
* nix: add ipdb to impure devshell
2024-09-25 06:10:59 +00:00
Nicolas Patry
635dde8af9
Prefix caching ( #2402 )
* Prefix caching WIP
* Fixing prefix attention.
* Fixing flashinfer import.
* Fixing black.
* Fixing medusa (still wrong outputs, but functional).
* Just medusa values now.
* Fixing medusa without prefix caching.
* Fixing prefix caching.
* Medusa requires reshaping.
* Removing the logs.
* Remove router.nix
* Fixup:
- Remove logs
- Disable VLMs (they do not work)
- Disable prefix caching when user wants prefill logprobs.
* Update flake.lock
---------
Co-authored-by: Daniël de Kok <me@danieldk.eu>
2024-09-25 06:10:59 +00:00
Daniël de Kok
ddba272a66
nix: update to CUDA 12.4 ( #2429 )
* Update to CUDA 12.4
* poetry2nix: follow tgi-nix nixpkgs
2024-09-25 06:10:59 +00:00
Daniël de Kok
20ed7b598e
nix: try to reduce the number of Rust rebuilds ( #2424 )
Try to reduce the number of router/launcher rebuilds by filtering
sources. In this way, recompiles should only be triggered by changes
in Cargo or Rust files.
2024-09-25 06:08:38 +00:00
Daniël de Kok
e5c39a5545
nix: build router incrementally ( #2422 )
2024-09-25 06:08:00 +00:00
Daniël de Kok
bae161ab84
nix: partial incremental build of the router ( #2416 )
This is less incremental than crate2nix, but it builds all dependencies
separately and so avoids full rebuilds.
2024-09-25 06:06:17 +00:00
Nicolas Patry
c5e4c1877b
Adding more kernels to flake. ( #2411 )
2024-09-25 06:06:17 +00:00
Daniël de Kok
eb561bb715
nix: incremental build of the launcher ( #2410 )
2024-09-25 06:06:17 +00:00
Nicolas Patry
18d6be6af4
Updating the flake. ( #2404 )
2024-09-25 06:06:17 +00:00
Daniël de Kok
bb833389e0
Update flake for 9.0a capability in Torch ( #2394 )
2024-09-25 06:04:51 +00:00
Daniël de Kok
df719fd527
flake: use rust-overlay ( #2390 )
2024-09-25 06:04:51 +00:00
Daniël de Kok
dc0fa60f55
Add experimental flake ( #2384 )
Add flake.nix
2024-09-25 06:01:59 +00:00