Default Branch

c6071749db · Fix mask passed to flashinfer (#3324) · Updated 2025-09-08 17:47:03 +00:00

Branches

4bac76241d · Update server.rs · Updated 2023-08-21 08:10:57 +00:00    Leaf

1046 behind · 5 ahead

cf43528538 · remove stream since its a separate PR · Updated 2023-08-18 10:57:36 +00:00    Leaf

1048 behind · 6 ahead

89a4e723d2 · Attempting to fix torch leak. · Updated 2023-08-12 07:06:49 +00:00    Leaf

1061 behind · 1 ahead

4a9615e8ff · Add to ToC · Updated 2023-08-11 13:05:10 +00:00    Leaf

1064 behind · 2 ahead

43ed6c217a · Dummy commit · Updated 2023-08-10 08:33:52 +00:00    Leaf

1069 behind · 1 ahead

4ddb6681ac · Add workflow to upload documentation · Updated 2023-08-08 05:49:45 +00:00    Leaf

1072 behind · 1 ahead

e994ad1172 · Added InferenceClient · Updated 2023-08-02 14:57:01 +00:00    Leaf

1086 behind · 11 ahead

7766fee9b1 · fix typo for dynamic rotary (#745) · Updated 2023-07-31 16:58:46 +00:00    Leaf

1080 behind · 0 ahead · Included

f555dabca8 · Putting back header inclusion (seems unused but still) · Updated 2023-07-20 15:46:51 +00:00    Leaf

1107 behind · 21 ahead

bfa3920aec · BNB 4bits. · Updated 2023-07-12 12:42:43 +00:00    Leaf

1133 behind · 7 ahead

db4efbf4bc · fix(server): T5 weights names. (#582) · Updated 2023-07-12 08:01:42 +00:00    Leaf

1129 behind · 0 ahead · Included

a4fd6905d8 · fmt · Updated 2023-06-23 13:01:05 +00:00    Leaf

1156 behind · 2 ahead

dca0fe2585 · Adding GPTQ integration tests. · Updated 2023-06-19 12:14:17 +00:00    Leaf

1159 behind · 19 ahead

17837b1e51 · Adding docs about GPTQ usage. · Updated 2023-06-15 17:41:04 +00:00    Leaf

1159 behind · 19 ahead

fb0840944c · Reducing number of reps while autotuning. · Updated 2023-06-06 11:56:10 +00:00    Leaf

1210 behind · 9 ahead

7ccb8eefdc · TMP. · Updated 2023-05-15 14:43:32 +00:00    Leaf

1204 behind · 4 ahead

a963495315 · add logic to queue · Updated 2023-04-26 11:40:20 +00:00    Leaf

1243 behind · 2 ahead

7caea42573 · feat(launcher): parse all shard logs · Updated 2023-04-15 19:25:02 +00:00    Leaf

1273 behind · 2 ahead

47ac334a21 · 0.4.0 · Updated 2023-03-12 09:06:15 +00:00    Leaf

1366 behind · 9 ahead

60ed7b535c · first tests · Updated 2023-02-23 08:52:17 +00:00    Leaf

1350 behind · 1 ahead