Set maximum gRPC message receive size to 2GiB

The previous default was 4MiB, which is too small for multi-modal models.
2025-09-11 20:34:54 +00:00 · 2024-06-17 12:26:31 +02:00 · 2024-06-17 12:26:31 +02:00 · 991a1cbb3b
commit 991a1cbb3b
parent 0f7d38e774
1 changed file with 5 additions and 1 deletion
--- a/server/text_generation_server/server.py
+++ b/server/text_generation_server/server.py
@@ -240,7 +240,11 @@ def serve(
            interceptors=[
                ExceptionInterceptor(),
                UDSOpenTelemetryAioServerInterceptor(),
-            ]
+            ],
+            options=[
+                # Set the maximum possible message length: i32::MAX
+                ("grpc.max_receive_message_length", (1 << 31) - 1)
+            ],
        )
        generate_pb2_grpc.add_TextGenerationServiceServicer_to_server(
            TextGenerationService(model, Cache(), quantize, server_urls), server
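
The arithmetic behind the new limit can be checked with a small sketch. It uses only the two figures from this commit (the 4MiB previous default and `(1 << 31) - 1`). The `fits` helper and the 10MiB example payload are hypothetical and exist only for illustration. The inclusive comparison is a simplifying assumption, not a statement of how gRPC enforces the limit.

```python
# Sizes in bytes.
MIB = 1 << 20
GIB = 1 << 30

# Previous default receive limit, as stated in the commit message.
OLD_DEFAULT_LIMIT = 4 * MIB

# New limit from the diff: the largest signed 32-bit integer (i32::MAX).
NEW_LIMIT = (1 << 31) - 1


def fits(payload_bytes: int, limit: int) -> bool:
    """Hypothetical helper: True if a message of this size is within the limit.

    Assumes an inclusive bound; this is a simplification for illustration.
    """
    return payload_bytes <= limit


# Hypothetical request size, chosen only to sit between the two limits.
example_payload = 10 * MIB

print(NEW_LIMIT)            # 2147483647
print(2 * GIB - NEW_LIMIT)  # 1
print(fits(example_payload, OLD_DEFAULT_LIMIT))  # False
print(fits(example_payload, NEW_LIMIT))          # True
```

The new limit is therefore one byte short of a full 2GiB, which is as high as a signed 32-bit value can go.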