# Engine Identity Validation

## Summary
Longhorn-manager communicates with longhorn-engine's gRPC ControllerService, ReplicaService, and SyncAgentService by sending requests to TCP/IP addresses kept up-to-date by its various controllers. Additionally, the longhorn-engine controller server sends requests to the longhorn-engine replica server's ReplicaService and SyncAgentService using TCP/IP addresses it keeps in memory. These addresses are relatively stable in normal operation. However, during periods of high process turnover (e.g. a node reboot or network event), it is possible for one longhorn-engine component to stop and another longhorn-engine component to start in its place using the same ports. If this happens quickly enough, other components with stale address lists attempting to execute requests against the old component may errantly execute requests against the new component. One harmful effect of this behavior that has been observed is the expansion of an unintended longhorn-engine replica.
This proposal intends to ensure all gRPC requests to longhorn-engine components are actually served by the intended component.
### Related Issues
https://github.com/longhorn/longhorn/issues/5709
## Motivation

### Goals
- Eliminate the potential for negative effects caused by a Longhorn component communicating with an incorrect longhorn-engine component.
- Provide effective logging when incorrect communication occurs to aid in fixing TCP/IP address related race conditions.
### Non-goals
- Fix race conditions within the Longhorn control plane that lead to attempts to communicate with an incorrect longhorn-engine component.
- Refactor the in-memory data structures the longhorn-engine controller server uses to keep track of and initiate communication with replicas.
## Proposal
Today, longhorn-manager knows the volume name and instance name of the process it is trying to communicate with, but it only uses the TCP/IP information of each process to initiate communication. Additionally, longhorn-engine components are mostly unaware of the volume name (in the case of longhorn-engine's replica server) and instance name (for both longhorn-engine controller and replica servers) they are associated with. If we provide this information to longhorn-engine processes when we start them and then have longhorn-manager provide it on every communication attempt, we can ensure no accidental communication occurs.
- Add additional flags to the longhorn-engine CLI that inform controller and replica servers of their associated volume and/or instance name.
- Use gRPC client interceptors to automatically inject gRPC metadata (i.e. headers) containing volume and/or instance name information every time a gRPC request is made by a longhorn-engine client to a longhorn-engine server.
- Use gRPC server interceptors to automatically validate the volume and/or instance name information in gRPC metadata (i.e. headers) every time a gRPC request made by a longhorn-engine client is received by a longhorn-engine server.
- Reject any request (with an appropriate error code) if the provided information does not match the information a controller or replica server was launched with.
- Log the rejection at the client and the server, making it easy to identify situations in which incorrect communication occurs.
- Modify instance-manager's `ProxyEngineService` (both server and client) so that longhorn-manager can provide the necessary information for gRPC metadata injection.
- Modify longhorn-manager so that it makes proper use of the new `ProxyEngineService` client and launches longhorn-engine controller and replica servers with additional flags.
### User Stories

#### Story 1
Before this proposal:

As an administrator, after an intentional or unintentional node reboot, I notice one or more of my volumes is degraded and new or existing replicas aren't coming online. In some situations, the UI reports confusing information, or one or more of my volumes might be unable to attach at all. Digging through logs, I see errors related to mismatched sizes, and at least one replica does appear to have a larger size reported in `volume.meta` than others. I don't know how to proceed.
After this proposal:

As an administrator, after an intentional or unintentional node reboot, my volumes work as expected. If I choose to dig through logs, I may see some messages about refused requests to incorrect components, but this doesn't seem to negatively affect anything.
#### Story 2
Before this proposal:

As a developer, I am aware that it is possible for one Longhorn component to communicate with another, incorrect component, and that this communication can lead to unexpected replica expansion. I want to work to fix this behavior. However, when I look at a support bundle, it is very hard to catch this communication occurring. I have to trace TCP/IP addresses through logs, and if no negative effects are caused, I may never notice it.
After this proposal:

Any time one Longhorn component attempts to communicate with another, incorrect component, it is clearly represented in the logs.
### User Experience In Detail
See the user stories above. This enhancement is intended to be largely transparent to the user; it should eliminate a class of rare failures so that users never encounter them.
### API Changes

#### Longhorn-Engine
Increment the longhorn-engine `CLIAPIVersion` by one. Do not increment the longhorn-engine `CLIAPIMinVersion`. The changes in this LEP are backwards compatible. All gRPC metadata validation is by demand of the client. If a less sophisticated (not upgraded) client does not inject any metadata, the server performs no validation. If a less sophisticated (not upgraded) client only injects some metadata (e.g. `volume-name` but not `instance-name`), the server only validates the metadata provided.
Add a global `volume-name` flag and a global `engine-instance-name` flag to the engine CLI (e.g. `longhorn -volume-name <volume-name> -engine-instance-name <engine-instance-name> <command> <args>`). Virtually all CLI commands create a controller client, and these flags allow appropriate gRPC metadata to be injected into every client request. Requests that reach the wrong longhorn-engine controller server are rejected.
Use the global `engine-instance-name` flag and the pre-existing `volume-name` positional argument to allow the longhorn-engine controller server to remember its volume and instance name (e.g. `longhorn -engine-instance-name <instance-name> controller <volume-name>`). Ignore the global `volume-name` flag, as it is redundant.
Use the global `volume-name` flag or the pre-existing local `volume-name` flag and a new `replica-instance-name` flag to allow the longhorn-engine replica server to remember its volume and instance name (e.g. `longhorn -volume-name <volume-name> replica <directory> -replica-instance-name <replica-instance-name>`).
Use the global `volume-name` flag and a new `replica-instance-name` flag to allow the longhorn-engine sync-agent server to remember its volume and instance name (e.g. `longhorn -volume-name <volume-name> sync-agent -replica-instance-name <replica-instance-name>`).
Add an additional `replica-instance-name` flag to CLI commands that launch asynchronous tasks that communicate directly with the longhorn-engine replica server (e.g. `longhorn -volume-name <volume-name> add-replica <address> -size <size> -current-size <current-size> -replica-instance-name <replica-instance-name>`). All such commands create a replica client, and these flags allow appropriate gRPC metadata to be injected into every client request. Requests that reach the wrong longhorn-engine replica server are rejected.
Return `9 FAILED_PRECONDITION` with an appropriate message when metadata validation fails. This code is chosen in accordance with the gRPC API, which instructs developers to use `FAILED_PRECONDITION` if the client should not retry until the system state has been explicitly fixed.
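For illustration, a rejection might be constructed like this (a minimal sketch using the standard `status` and `codes` packages; the helper name and message text are illustrative, and the real check lives in the server interceptor shown in the Design section):

```go
import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// errIncorrectVolumeName is a hypothetical helper sketching the proposed
// rejection with code 9 FAILED_PRECONDITION.
func errIncorrectVolumeName(incoming, expected string) error {
	return status.Errorf(codes.FailedPrecondition,
		"incorrect volume name %s (expected %s); check controller address", incoming, expected)
}
```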
#### Longhorn-Instance-Manager
Increment the longhorn-instance-manager `InstanceManagerProxyAPIVersion` by one. Do not increment the longhorn-instance-manager `InstanceManagerProxyAPIMinVersion`. The changes in this LEP are backwards compatible. None of the added fields are required, and the server ignores their omission. If a less sophisticated (not upgraded) client does not include them, no metadata is injected into engine or replica requests and no validation occurs (the behavior is the same as before the implementation of this LEP).
Add `volume_name` and `instance_name` fields to the `ProxyEngineRequest` protocol buffer message. This message, which currently only contains an `address` field, is included in all `ProxyEngineService` RPCs. Updated clients can pass information about the engine process they expect to be communicating with in these fields. When instance-manager creates an asynchronous task to carry out the requested operation, the resulting controller client includes the gRPC interceptor described below.
Add `replica_instance_name` fields to any `ProxyEngineService` RPC associated with an asynchronous task that communicates directly with a longhorn-engine replica server. When instance-manager creates the task, the resulting replica client includes the gRPC interceptor described below.
Return `5 NOT_FOUND` with an appropriate message when metadata validation fails at a lower layer. (The particular return code is open to discussion.)
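A hedged sketch of one way instance-manager could translate the lower-layer failure (whether and how to translate is exactly the open question above; `mapValidationError` is a hypothetical helper):

```go
import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// mapValidationError is hypothetical: it converts an identity validation
// failure surfaced by longhorn-engine into 5 NOT_FOUND at the proxy layer.
func mapValidationError(err error, instanceName, address string) error {
	if status.Code(err) == codes.FailedPrecondition {
		return status.Errorf(codes.NotFound, "no engine process %s running at %s: %v", instanceName, address, err)
	}
	return err
}
```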
## Design

### Implementation Overview

#### Interceptors (longhorn-engine)
Add a gRPC server interceptor to all `grpc.NewServer` calls.

```go
server := grpc.NewServer(withIdentityValidationInterceptor(volumeName, instanceName))
```
Implement the interceptor so that it validates metadata with best effort.
```go
func withIdentityValidationInterceptor(volumeName, instanceName string) grpc.ServerOption {
	return grpc.UnaryInterceptor(identityValidationInterceptor(volumeName, instanceName))
}

func identityValidationInterceptor(volumeName, instanceName string) grpc.UnaryServerInterceptor {
	// Use a closure to remember the correct volumeName and/or instanceName.
	return func(ctx context.Context, req any, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {
		if md, ok := metadata.FromIncomingContext(ctx); ok {
			// Only refuse to serve if both client and server provide validation information.
			if incomingVolumeName, ok := md["volume-name"]; ok && volumeName != "" && incomingVolumeName[0] != volumeName {
				return nil, status.Errorf(codes.FailedPrecondition, "incorrect volume name; check controller address")
			}
			if incomingInstanceName, ok := md["instance-name"]; ok && instanceName != "" && incomingInstanceName[0] != instanceName {
				return nil, status.Errorf(codes.FailedPrecondition, "incorrect instance name; check controller address")
			}
		}

		// Call the RPC's actual handler.
		return handler(ctx, req)
	}
}
```
Add a gRPC client interceptor to all `grpc.Dial` calls.

```go
connection, err := grpc.Dial(serviceUrl, grpc.WithInsecure(), withIdentityValidationInterceptor(volumeName, instanceName))
```
Implement the interceptor so that it injects metadata with best effort.
```go
func withIdentityValidationInterceptor(volumeName, instanceName string) grpc.DialOption {
	return grpc.WithUnaryInterceptor(identityValidationInterceptor(volumeName, instanceName))
}

func identityValidationInterceptor(volumeName, instanceName string) grpc.UnaryClientInterceptor {
	// Use a closure to remember the correct volumeName and/or instanceName.
	return func(ctx context.Context, method string, req, reply any, cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
		if volumeName != "" {
			ctx = metadata.AppendToOutgoingContext(ctx, "volume-name", volumeName)
		}
		if instanceName != "" {
			ctx = metadata.AppendToOutgoingContext(ctx, "instance-name", instanceName)
		}
		return invoker(ctx, method, req, reply, cc, opts...)
	}
}
```
Modify all client constructors to include this additional information. Wherever these client packages are consumed (e.g. the replica client is consumed by the controller, both the replica and the controller clients are consumed by longhorn-manager), callers can inject this additional information into the constructor and get validation for free.
```go
func NewControllerClient(address, volumeName, instanceName string) (*ControllerClient, error) {
	// Implementation.
}
```
#### CLI Commands (longhorn-engine)
Add additional flags to all longhorn-engine CLI commands depending on their function.
E.g. a command that launches a server:
```go
func ReplicaCmd() cli.Command {
	return cli.Command{
		Name:      "replica",
		UsageText: "longhorn replica DIRECTORY SIZE",
		Flags: []cli.Flag{
			// Other flags.
			cli.StringFlag{
				Name:  "volume-name",
				Value: "",
				Usage: "Name of the volume (for validation purposes)",
			},
			cli.StringFlag{
				Name:  "replica-instance-name",
				Value: "",
				Usage: "Name of the replica instance (for validation purposes)",
			},
		},
		// Rest of implementation.
	}
}
```
E.g. a command that communicates directly with both a controller and a replica server:
```go
func AddReplicaCmd() cli.Command {
	return cli.Command{
		Name:      "add-replica",
		ShortName: "add",
		Flags: []cli.Flag{
			// Other flags.
			cli.StringFlag{
				Name:     "volume-name",
				Required: false,
				Usage:    "Name of the volume (for validation purposes)",
			},
			cli.StringFlag{
				Name:     "engine-instance-name",
				Required: false,
				Usage:    "Name of the controller instance (for validation purposes)",
			},
			cli.StringFlag{
				Name:     "replica-instance-name",
				Required: false,
				Usage:    "Name of the replica instance (for validation purposes)",
			},
		},
		// Rest of implementation.
	}
}
```
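A sketch of the corresponding action body (hypothetical: the real command delegates to the task/sync-agent machinery, which is elided here; only the flag plumbing is the point):

```go
// Hypothetical action body for add-replica. Global flags identify the
// controller endpoint; local flags carry identity validation information.
func addReplica(c *cli.Context) error {
	controllerClient, err := client.NewControllerClient(
		c.GlobalString("url"),
		c.String("volume-name"),
		c.String("engine-instance-name"),
	)
	if err != nil {
		return err
	}
	defer controllerClient.Close()

	// c.String("replica-instance-name") is forwarded in the same way to the
	// replica client created by the underlying task.
	return nil
}
```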
#### Instance-Manager Integration
Modify the `ProxyEngineService` server functions so that they can make correct use of the changes in longhorn-engine. Funnel information from the additional fields in the `ProxyEngineRequest` message and in appropriate `ProxyEngineService` RPCs into the longhorn-engine task and controller client constructors so it can be used for validation.
```protobuf
message ProxyEngineRequest {
    string address = 1;
    string volume_name = 2;
    string instance_name = 3;
}
```
Modify the `ProxyEngineService` client functions so that consumers can provide the information required to enable validation.
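For example, an updated client method might populate the expanded message like this (a sketch; the `ProxyClient` type, the `VolumeGet` RPC shape, and the response type name are assumptions for illustration):

```go
// volumeGet sketches an updated ProxyEngineService client method that
// forwards identity information alongside the address.
func (c *ProxyClient) volumeGet(ctx context.Context, serviceAddress, volumeName, engineInstanceName string) (*rpc.EngineVolumeGetProxyResponse, error) {
	return c.service.VolumeGet(ctx, &rpc.ProxyEngineRequest{
		Address:      serviceAddress,
		VolumeName:   volumeName,
		InstanceName: engineInstanceName,
	})
}
```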
#### Longhorn-Manager Integration
Ensure the engine and replica controllers launch engine and replica processes with `-volume-name` and `-instance-name` flags so that these processes can validate identifying gRPC metadata coming from requests.
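As a sketch (a hypothetical helper; the real argument construction lives in longhorn-manager's instance handling code), the controller launch command from the API Changes section might be assembled like this:

```go
// engineProcessArgs shows the launch command shape proposed above. Per the
// API Changes section, the global volume-name flag is redundant for the
// controller server and is omitted here.
func engineProcessArgs(volumeName, engineInstanceName string) []string {
	return []string{
		"-engine-instance-name", engineInstanceName,
		"controller", volumeName,
	}
}
```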
Ensure the engine controller supplies correct information to the `ProxyEngineService` client functions so that identity validation can occur in the lower layers.
#### Example Validation Flow
This issue/LEP was inspired by longhorn/longhorn#5709. In the situation described in this issue:

- An engine controller with out-of-date information (including a replica address the associated volume does not own) issues a ReplicaAdd command to instance-manager's `ProxyEngineService`.
- Instance-manager creates a longhorn-engine task and calls its `AddReplica` method.
- The task makes appropriate calls to a longhorn-engine controller and replica. The ReplicaService's `ExpandReplica` command is used to expand the replica before a follow-up failure to actually add the replica to the controller's backend.
After this improvement, the above scenario will be impossible:
- Both the engine and replica controllers will launch engine and replica processes with the `-volume-name` and `-instance-name` flags.
- When the engine controller issues a ReplicaAdd command, it will do so using the expanded embedded `ProxyEngineRequest` message (with `volume_name` and `instance_name` fields) and an additional `replica_instance_name` field.
- Instance-manager will create a longhorn-engine task that automatically injects `volume-name` and `instance-name` gRPC metadata into each controller request.
- When the task issues an `ExpandReplica` command, it will do so using a client that automatically injects `volume-name` and `instance-name` gRPC metadata into it.
- If either the controller or the replica does not agree with the information provided, gRPC requests will fail immediately and there will be no change in any longhorn-engine component.
### Test plan

#### TODO: Integration Test Plan
In my test environment, I have experimented with:
- Running new versions of all components, making gRPC calls to the longhorn-engine controller and replica processes with wrong gRPC metadata, and verifying that these calls fail.
- Running new versions of all components, making gRPC calls to instance-manager with an incorrect volume name or instance name, and verifying that these calls fail.
- Running new versions of all components, adding additional logging to longhorn-engine and verifying that metadata validation is occurring during the normal volume lifecycle.
This is really a better fit for a negative testing scenario (do something that would otherwise result in improper communication, then verify that communication fails), but we have already eliminated the only known way to recreate longhorn/longhorn#5709.
#### Engine Integration Test Plan
Rework test fixtures so that:
- All controller and replica processes are created with the information needed for identity validation.
- It is convenient to create controller and replica clients with the information needed for identity validation.
- gRPC metadata is automatically injected into controller and replica client requests when clients have the necessary information.
Do not modify the behavior of existing tests. Since these tests use clients without identity validation information, no identity validation is performed.
Create new tests that:
- Ensure validation fails when a directly created client attempts to communicate with a controller or replica server using the wrong identity validation information.
- Ensure validation fails when an indirectly created client (by the engine) tries to communicate with a replica server using the wrong identity validation information.
- Ensure validation fails when an indirectly created client (by a CLI command) tries to communicate with a controller or replica server using the wrong identity validation information.
### Upgrade strategy
The user will benefit from this behavior automatically, but only after they have upgraded all associated components to a supporting version (longhorn-manager, longhorn-engine, and CRITICALLY instance-manager).
We will only provide volume name and instance name information to longhorn-engine controller and replica processes on a supported version (as governed by the `CLIAPIVersion`). Even if other components are upgraded, when they send gRPC metadata to non-upgraded processes, it will be ignored.
We will only populate extra `ProxyEngineService` fields when longhorn-manager is running with an updated `ProxyEngineService` client.
- RPCs from an old client to a new `ProxyEngineService` server will succeed, but without the extra fields, instance-manager will have no useful gRPC metadata to inject into its longhorn-engine requests.
- RPCs from a new client to an old `ProxyEngineService` server will succeed, but instance-manager will ignore the new fields and not inject useful gRPC metadata into its longhorn-engine requests.
## Note

### Why gRPC metadata?
We initially looked at adding volume name and/or instance name fields to all longhorn-engine ReplicaService and ControllerService calls. However, this would be awkward with some of the existing RPCs. In addition, it doesn't make much intuitive sense. Why should we provide the name of an entity we are communicating with to that entity as part of its API? It makes more sense to think of this identity validation in terms of sessions or authorization/authentication. In HTTP, information of this nature is handled through the use of headers, and metadata is the gRPC equivalent.
### Why gRPC interceptors?
We want to ensure the same behavior in every longhorn-engine ControllerService and ReplicaService call so that it is not up to an individual developer writing a new RPC to remember to validate gRPC metadata (and to relearn how it should be done). Interceptors work mostly transparently to ensure identity validation always occurs.