An open source tool to autoscale Spanner instances
The Autoscaler tool for Cloud Spanner is a companion tool to Cloud Spanner that allows you to automatically increase or reduce the number of nodes or processing units in one or more Spanner instances, based on their utilization.
When you create a Cloud Spanner instance, you choose the number of nodes or processing units that provide compute resources for the instance. As the instance's workload changes, Cloud Spanner does not automatically adjust the number of nodes or processing units in the instance.
The Autoscaler monitors your instances and automatically adds or removes compute capacity so that they stay within the recommended maximums for CPU utilization and the recommended limit for storage per node, plus or minus an allowed margin. Note that the recommended thresholds differ depending on whether a Spanner instance is regional or multi-region.
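The "threshold plus or minus an allowed margin" idea can be pictured with a short sketch. It is illustrative only and is not the Autoscaler's own code: the function names are hypothetical, and the threshold values are assumptions that should be checked against the current Spanner documentation.

```python
from dataclasses import dataclass

# Assumed, illustrative CPU thresholds in percent, keyed by instance type.
# Verify against the current Spanner documentation before relying on them.
ASSUMED_CPU_THRESHOLDS = {"regional": 65.0, "multi-region": 45.0}


@dataclass(frozen=True)
class Band:
    """Range of utilization values that does not call for scaling."""
    lower: float
    upper: float


def allowed_band(threshold: float, margin: float) -> Band:
    """Return the band threshold - margin .. threshold + margin."""
    return Band(lower=threshold - margin, upper=threshold + margin)


def classify(utilization: float, threshold: float, margin: float) -> str:
    """Classify a utilization reading relative to the allowed band."""
    band = allowed_band(threshold, margin)
    if utilization > band.upper:
        return "ABOVE"   # capacity is likely too small
    if utilization < band.lower:
        return "BELOW"   # capacity is likely larger than needed
    return "WITHIN"      # no change suggested
```

With a margin of 5, a regional instance at 62% utilization falls inside the band, while a multi-region instance at the same utilization is above it.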
The diagram above shows the high level components of the Autoscaler and the interaction flow:
- The Autoscaler consists of two main decoupled components: the Poller and the Scaler. These can be deployed to either Cloud Run functions or Google Kubernetes Engine (GKE), and configured so that the Autoscaler runs according to a user-defined schedule. In certain deployment topologies a third component, the Forwarder, is also deployed.
- At the specified time and frequency, the Poller component queries the Cloud Monitoring API to retrieve the utilization metrics for each Spanner instance.
- For each instance, the Poller component pushes one message to the Scaler component. The payload contains the utilization metrics for the specific Spanner instance, and some of its corresponding configuration parameters.
- Using the chosen scaling method, the Scaler compares the Spanner instance metrics against the recommended thresholds (plus or minus an allowed margin) and determines whether the instance should be scaled and, if so, the number of nodes or processing units it should be scaled to. If the configured cooldown period has passed, the Scaler component requests the Spanner instance to scale out or in.
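The Scaler step can be sketched as a sizing rule followed by a cooldown check. This is a minimal sketch under stated assumptions, not the Autoscaler's implementation: the proportional sizing rule is only one possible scaling method, and every function and parameter name here is hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone


def suggest_size(current_size: int, utilization: float, threshold: float,
                 margin: float, min_size: int, max_size: int) -> int:
    """Suggest a size with a simple proportional rule (an assumed method)."""
    # Inside the allowed margin: keep the current size.
    if abs(utilization - threshold) <= margin:
        return current_size
    # Size the instance so utilization would land near the threshold.
    proposed = math.ceil(current_size * utilization / threshold)
    # Never leave the configured bounds.
    return max(min_size, min(max_size, proposed))


def cooldown_elapsed(last_scaled_at: datetime, now: datetime,
                     cooldown: timedelta) -> bool:
    """True when enough time has passed since the last scaling operation."""
    return now - last_scaled_at >= cooldown


def decide(current_size: int, utilization: float, threshold: float,
           margin: float, min_size: int, max_size: int,
           last_scaled_at: datetime, now: datetime,
           cooldown: timedelta) -> tuple[int, bool]:
    """Return (suggested size, whether a scaling request should be sent)."""
    target = suggest_size(current_size, utilization, threshold, margin,
                          min_size, max_size)
    should_scale = (target != current_size
                    and cooldown_elapsed(last_scaled_at, now, cooldown))
    return target, should_scale
```

For example, an instance of size 4 at 90% utilization against a 65% threshold gets a suggested size of 6, but the request is only sent once the cooldown has elapsed.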
Throughout the flow, the Autoscaler writes a step by step summary of its recommendations and actions to Cloud Logging for tracking and auditing.
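A step-by-step summary of this kind is often emitted as structured log lines. The sketch below is an assumption about how such entries could look, not the Autoscaler's actual log format: it relies on JSON lines written to standard output being ingested by Cloud Logging (as on Cloud Run and GKE), and the extra field names are hypothetical.

```python
import json
import sys


def log_entry(severity: str, message: str, **fields: object) -> str:
    """Build one JSON log line; 'severity' and 'message' are the fields
    that Cloud Logging treats specially in structured output."""
    entry = {"severity": severity, "message": message}
    entry.update(fields)  # hypothetical extra context, e.g. instance IDs
    return json.dumps(entry)


def log(severity: str, message: str, **fields: object) -> None:
    """Write the entry to stdout, one JSON object per line."""
    print(log_entry(severity, message, **fields), file=sys.stdout)


if __name__ == "__main__":
    # Hypothetical recommendation and action entries for one instance.
    log("INFO", "Suggested size: 6 nodes", projectId="my-project",
        instanceId="my-instance", currentSize=4, suggestedSize=6)
    log("INFO", "Scaling request sent", projectId="my-project",
        instanceId="my-instance", suggestedSize=6)
```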
To deploy the Autoscaler, decide which of the two deployment strategies, Cloud Run functions or Google Kubernetes Engine (GKE), best fits your technical and operational needs.
In both cases, the Google Cloud Platform resources are deployed using Terraform. Please see the Terraform instructions for more information on the deployment options available.
The Autoscaler publishes the following metrics to Cloud Monitoring, which can be used to monitor its behavior and to configure alerts.
- Message processing counters (Poller):
  - cloudspannerecosystem/autoscaler/poller/requests-success - the number of polling request messages received and processed successfully.
  - cloudspannerecosystem/autoscaler/poller/requests-failed - the number of polling request messages which failed processing.
- Spanner instance polling counters:
  - cloudspannerecosystem/autoscaler/poller/polling-success - the number of successful polls of the Spanner instance metrics.
  - cloudspannerecosystem/autoscaler/poller/polling-failed - the number of failed polls of the Spanner instance metrics.
  - Both of these metrics have projectid and instanceid attributes to identify the Spanner instance.
- Message processing counters (Scaler):
  - cloudspannerecosystem/autoscaler/scaler/requests-success - the number of scaling request messages received and processed successfully.
  - cloudspannerecosystem/autoscaler/scaler/requests-failed - the number of scaling request messages which failed processing.
- Spanner instance scaling counters:
  - cloudspannerecosystem/autoscaler/scaler/scaling-success - the number of successful rescales of the Spanner instance.
  - cloudspannerecosystem/autoscaler/scaler/scaling-denied - the number of Spanner instance rescale attempts that were denied by autoscaler configuration or policy.
  - cloudspannerecosystem/autoscaler/scaler/scaling-failed - the number of Spanner instance rescale attempts that failed.
  - These three metrics have the following attributes:
    - spanner_project_id - the Project ID of the affected Spanner instance
    - spanner_instance_id - the Instance ID of the affected Spanner instance
    - scaling_method - the scaling method used
    - scaling_direction - which can be SCALE_UP, SCALE_DOWN or SCALE_SAME (when the calculated rescale size is equal to the current size)
  - In addition, the scaling-denied counter has a scaling_denied_reason attribute containing the reason why the scaling was not performed, which can be:
    - SAME_SIZE - when the calculated rescale size is equal to the current instance size.
    - MAX_SIZE - when the instance has already been scaled up to the maximum configured size.
    - WITHIN_COOLDOWN - when the instance has been recently rescaled, and the autoscaler is waiting for the cooldown period to end.
    - IN_PROGRESS - when an instance scaling operation is still ongoing.
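To make the scaling counters and their attributes more concrete, the following sketch derives a scaling_direction and a scaling_denied_reason and records them on in-memory counters. It is a simplified illustration, not the Autoscaler's code: the order in which the reasons are checked, the helper names, and the in-memory Counter standing in for Cloud Monitoring are all assumptions.

```python
from collections import Counter
from typing import Optional

# In-memory stand-in for Cloud Monitoring counters (illustration only).
counters: Counter = Counter()


def increment(name: str, **attributes: str) -> None:
    """Increment the counter identified by its name and attribute set."""
    counters[(name, tuple(sorted(attributes.items())))] += 1


def scaling_direction(current_size: int, target_size: int) -> str:
    """Map a size change to SCALE_UP, SCALE_DOWN or SCALE_SAME."""
    if target_size > current_size:
        return "SCALE_UP"
    if target_size < current_size:
        return "SCALE_DOWN"
    return "SCALE_SAME"


def denied_reason(current_size: int, desired_size: int, max_size: int,
                  within_cooldown: bool,
                  operation_in_progress: bool) -> Optional[str]:
    """Return a scaling_denied_reason, or None if scaling may proceed.
    The check order is an assumption made for this sketch."""
    if operation_in_progress:
        return "IN_PROGRESS"
    if desired_size > max_size and current_size >= max_size:
        return "MAX_SIZE"
    if min(desired_size, max_size) == current_size:
        return "SAME_SIZE"
    if within_cooldown:
        return "WITHIN_COOLDOWN"
    return None


def record(current_size: int, desired_size: int, max_size: int,
           within_cooldown: bool, operation_in_progress: bool,
           method: str) -> None:
    """Record either a scaling-denied or a scaling-success count."""
    target = min(desired_size, max_size)
    attrs = {"scaling_method": method,
             "scaling_direction": scaling_direction(current_size, target)}
    reason = denied_reason(current_size, desired_size, max_size,
                           within_cooldown, operation_in_progress)
    if reason is not None:
        increment("scaler/scaling-denied", scaling_denied_reason=reason, **attrs)
    else:
        increment("scaler/scaling-success", **attrs)
```

Keying each counter by its attribute set mirrors how a monitoring backend keeps one time series per distinct attribute combination, which is what makes filtering by scaling_denied_reason possible when configuring alerts.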
The parameters for configuring the Autoscaler are identical regardless of the chosen deployment type, but the mechanism for supplying them differs slightly between Cloud Run functions and GKE deployments.
There is also a browser-based configuration file editor and a command line configuration file validator.
Copyright 2020 Google LLC
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
The Autoscaler is a Cloud Spanner Ecosystem project based on open source contributions. We'd love for you to report issues, file feature requests, and send pull requests (see Contributing). You may file bugs and feature requests using GitHub's issue tracker or using the existing Cloud Spanner support channels.