Service Health

This page provides status information on the services that are part of Google Cloud. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://cloud.google.com/.

Incident affecting Cloud CDN, Cloud Load Balancing, Google Cloud Networking, Hybrid Connectivity, Virtual Private Cloud (VPC)

[Cloud CDN, Cloud Load Balancing, Hybrid Connectivity] elevated latency in the UK (europe-west2)

Incident began at 2024-08-12 06:20 and ended at 2024-08-12 08:32 (all times are US/Pacific).

Previously affected location(s)

London (europe-west2)

Date Time Description
15 Aug 2024 14:04 PDT

Incident Report

Summary

On 12 August 2024 at 06:20 US/Pacific, multiple Google Cloud and Google Workspace products experienced connectivity issues in europe-west2 for a duration of 40 minutes. During this time, ingress traffic to europe-west2 and egress traffic from europe-west2 experienced elevated latencies, connection timeouts, and connection failures.

Root Cause

On 12 August 2024 06:20 US/Pacific, primary and backup power feeds were both lost in a Google Point of Presence (POP) due to a substation switchgear failure. The affected POP hosts about ⅓ of serving first-layer Google Front Ends (GFEs) located in europe-west2 and some distributed networking equipment for that region. The power loss impacted the following Google products and services that depend on GFEs in that region:

  • Google Cloud APIs, Google Workspace, and other Google services such as YouTube.
  • Customer-created global external application and proxy network load balancers, including Cloud CDN.

The power loss also impacted the following Google Cloud products which depended on impacted networking equipment:

  • Customer-created regional external application, proxy network, and passthrough network load balancers in the europe-west2 region.
  • External protocol forwarding and VM external IP address connectivity for VMs in the europe-west2 region.
  • Google Cloud Interconnect connections in some LHR colocation facilities.

Impact was limited to situations where either or both of the following were true:

  • Inbound requests or connections were routed into the europe-west2 region of Google’s network, from the Internet, and those requests or connections depended on networking equipment that was offline, or unreachable pending reconvergence.
  • Outbound responses were routed to the Internet, from the europe-west2 region of Google’s network, and those responses depended on networking equipment that was without power.

The power outage caused Internet routes advertised by Google to be withdrawn in networks connected to Google’s network. The withdrawn routes were automatically replaced by other Google-advertised routes that didn’t depend on impacted networking equipment. Withdrawing and replacing routes relies on BGP and its timers, so replacement routes do not converge instantaneously. Overloading of the GFEs behind the automatically selected replacement routes further extended the duration of the incident.
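The withdraw-and-replace behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Google's routing logic: the `Route` type, the `select_routes` helper, the equipment names `pop-a` and `pop-b`, the preference values, and the prefixes (taken from the reserved documentation ranges) are all hypothetical, and the model ignores BGP timers, so it shows only the end state after convergence rather than the delay in reaching it.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Route:
    prefix: str      # destination prefix (documentation range, illustrative only)
    via: str         # hypothetical name of the equipment this route depends on
    preference: int  # lower value means a more preferred path


def select_routes(routes: Iterable[Route], offline: Set[str]) -> Dict[str, Optional[Route]]:
    """Return the best surviving route per prefix; None means no route is left."""
    best: Dict[str, Optional[Route]] = {}
    for route in routes:
        best.setdefault(route.prefix, None)
        if route.via in offline:
            continue  # route is treated as withdrawn because its equipment is offline
        current = best[route.prefix]
        if current is None or route.preference < current.preference:
            best[route.prefix] = route
    return best


routes = [
    Route("203.0.113.0/24", "pop-a", 10),
    Route("203.0.113.0/24", "pop-b", 20),
    Route("198.51.100.0/24", "pop-a", 10),
]

before = select_routes(routes, offline=set())
after = select_routes(routes, offline={"pop-a"})

print(before["203.0.113.0/24"].via)  # pop-a
print(after["203.0.113.0/24"].via)   # pop-b
print(after["198.51.100.0/24"])      # None
```

In this sketch, a prefix with an alternative path fails over to the less preferred route, while a prefix that depended solely on the offline equipment stays unreachable until that equipment returns.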

Detailed Description of Impact

  • Google Workspace: Gmail, Google Calendar, Google Chat, Google Docs, Google Drive, Google Meet and Google Tasks users connecting to Workspace services from the UK region and surrounding areas experienced connectivity issues as described in the next point.
  • GFE-based products and services: Customers on the Internet experienced a spike of broken connections followed by elevated latencies or HTTP error responses when communicating with GFE-powered Google APIs and services or customer-created global external application and proxy network load balancers. At roughly 06:23 US/Pacific, Google automatically redirected connections to the nearest possible first-layer GFEs with some latency penalty. Unfortunately, some of the nearest possible first-layer GFEs were overloaded until 06:48, when Google engineers made adjustments to more efficiently distribute incoming requests among nearby first-layer GFEs. Depending on the Google API or service or the customer-created global external load balancer, elevated latencies could have persisted until about 08:30 US/Pacific. Elevated latencies also could have applied to customer-created global external load balancers that had Cloud CDN enabled.
  • Regional Google Cloud products and services: Until replacement routes were in effect, customers on the Internet experienced connection failures to the following GCP resources in the europe-west2 region:
    • Regional external application, proxy network, and passthrough network load balancers.
    • External protocol forwarding and VM external IP addresses.
  • Google Cloud Interconnect: Google Cloud Interconnect connections in some LHR colocation facilities (lhr-zone1-47, lhr-zone1-832, lhr-zone1-2262, lhr-zone1-4885, lhr-zone1-99051 and lhr-zone2-47) remained offline from 06:20 US/Pacific to at least 06:57 US/Pacific, when power was restored.

At 06:43 US/Pacific, power was restored to the impacted networking equipment. Google networking equipment was fully operational by 06:57 US/Pacific, and connectivity to GFE-based products and services, regional Google Cloud products and services, and Google Cloud Interconnect resumed shortly thereafter.

Remediation and Prevention

Multiple Google engineering teams were alerted and automated recovery tooling was triggered as expected; however, manual adjustments were required to address subsequent first-layer GFE overload. Google is reviewing automation improvements in tasks that required manual intervention to reduce the duration of future power event impact. Similarly, Google is working to increase Cloud Interconnect control plane resilience and reduce mitigation time through automated reaction to isolation events.

Additionally, Google's partner who maintains the affected facility power in LHR (London) is conducting a full root cause analysis with the switchboard manufacturer and substation owner(s) involved in supplying power, including follow-up as to why stored or generated on-site emergency power did not carry loads.

12 Aug 2024 12:19 PDT

Mini Incident Report

We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support or to Google Workspace Support using help article https://support.google.com/a/answer/1047213.

(All Times US/Pacific)

GCP Impact Time:

Incident Start: 12 August 2024 06:20

Incident End: 12 August 2024 08:32

Duration: 2 hours, 12 minutes

Workspace Impact Time:

Incident Start: 12 August 2024 06:20

Incident End: 12 August 2024 07:00

Duration: 40 minutes
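The two durations above follow directly from the listed start and end times. As a quick check, a minimal sketch (the `duration_minutes` helper and `FMT` constant are illustrative names, and the `%B` month parsing assumes an English locale; all times are treated as being in the same US/Pacific zone, so no time zone conversion is needed):

```python
from datetime import datetime

# Format of the timestamps as written in this report, e.g. "12 August 2024 06:20".
FMT = "%d %B %Y %H:%M"


def duration_minutes(start: str, end: str) -> int:
    """Return the whole number of minutes between two same-zone timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds() // 60)


gcp = duration_minutes("12 August 2024 06:20", "12 August 2024 08:32")
workspace = duration_minutes("12 August 2024 06:20", "12 August 2024 07:00")

print(divmod(gcp, 60))  # (2, 12): 2 hours, 12 minutes
print(workspace)        # 40
```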

Affected Services and Features:

GCP: Cloud CDN, Cloud Load Balancing, Hybrid Connectivity, Virtual Private Cloud (VPC)

Workspace: Gmail, Google Calendar, Google Chat, Google Docs, Google Drive, Google Meet and Google Tasks

Regions/Zones: europe-west2

Description:

GCP:

Cloud CDN, Cloud Load Balancing, Hybrid Connectivity and Virtual Private Cloud (VPC) customers experienced intermittent timeouts (500s) from 06:20 to 07:00 US/Pacific followed by elevated latency in europe-west2 until 08:32 US/Pacific for a total duration of 2 hours and 12 minutes.

Some customers using Cloud Interconnect zones (lhr-zone1-2262, lhr-zone1-832, lhr-zone1-47, and lhr-zone2-47) and customers using some Partner Interconnects in London may have experienced connectivity loss to GCP services.

Workspace:

Gmail, Google Calendar, Google Chat, Google Docs, Google Drive, Google Meet and Google Tasks users connecting to Workspace services from the UK region may have experienced connectivity issues for a duration of 40 minutes.

From preliminary analysis, the root cause of the issue was a loss of power to networking equipment in the London data center.

Google will complete a full Incident Report in the following days that will provide a full root cause.

Customer Impact:

GCP:

  • Customers experienced intermittent timeouts (500s) followed by elevated latency.
  • Some Cloud Interconnect and Partner Interconnect users experienced connectivity loss to GCP services.

Workspace:

  • Users connecting to Workspace services served from the UK region may have experienced connectivity issues.

12 Aug 2024 08:34 PDT

The issue with Cloud CDN, Cloud Load Balancing, Hybrid Connectivity, Virtual Private Cloud (VPC) has been resolved for all affected users as of Monday, 2024-08-12 08:19 US/Pacific.

During the issue, users connecting to GCP services from the UK region may have experienced elevated latency and intermittent 500 errors.

We will publish an analysis of this incident once we have completed our internal investigation.

We thank you for your patience while we worked on resolving the issue.

12 Aug 2024 07:19 PDT

Summary: [Cloud CDN, Cloud Load Balancing, Hybrid Connectivity] elevated latency in the UK (europe-west2)

Description: We are experiencing an issue with Cloud CDN, Cloud Load Balancing, Hybrid Connectivity.

The issue is mitigated. Our engineering team continues to investigate and is monitoring for residual impact.

We will provide an update by Monday, 2024-08-12 08:30 US/Pacific with current details.

Diagnosis: Customers who are connecting to europe-west2 (specifically in the UK) will see elevated latency.

Workaround: None at this time.