
Automatic data risk management for BigQuery using DLP

April 14, 2022
Scott Ellis

Senior Product Manager

Protecting sensitive data and preventing unintended data exposure is critical for businesses. However, many organizations lack the tools to stay on top of where sensitive data resides across their enterprise. It’s particularly concerning when sensitive data shows up in unexpected places – for example, in logs that services generate, when customers inadvertently send it in a customer support chat, or when managing unstructured analytical workloads. This is where Automatic Data Loss Prevention (DLP) for BigQuery can help.

Data discovery and classification is often implemented as a manual, on-demand process, and as a result it happens less frequently than many organizations would like. With a large amount of data being created on the fly, a more modern, proactive approach is to build discovery and classification into existing data analytics tools. By making it automatic, you can ensure that a key way to surface risk happens continuously - an example of Google Cloud's invisible security strategy. Automatic DLP is a fully managed service that continuously scans data across your entire organization to give you general awareness of what data you have, and specific visibility into where sensitive data is stored and processed. This awareness is a critical first step in protecting and governing your data, and it acts as a key control to help improve your security, privacy, and compliance posture.

In October of last year, we announced the public preview of Automatic DLP for BigQuery. Since the announcement, our customers have already scanned and processed both structured and unstructured BigQuery data at multi-petabyte scale to identify where sensitive data resides and gain visibility into their data risk. That’s why we are happy to announce that Automatic DLP is now Generally Available. As part of this release, we’ve also added several new features to make it even easier to understand your data and to make use of the insights in more Google Cloud workflows. These features include:

  • Premade Data Studio dashboards to give you more advanced summary, reporting, and investigation tools that you can customize to your business needs.

Easy-to-understand dashboards give a quick overview of data in BigQuery

  • Finer-grained controls to adjust the frequency and conditions under which data is profiled or reprofiled, including the ability to have certain subsets of your data scanned more frequently, scanned less frequently, or skipped from profiling altogether.

Granular settings for how often data is scanned

  • Automatic sync of DLP profiler insights and risk scores for each table into Chronicle, our security analytics platform. We aim to build synergy across our security portfolio, and this integration gives analysts using Chronicle immediate insight into whether the BigQuery data involved in a potential incident is of high value. This can significantly enhance threat detection, prioritization, and security investigations. For example, if Chronicle detects several attacks, knowing that one is targeting highly sensitive data will help you prioritize, investigate, and remediate the most urgent threats first.

Deep native integration with Chronicle helps speed up detection and response

Managing data risk with data classification

Examples of sensitive data elements that typically need special attention are credit cards, medical information, Social Security numbers, government-issued IDs, addresses, full names, and account credentials. Automatic DLP leverages machine learning and provides more than 150 predefined detectors to help discover, classify, and govern this sensitive data, allowing you to make sure the right protections are in place.
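To give a feel for these detectors, the sketch below (illustrative only, not part of the Automatic DLP announcement itself) calls a few of the same predefined infoTypes ad hoc through the Cloud DLP API using the google-cloud-dlp Python client; the project ID and sample text are placeholders.

from google.cloud import dlp_v2

# Minimal sketch: inspect a snippet of text with a few of the predefined detectors.
dlp = dlp_v2.DlpServiceClient()
parent = "projects/your-project-id/locations/global"  # placeholder project ID

item = {"value": "Jane Doe, card 4111 1111 1111 1111, SSN 123-45-6789"}
inspect_config = {
    # Three of the 150+ built-in infoTypes; see the infoType detector reference for the full list.
    "info_types": [
        {"name": "PERSON_NAME"},
        {"name": "CREDIT_CARD_NUMBER"},
        {"name": "US_SOCIAL_SECURITY_NUMBER"},
    ],
    "min_likelihood": dlp_v2.Likelihood.POSSIBLE,
    "include_quote": True,
}

response = dlp.inspect_content(
    request={"parent": parent, "inspect_config": inspect_config, "item": item}
)
for finding in response.result.findings:
    print(finding.info_type.name, finding.likelihood, finding.quote)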

Once you have visibility into your sensitive data, there are many options to help remediate issues or reduce your overall data risk. For example, you can use IAM to restrict access to datasets or tables, or leverage BigQuery policy tags to set fine-grained access policies at the column level. Our Cloud DLP platform also provides a set of tools to run on-demand, deep, and exhaustive inspections of data, and can help you obfuscate, mask, or tokenize data to reduce overall data risk. This capability is particularly important if you’re using data for analytics and machine learning, since sensitive data must be handled appropriately to ensure your users’ privacy and compliance with privacy regulations.
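As a rough illustration of the obfuscation path, here is a minimal sketch, again assuming the google-cloud-dlp Python client, that masks detected values with a character-mask transformation; the project ID and input text are placeholders.

from google.cloud import dlp_v2

# Minimal sketch: mask detected values with '#' before the text reaches downstream analytics.
dlp = dlp_v2.DlpServiceClient()
parent = "projects/your-project-id/locations/global"  # placeholder project ID

item = {"value": "Contact [email protected], card 4111 1111 1111 1111"}
inspect_config = {
    "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "CREDIT_CARD_NUMBER"}]
}
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "primitive_transformation": {
                    "character_mask_config": {"masking_character": "#"}
                }
            }
        ]
    }
}

response = dlp.deidentify_content(
    request={
        "parent": parent,
        "inspect_config": inspect_config,
        "deidentify_config": deidentify_config,
        "item": item,
    }
)
print(response.item.value)  # detected values are replaced with '#' characters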

How to get started

Automatic DLP can be turned on for your entire organization, selected organization folders, or individual projects. To learn more about these new capabilities or to get started today, open the Cloud DLP page in the Cloud Console and check out our documentation.
