Analytics Instrumentation Guide
Analytics Instrumentation Overview
At GitLab, we collect product usage data to help us build a better product. This data helps GitLab understand which parts of the product need improvement and which features we should build next. Product usage data also helps our team better understand why people use GitLab. With this knowledge, we are able to make better product decisions.
Several stages and teams are involved in going from collecting data to making it useful for our internal teams and customers.
| Stage | Description | DRI | Support Teams |
| --- | --- | --- | --- |
| Privacy Settings | The implementation of our Privacy Policy, including data classification, data access, and user settings to control what data is shared with GitLab. | Analytics Instrumentation | Legal, Data |
| Collection | The data collection tools used across all GitLab applications, including GitLab SaaS, GitLab self-managed, CustomerDot, VersionDot, and about.gitlab.com. Our current tooling includes Snowplow, Service Ping, and Google Analytics. | Analytics Instrumentation | Infrastructure |
| Extraction | The data extraction tools used to extract data from Product, Infrastructure, and Enterprise Apps data sources. Our current tooling includes Stitch, Fivetran, and custom tooling. | Data | |
| Loading | The data loading tools used to extract data from Product, Infrastructure, and Enterprise Apps data sources and to load it into our data warehouse. Our current tooling includes Stitch, Fivetran, and custom tooling. | Data | |
| Orchestration | The orchestration of extraction and loading tooling to move data from sources into the Enterprise Data Warehouse. Our current tooling includes Airflow. | Data | |
| Storage | The Enterprise Data Warehouse (EDW), which is the single source of truth for GitLab's corporate data, performance analytics, and enterprise-wide data such as Key Performance Indicators. Our current EDW is built on Snowflake. | Data | |
| Transformation | The transformation and modelling of data in the Enterprise Data Warehouse in preparation for data analysis. Our current tooling is dbt and Python scripts. | Data | Analytics Instrumentation |
| Analysis | The analysis of data in the Enterprise Data Warehouse using a querying and visualization tool. Our current tooling is Tableau. | Data, Product Data Insights | Analytics Instrumentation |
| Post Launch Instrumentation | Increase product instrumentation across our features to deliver greater product insights. There is a need to retroactively evaluate which features from past launches have been instrumented and which still need instrumentation. Post-launch instrumentation will allow us to gather insights that are currently being missed and allow our CSM team to assist customers seeking to understand feature usage and adoption within their organizations. | Product, Product Data Insights | Analytics Instrumentation |
2024-05-16: last page update
Our Commitment to Individual User Privacy in relation to Service Usage Data
While there are examples of data collection used for malicious intent, data collection and analysis have also allowed companies to improve their products and services, benefiting their end users and consumers. It is in this vein that GitLab collects usage data about its products. We collect individual usage data in a pseudonymized manner at the namespace level and then use this information to power our product decisions and improve GitLab for you. We may also aggregate all this information to understand broadly how the GitLab product is used.
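To make the idea of pseudonymized, namespace-level collection more concrete, here is a minimal Python sketch of one common approach: replacing a raw identifier with a keyed hash before it is attached to a usage event. This is an illustration only and not a description of GitLab's actual implementation; the function names, the `SECRET_KEY` value, and the event fields are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret for illustration; a real system would load this
# from secure configuration rather than hard-coding it.
SECRET_KEY = b"example-secret-key"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash (HMAC-SHA256) that stands in for the raw value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def build_event(namespace_id: int, action: str) -> dict:
    """Build a hypothetical usage event that carries only a pseudonymized namespace ID."""
    return {
        "namespace": pseudonymize(str(namespace_id)),
        "action": action,
    }


event = build_event(42, "merge_request_created")
print(event)
```

Because the same input and key always produce the same hash, events from one namespace can still be grouped and counted, while the raw identifier never appears in the event itself.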