Data Engineers -> Kimball :: Data Consumers -> Metric Trees
A simple way to think about metric trees: they have the potential to transform data consumption in the same way that modeling frameworks like Kimball transformed the work of data engineers and BI developers. Read more about the analogy here: https://lnkd.in/eE5KaH9m
-
Data Modeling Techniques for the Post-Modern Data Stack A set of generic techniques and principles to design a robust, cost-efficient, and scalable data model for your post-modern data stack. https://lnkd.in/gFPzGsCQ
Data Modeling Techniques for the Post-Modern Data Stack
towardsdatascience.com
-
Good article about how to build successful Data Teams, but I would like to highlight something about Data Platform building blocks: "The building blocks don't have to support every feature needed by other data teams but should adhere to the 80/20 rule. By covering 80% of use-cases with common building blocks, the platform team enables the data teams to quickly deliver most standard use-cases. If a data team needs to deviate from the standard process, they can, but they'll have to re-implement some of the common building blocks for their use-case." #DataPlatform #DataTeams #DataEngineering #Data
The building blocks of successful Data Teams
medium.com
-
#21DaysofData #Day10 Wow, day 10! Almost half-way there! 😂 Today I want to talk about Data Modelling. Data modelling is the process of connecting and structuring data to form a single model. With unique columns/keys (primary and foreign keys) you can form connections between different datasets and pull information easily from each one. Through data modelling you create relationships between datasets/tables. Relationships define how tables are connected to each other; they can be one-to-one, one-to-many, or many-to-many (see the small sketch after the link below). This is a comprehensive resource for better understanding data modelling: https://lnkd.in/dcXHhHzy See you tomorrow 😎
Data Modeling in Power BI Tutorial
datacamp.com
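To make the one-to-many idea concrete, here is a minimal sketch, using pandas purely for illustration; the table and column names are invented, not taken from the post or the linked tutorial:

```python
import pandas as pd

# Dimension table: one row per customer (customer_id acts as the primary key).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ada", "Grace", "Linus"],
})

# Fact table: many orders per customer (customer_id acts as the foreign key),
# i.e. a one-to-many relationship from customers to orders.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "customer_id": [1, 1, 2, 3],
    "amount": [25.0, 40.0, 15.5, 99.9],
})

# The foreign key lets us pull information from both tables at once.
report = orders.merge(customers, on="customer_id", how="left")
print(report)
```

The same key-based idea carries over to any BI tool: the relationship is defined once on the shared key, and queries then combine the tables without duplicating data.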
-
Looking to supercharge your data lakehouse? Discover key insights and strategies in our latest blog post to optimize your data lakehouse for intelligent applications.
June 2024: Unfreeze Your Data Lakehouse to Power Intelligent Applications
singlestore.com
-
Data modeling tip of the day: eventify everything. Trust me, it's going to make your life so much nicer.
A data warehouse needs to be time variant (among other things) and be able to answer not only questions about the current state, but also about how things were. That time-travel capability is what unlocks so many valuable use cases.
I know, time travel is not the first thing on many people's minds when they start building the first iterations of some sort of data warehouse. But trust me, we will all run into the moment where we first go: "Oh snap, I wish we'd recorded the change history."
For that reason, make sure everything comes in as an event telling that something happened, be it some sort of activity or a change in some state. If the source system and/or ETL tool of your choosing does not support this out of the box, implement it yourself. Learn to work with these kinds of event streams: do AS OF joins and reconstruct snapshots of the whole system at different points in time. This stuff will pay off! And these, funnily enough, are the type of queries that ChattyG struggles to generate correctly!
And as Robert Harmon so often reminds us: the platforms we have today are insanely fast and cost efficient, so working with this kind of data is super easy. 🤠 now let's head back to the future 🤠
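A minimal sketch of what an AS OF join over such an event stream can look like, assuming pandas and using invented entity/column names (the post itself does not prescribe a tool):

```python
import pandas as pd

# Change events: every time an account's status changes, we record an event
# instead of overwriting the row.
events = pd.DataFrame({
    "account_id": [1, 1, 2, 1],
    "changed_at": pd.to_datetime(
        ["2024-01-01", "2024-03-15", "2024-02-10", "2024-06-01"]),
    "status": ["trial", "paid", "trial", "churned"],
}).sort_values("changed_at")

# Questions we want answered "as of" particular points in time.
as_of = pd.DataFrame({
    "account_id": [1, 1, 2],
    "as_of_date": pd.to_datetime(["2024-02-01", "2024-07-01", "2024-03-01"]),
}).sort_values("as_of_date")

# For each (account, date), pick the latest event at or before that date,
# reconstructing the historical state from the event stream.
snapshot = pd.merge_asof(
    as_of, events,
    left_on="as_of_date", right_on="changed_at",
    by="account_id", direction="backward",
)
print(snapshot)
```

The same pattern scales up in warehouse SQL engines that support AS OF / point-in-time joins; the key idea is simply "latest event at or before the timestamp of interest".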
-
Building a Data Platform in 2024
How to build a modern, scalable data platform to power your analytics and data science projects (updated)
Building a Data Platform in 2024
towardsdatascience.com
-
Having a solid data design/solution isn't just a nice-to-have; it's a must. Whether you're starting a new project or fine-tuning an existing system, asking the right questions early on can save you headaches down the road. Some of the best questions are:
1. What are the different sources of data?
2. What is the size of the data?
3. Where does the data reside today?
4. Do we need to migrate the data between different tools?
5. How do we connect to the data?
6. What kind of transformations are required for the data?
7. How often does the data get updated?
8. Should we expect a schema change in the new data?
9. How do we monitor the pipelines for failures?
10. Do we need to create a notification system for failures?
11. Do we need to add a retry mechanism for failures?
12. What is the timeout strategy for failures?
13. How do we run back-dated pipelines if there are failures?
14. How do we deal with bad data?
15. What strategy should we follow – ETL or ELT?
16. How can we save on computation costs?
17. How will we manage and scale this solution as our data needs grow?
Feel free to add more🙂 https://lnkd.in/gFa9GJGw #Dataengineer #Dataanalytics #bestpractices #dataarchitect
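As a rough illustration of how questions 9–12 above might translate into code, here is a small Python sketch of a retry wrapper with a timeout and a notification hook. The step and notify() target are hypothetical placeholders, not part of the original post:

```python
import time

def notify(message: str) -> None:
    # Placeholder notification hook: in practice this might post to Slack,
    # PagerDuty, or email, depending on the answer to question 10.
    print(f"[ALERT] {message}")

def run_with_retries(step, max_retries=3, backoff_seconds=30, timeout_seconds=3600):
    # Retry mechanism (question 11) with a simple overall timeout (question 12).
    deadline = time.monotonic() + timeout_seconds
    for attempt in range(1, max_retries + 1):
        try:
            return step()
        except Exception as exc:
            if attempt == max_retries or time.monotonic() > deadline:
                # Surface the failure so monitoring (question 9) can pick it up.
                notify(f"Pipeline step failed after {attempt} attempts: {exc}")
                raise
            time.sleep(backoff_seconds * attempt)  # simple linear backoff

def load_orders():
    # Hypothetical pipeline step; replace with your own extract/transform/load logic.
    ...

# run_with_retries(load_orders)
```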
-
🔍 Schema Design & Partitioning: Why They’re Game-Changers in Data 🔍 As data analysts, we all want our systems to be faster, more efficient, and—let’s be honest—a bit easier to work with. Here’s where schema design and partitioning come in. 💡 Why Schema Design Matters Organized Data - A solid schema is like a roadmap, making it easy to know where data lives and how to get to it. Faster Queries - A well-structured schema keeps things tidy, which helps your queries run faster and prevents headaches down the road. Data Quality - It makes sure data isn’t duplicated or messed up, so what you see is accurate and reliable. 🚀 Why Partitioning is a Power Move Speeds Up Analysis - By breaking data into partitions (say, by month or region), we can scan just what we need instead of the whole dataset. Less time and resources spent on queries! Cost-Effective - Fewer scans mean lower costs—huge when working with big datasets. Simpler Data Management - Partitioning helps with regular cleanups, so old data doesn’t slow down the system. Schema design + partitioning = efficient, reliable data that’s ready for action. Curious to know: how are you using these tactics to keep your data work fast and clean? Let’s swap ideas! #DataEngineering #SchemaDesign #Partitioning #DataOptimization #DataAnalytics #DataScience