An introduction to the medallion architecture for data lakehouses:

🥉 Bronze:
- All the data. Is it useful data? TBD...
- It's an extremely accurate... depiction of your data swamp.
- Where you first find out that product engineers changed the schema.

🥈 Silver:
- Analogous to an Italian restaurant, as there are spaghetti DAGs everywhere.
- There is a whisper of a data model, but it's muffled by all the CASE WHENs.
- Essentially a giant game of "telephone" to replicate upstream business logic.

🥇 Gold:
- Practically speaking, the staging area for data to be replicated into Excel.
- Aggregate tables that power the CEO's dashboard (looks at it once).
- Assumes the data in the previous steps is correct...

What did we miss?

#data #ai
This is hilarious and painfully accurate! Especially the "data swamp" comment in the Bronze layer. Perhaps we can add a "pre-bronze" layer: The land of broken CSVs and missing documentation.
The Gold layer is quite spicy. “Staging area for data to be replicated into Excel” 😂😂😂
Love this, the Gold layer especially 😂 And not to forget the Bronze layer: "All the data. Is it useful data? TBD..." 😄