Stale data is pure cost: you pay to store it, and you pay again when somebody uses it by mistake. Every organization needs a policy for finding and archiving unmaintained datasets.

If the data lake holds many datasets that look almost the same, some of them are probably stale. The best way to find them is to monitor data freshness. Datasets whose records carry recent timestamps are most likely still maintained. Anything that has not been updated recently is likely stale.

The problem is dictionary data, which often has no update timestamps. You can add an "updated at" timestamp column to see whether the table is active. If the table is inactive and there is no record of anyone using it, do not delete it right away; only restrict access to it. If somebody still uses it, they will raise a ticket.

Stale datasets that are not archived consume storage, but that is not the most significant cost. The real cost is the time lost by users who integrate these tables into their dashboards or models and then have to redo the whole work when they realize the data is outdated.

#dataquality #dataengineering #datagovernance
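For anyone who wants to try the freshness check, here is a minimal sketch of the idea, not a finished tool. It assumes you have already collected the newest "updated at" timestamp for each table (for example with a `MAX(updated_at)` query per table). The function name `classify_freshness`, the 90-day threshold, and the table names are all made up for illustration. Tables with no timestamp at all, like dictionary data, land in a separate "unknown" bucket instead of being called stale.

```python
from datetime import datetime, timedelta, timezone


def classify_freshness(last_updated, max_age_days=90, now=None):
    """Group tables into fresh / stale / unknown by their newest timestamp.

    last_updated maps a table name to the newest "updated at" value found
    in that table, or None when the table has no timestamp column.
    The 90-day default is an arbitrary assumption; pick what fits your data.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    report = {"fresh": [], "stale": [], "unknown": []}
    for table, newest in sorted(last_updated.items()):
        if newest is None:
            # No timestamp to judge by (typical for dictionary data).
            report["unknown"].append(table)
        elif newest < cutoff:
            report["stale"].append(table)
        else:
            report["fresh"].append(table)
    return report


# Hypothetical tables and timestamps, with a fixed "now" so the result is stable.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
report = classify_freshness(
    {
        "orders": datetime(2024, 5, 30, tzinfo=timezone.utc),
        "legacy_customers": datetime(2022, 1, 15, tzinfo=timezone.utc),
        "country_codes": None,
    },
    max_age_days=90,
    now=now,
)
print(report)
# {'fresh': ['orders'], 'stale': ['legacy_customers'], 'unknown': ['country_codes']}
```

The "stale" list is only a list of candidates: the next step is still to check usage records and restrict access before archiving anything. The "unknown" list shows which tables would need an "updated at" column before they can be judged at all.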
Hey Piotr! Love your take on stale data management! 😄 Here's a twist: what if those old datasets could serve as a 'time capsule' for spotting long-term trends or unexpected patterns? Sometimes yesterday's info gives tomorrow's insights! Ever thought about leveraging them this way?
Conorteky agree, the frustration and time that users spend on the wild goose chase, building dashboards and models with data that’s as outdated as dial-up internet… I always say, restrict access and see who notices, if no one complains, it probably wasn’t worth keeping in the first place 🙈
Absolutely. Stale data not only wastes storage but also time. And keeping data fresh isn’t just good practice, it’s essential.
Bridging the gap between data and strategy ✦ Head of Data Strategy @ Profusion ✦ Author of The Data Ecosystem newsletter ✦ R Programmer ✦ Policy Nerd
This type of thing needs to be an essential part of data governance and management. But honestly, I never see it, which creates redundant costs and opens you up to a lot of risk.