Data cleaning is the process of correcting or removing invalid values. There are several groups of data cleaning techniques that you can apply:

👉 Data format standardization methods help to correct small errors without losing information.
👉 Value modification methods convert values to be usable, but at the cost of losing some precision.
👉 Missing values are fixed by data enrichment, such as looking up values from other sources.
👉 Invalid data removal methods focus on removing incorrect records or values.
👉 All other typical data quality errors are handled by detecting them with data quality checks.

Most of these methods can be automated, so data cleaning can run largely unattended.

#dataquality #dataengineering #datagovernance
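To make the groups above concrete, here is a minimal Python sketch that applies one step from each group to a tiny list of records. Everything in it is an assumption made for illustration: the field names (`city`, `country`, `signup_date`, `amount`), the `CITY_TO_COUNTRY` lookup table, and the helper names are hypothetical and not taken from any specific tool.

```python
from datetime import datetime

# Hypothetical reference data standing in for "another source" used for enrichment.
CITY_TO_COUNTRY = {"Paris": "FR", "Berlin": "DE"}

# Date layouts we assume may appear in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def standardize_date(text):
    """Format standardization: rewrite a date as ISO 8601 without losing information."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # not a valid date in any known layout


def clean(records):
    """Apply one example step from each group of cleaning methods."""
    cleaned = []
    for rec in records:
        row = dict(rec)  # work on a copy, leave the input untouched
        # 1. Format standardization
        row["city"] = row["city"].strip()
        row["signup_date"] = standardize_date(row["signup_date"])
        # 2. Value modification: usable integer, at the cost of precision
        row["amount"] = round(float(row["amount"]))
        # 3. Enrichment: fill a missing country from the lookup table
        if not row.get("country"):
            row["country"] = CITY_TO_COUNTRY.get(row["city"])
        # 4. Invalid data removal: drop records that cannot be repaired
        if row["signup_date"] is None or row["amount"] < 0:
            continue
        cleaned.append(row)
    return cleaned


def count_missing(records, field):
    """Data quality check: count records where a field is still empty."""
    return sum(1 for r in records if not r.get(field))


raw = [
    {"city": " Paris", "country": "", "signup_date": "03/02/2024", "amount": "19.99"},
    {"city": "Berlin", "country": "DE", "signup_date": "2024-02-30", "amount": "5"},
    {"city": "Oslo", "country": None, "signup_date": "2024-01-15", "amount": "-4.2"},
    {"city": "Rome", "country": None, "signup_date": "2024-01-15", "amount": "7.4"},
]

cleaned = clean(raw)
missing_country = count_missing(cleaned, "country")
```

In this sample, the Berlin record is removed because February 30 is not a real date, and the Oslo record is removed because of its negative amount. The Rome record survives but still has no country, which is exactly the kind of leftover issue a data quality check should flag.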
Check out my free eBook "A step-by-step guide to improve data quality" for hints on detecting data quality issues: https://dqops.com/best-practices-for-effective-data-quality-improvement/