Data hygiene is the set of processes used to keep data clean. Data is considered clean if it is relatively error-free. Dirty data can result from a number of factors, including duplicate records, incomplete or outdated data, and the improper parsing of record fields from disparate systems. Errors can be introduced at any stage as data is entered, stored, and managed.
Data quality is crucial to operational and transactional processes within the enterprise and to the reliability of business analytics (BA) / business intelligence (BI) reporting.
Data scrubbing, also called data cleansing, is the process of amending or removing data in a database that is incorrect, incomplete, improperly formatted, or duplicated. Typically, the process involves updating the data, standardizing it, and de-duplicating records to create a single view of the data, even if it is stored in multiple disparate systems.
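The scrubbing steps described above (standardize, remove incomplete records, de-duplicate) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `name` and `email` fields, the choice of email as the de-duplication key, the `crm` and `billing` sample lists, and the rule that a record missing either field is dropped rather than repaired. Real cleansing rules depend on the data and the systems involved.

```python
from typing import Iterable


def standardize(record: dict) -> dict:
    """Normalize formatting: collapse whitespace in the name, lowercase the email."""
    # Hypothetical schema: each record has optional "name" and "email" fields.
    name = " ".join((record.get("name") or "").split())
    email = (record.get("email") or "").strip().lower()
    return {"name": name, "email": email}


def scrub(records: Iterable[dict]) -> list[dict]:
    """Standardize records, drop incomplete ones, and keep one record per email."""
    seen_emails = set()
    clean = []
    for raw in records:
        rec = standardize(raw)
        # Assumed rule: a record missing either field is incomplete and removed.
        if not rec["name"] or not rec["email"]:
            continue
        # Assumed rule: the standardized email identifies a duplicate.
        if rec["email"] in seen_emails:
            continue
        seen_emails.add(rec["email"])
        clean.append(rec)
    return clean


# Illustrative records from two hypothetical source systems.
crm = [
    {"name": "  Ada   Lovelace ", "email": "ADA@Example.com"},
    {"name": "Alan Turing", "email": ""},
]
billing = [
    {"name": "Ada Lovelace", "email": "ada@example.com "},
    {"name": "Grace Hopper", "email": "grace@example.com"},
]

single_view = scrub(crm + billing)
```

In this sketch, the two differently formatted Ada Lovelace entries collapse into one record after standardization, and the entry with an empty email is removed as incomplete. Standardizing before de-duplicating matters: without it, formatting differences between systems would hide the duplicates.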