What to expect of Data Quality (Tools)
There is no scientific, objective measure of quality. Rather, “quality” implies that something is fully suitable for its intended purpose. J. M. Juran provided one of the best-known definitions of quality: “High-quality data are data that are fit for use in their intended operational, decision-making, planning, and strategic roles” (1). Juran published this definition back in 1951, yet it still holds true today.
Data quality is not an invariant measure, since it depends on the intended use of the data. One of the major and most challenging tasks is to ensure high data quality and to enable the deployment of data quality tools. The intended purpose of the data and the necessary breadth and depth of information have to be determined and continuously verified to achieve the expected data quality. Only then can the adoption of data quality tools really improve processes.
Expectations of data quality tools are very high. They are expected to automate as many tasks as possible in order to support the professionals who are in charge of data maintenance. Quite often, because of the extensive capabilities of good data quality tools, customers also expect them to perform functions that are characteristic of ETL tools (2), i.e. tools that specialize in extracting information from different data sources. The main purpose of data quality tools, however, is to support customers in maintaining the quality of data records throughout their entire life cycle and to provide standard reports that assist data administrators and managers in their respective tasks.
1: J. M. Juran, Quality Control Handbook, New York, NY: McGraw-Hill, 1951
2: ETL: Extract, Transform, Load
First of all, policies and rules for the intended data quality must be established. Data quality tools should then offer the following key features in order to support data quality processes within a company:
The validation process involves continuously authenticating and verifying records throughout their entire life cycle against given reference rules. A data quality tool should automate the validation process in its entirety.
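To make the idea of validating records against reference rules more concrete, here is a minimal sketch in Python. It is an illustration only, not the way any particular data quality tool works: the `Rule` class, the `validate` function, and the field names `ean`, `description`, and `weight_kg` are all hypothetical and chosen purely for the example.

```python
from dataclasses import dataclass
from typing import Callable

# A record is modelled as a plain dictionary (assumption for this sketch).
Record = dict[str, object]


@dataclass(frozen=True)
class Rule:
    """One reference rule: a name, the field it inspects, and a check."""
    name: str
    field: str
    check: Callable[[object], bool]


def validate(record: Record, rules: list[Rule]) -> list[str]:
    """Return the names of all rules that the record violates."""
    return [r.name for r in rules if not r.check(record.get(r.field))]


# Hypothetical reference rules for a product record.
RULES = [
    Rule("ean_is_13_digits", "ean",
         lambda v: isinstance(v, str) and v.isdigit() and len(v) == 13),
    Rule("description_present", "description",
         lambda v: isinstance(v, str) and v.strip() != ""),
    Rule("weight_positive", "weight_kg",
         lambda v: isinstance(v, (int, float)) and v > 0),
]

record = {"ean": "4006381333931", "description": "", "weight_kg": 0.25}
violations = validate(record, RULES)
print(violations)  # ['description_present']
```

Because the rules are plain data, the same check can be re-run whenever a record changes, which is what continuous validation throughout the life cycle would require.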
A data quality tool should visualize all results of the validation process in order to support data administrators; that is, it should show which information in the records needs to be altered in order to achieve the desired data quality.
Results of data quality assessments should not only assist data administrators in their tasks; they should also provide management with an overview of the overall state of data maintenance, i.e. the current state of data quality and its scope.
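As a rough illustration of such a management overview, the following Python sketch condenses per-record validation results into a few summary figures. The input format (a mapping from record ID to the list of violated rule names), the `summarize` function, and all IDs and rule names are assumptions made for this example.

```python
from collections import Counter


def summarize(results: dict[str, list[str]]) -> dict[str, object]:
    """Aggregate per-record rule violations into an overview."""
    total = len(results)
    # A record counts as clean if it violates no rule at all.
    clean = sum(1 for violated in results.values() if not violated)
    per_rule = Counter(name for violated in results.values() for name in violated)
    return {
        "records_checked": total,
        "records_clean": clean,
        "clean_share": clean / total if total else 0.0,
        "violations_per_rule": dict(per_rule),
    }


# Hypothetical validation results: record ID -> violated rule names.
results = {
    "A-100": [],
    "A-101": ["description_present"],
    "A-102": ["description_present", "weight_positive"],
    "A-103": [],
}

report = summarize(results)
print(report)
```

The per-record lists serve data administrators as a worklist, while the aggregated figures give management the current state at a glance.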
If a data quality tool meets all of these requirements, it will speed up processes, relieve data administrators of the continuous strain of monitoring and analysing data, and provide management with comprehensive reports. Data quality tools can thus support processes throughout the entire life cycle of a product.
Project Manager eBusiness
Data & Applications
Einkaufsbüro Deutscher Eisenhändler GmbH (EDE)