Data Observability: ETL Monitoring and Data Validation (Clario)

December 21, 2022

The Need

Clario, a provider of health outcomes research, delivers mission-critical, software-enabled clinical research facilities and services, including cardiac safety, electronic clinical outcomes, and suicide risk assessments. Clario needed quality data capture and monitoring throughout its ETL process to ensure reliable loads into its Reltio Master Data Management (MDM) solution. With strict internal SLAs, error reporting at the point of ingestion from the various sources was critical, because once data had moved downstream to Tableau reporting it was often too late to fix quickly. Lack of trust in the data at the reporting level caused major issues with clients who depend on Clario for clean and timely data.

The Solution

RDt was already installed at Clario, performing classic data quality and validation, when the Tech Director recommended extending its use to broader monitoring. RDt was then used to monitor the ETL logs and to provide process protection and remediation if any of the data failed. This went beyond monitoring and became an important part of the DevOps-to-DataOps cycle.

In addition, because the data could be designated as either active or inactive, RDt could monitor against that process metric and provide a validation profile from the upstream sources all the way downstream to the repository where the data was staged for import into Tableau. Clario leveraged the RDt platform to unify data quality across the enterprise with an automated process.

ETL Monitoring with Source-to-Target Reconciliation

Impact

Robust monitoring of ETL-produced data can be challenging because ingesting from different sources complicates the workflow: not all sources are alike. Being able to quickly identify the source of errors, with a metric and an alarm in place before the data reaches the final repository, saves time. A further impact is greater trust among internal clients, whose SLAs often allow only a few hours for remediation.
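
As a rough illustration of a metric and alarm that act before data reaches the final repository, the sketch below computes a per-source error rate for one batch and lists any source above a threshold. It is not RDt's implementation. The record layout, the source names, the helper functions error_rates and sources_to_alarm, and the 5% threshold are all assumptions made for this example.

```python
from collections import Counter

def error_rates(records):
    """Return {source: fraction of records flagged as failed} for one batch."""
    total = Counter()
    failed = Counter()
    for rec in records:
        total[rec["source"]] += 1
        if rec["status"] == "failed":
            failed[rec["source"]] += 1
    return {src: failed[src] / total[src] for src in total}

def sources_to_alarm(records, threshold=0.05):
    """List sources whose error rate exceeds the (assumed) threshold."""
    rates = error_rates(records)
    return sorted(src for src, rate in rates.items() if rate > threshold)

# Made-up batch: two hypothetical sources, one failed record.
batch = [
    {"source": "site_a", "status": "ok"},
    {"source": "site_a", "status": "ok"},
    {"source": "site_b", "status": "failed"},
    {"source": "site_b", "status": "ok"},
]
print(sources_to_alarm(batch))  # ['site_b']
```

Keeping the metric per source reflects the point above that not all sources are alike: a single aggregate error rate could hide one misbehaving feed.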

The RightData Edge

Clario provided feedback on the RDt software and its ability to perform both data validation and data monitoring. The edge RightData provides strengthens the position of the internal data team responsible for data integrity and quality: if something goes wrong, they can fix it easily, and the software platform makes that possible. Clario explains further: “There’s a lot of pressure to perform when we are meeting both internal and external data expectations for major pharm clients. We’ve been in business since 1972 and a hallmark of Clario is that clients trust us. Today, we have to demonstrate that they can trust every aspect of data as well… RightData is very much a part of that.”

Learn more about RDt

RDt is a comprehensive platform for data quality, risk, or compliance needs. Learn more or contact us to chat about your needs.

RDt Data Quality: A no-code data quality suite that improves the quality, reliability, consistency, and completeness of data. Data quality is a complex journey in which teams rely on metrics and reporting to validate their work, using powerful features such as:

Database Analyzer: Using Query Builder and Data Profiling, stakeholders analyze the data before using corresponding datasets in the validation and reconciliation scenarios.

Data Reconciliation: Comparing Row Counts. Compares the number of rows between source and target dataset pairs and identifies the tables whose row counts do not match.

Data Validation: A rules-based engine provides an easy interface for creating validation scenarios that define validation rules against target datasets and capture exceptions.

Connectors for All Types of Data Sources: More than 150 connectors for databases, applications, events, flat file data sources, cloud platforms, SAP sources, REST APIs, and social media platforms.
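
To make the row-count comparison described under Data Reconciliation concrete, here is a minimal sketch of the general technique. It does not reflect how RDt works internally. The function name reconcile_row_counts, the table names, and the counts are hypothetical, and the counts are assumed to have been gathered already (for example, with a COUNT(*) query per table on each side).

```python
def reconcile_row_counts(source_counts, target_counts):
    """Compare row counts per table and return the tables that do not match.

    Tables missing on either side are treated as having zero rows.
    """
    mismatches = {}
    for table in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(table, 0)
        tgt = target_counts.get(table, 0)
        if src != tgt:
            mismatches[table] = {"source": src, "target": tgt, "diff": src - tgt}
    return mismatches

# Made-up counts for illustration only.
source = {"patients": 1200, "visits": 5400, "sites": 35}
target = {"patients": 1200, "visits": 5388}

for table, info in reconcile_row_counts(source, target).items():
    print(table, info)
# sites {'source': 35, 'target': 0, 'diff': 35}
# visits {'source': 5400, 'target': 5388, 'diff': 12}
```

Matching row counts are a completeness check only. Two tables can hold the same number of rows and still differ in content, which is why row-count reconciliation is typically paired with rule-based validation.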

Data Quality: Ongoing discovery that requires a quality-oriented culture to improve the data and a commitment to continuous process improvement.

Database Profiling: Digging deep into the data source to understand the content and the structure.

Data Reconciliation: An automated data reconciliation and validation process that checks the completeness and accuracy of your data.

Data Health Reporting: A process in which the health and accuracy of your data are measured with dashboards against metrics and business rules, usually with specific visualizations.
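
The idea of validation rules that capture exceptions and feed a health metric can be sketched as follows. This is an illustration under stated assumptions, not RDt's rules engine. The validate function, the rule names, the subject_id field, and the sample rows are hypothetical; only the active/inactive designation comes from the case study above.

```python
def validate(rows, rules):
    """Apply each named rule to each row; return (exceptions, pass_rate).

    pass_rate is the fraction of rows that satisfy every rule.
    """
    exceptions = []
    for index, row in enumerate(rows):
        for name, check in rules.items():
            if not check(row):
                exceptions.append({"row": index, "rule": name})
    failed_rows = {exc["row"] for exc in exceptions}
    pass_rate = 1 - len(failed_rows) / len(rows) if rows else 1.0
    return exceptions, pass_rate

# Hypothetical rules expressed as plain predicates over a row.
rules = {
    "subject_id_present": lambda r: bool(r.get("subject_id")),
    "status_is_known": lambda r: r.get("status") in {"active", "inactive"},
}

# Made-up rows for illustration only.
rows = [
    {"subject_id": "S-001", "status": "active"},
    {"subject_id": "", "status": "inactive"},
    {"subject_id": "S-003", "status": "pending"},
    {"subject_id": "S-004", "status": "inactive"},
]

exceptions, pass_rate = validate(rows, rules)
print(exceptions)
print(pass_rate)  # 0.5
```

The exception list records which row broke which rule, which is what a remediation step needs. The pass rate is the kind of single number a health dashboard could chart over time.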