Unified Data Quality

Do you catch bad data, or does bad data catch you by surprise?

Anomaly Detection

Rapidly apply built-in predictive models to discover problems. Offload data science tasks such as feature selection, bucketing, and binning. Select and run machine learning models with just a few clicks. From data discovery to complex predictions, OwlDQ offers algorithms for duplicate detection, cross-column categorical patterns, and outlier analysis.
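
To make the idea of outlier analysis concrete, here is a minimal sketch using a modified z-score based on the median absolute deviation. It is not OwlDQ's algorithm; the function name `mad_outliers`, the 3.5 threshold, and the sample row counts are all assumptions chosen for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Hypothetical helper for illustration only, not part of OwlDQ.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # No spread to measure against, so nothing can be flagged.
        return []
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up daily row counts; the last load is suspiciously small.
daily_row_counts = [1020, 998, 1011, 1005, 990, 1003, 12]
flagged = mad_outliers(daily_row_counts)
print(flagged)
```

A median-based score is used here because a single extreme value barely shifts the median, whereas it can inflate a mean and standard deviation enough to hide itself.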

Trying to guess where our outliers were and then writing code reactively to catch them didn't work. We needed a way to find our blind spots.

Autogenerate Rules

One of the most time-consuming aspects of data quality is writing the actual rules. OwlDQ can replace the task of implementing handwritten rules for standard technical data quality constraints. Rule conditions are adaptive, so if your data has natural variance, the models automatically adjust over time. For most, this means a 50%-70% reduction in the total number of rules. Reducing 'data quality' technical debt is the primary reason for modernizing existing rule-based solutions.
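
As a rough illustration of what an adaptive rule condition could look like, the sketch below learns an acceptable band from recent history instead of using a hard-coded limit. It is an assumption-laden example, not OwlDQ's implementation: the class `AdaptiveRangeRule`, its parameters (`window`, `k`, `min_history`, `min_band`), and the null-rate figures are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveRangeRule:
    """Hypothetical rule that passes values within k standard deviations
    of a rolling window of previously accepted values."""

    def __init__(self, window=30, k=3.0, min_history=5, min_band=0.0):
        self.history = deque(maxlen=window)  # oldest values drop off automatically
        self.k = k
        self.min_history = min_history
        self.min_band = min_band  # floor on the band when history has no variance

    def check(self, value):
        """Return True if value passes, and learn from it when it does."""
        if len(self.history) < self.min_history:
            passed = True  # not enough history yet to judge
        else:
            center = mean(self.history)
            band = max(self.k * pstdev(self.history), self.min_band)
            passed = abs(value - center) <= band
        if passed:
            # Only accepted values shape the future band.
            self.history.append(value)
        return passed


# Made-up daily null rates for one column.
rule = AdaptiveRangeRule()
warmup = [rule.check(r) for r in [0.010, 0.012, 0.011, 0.009, 0.010]]
normal_day = rule.check(0.011)
bad_day = rule.check(0.25)
print(normal_day, bad_day)
```

Because the band is recomputed from the rolling window, gradual drift in the data widens or shifts the accepted range without anyone editing the rule. Failing values are deliberately kept out of the history so that one bad load does not loosen the check.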

With Owl we were able to throw away over 50% of our rules. We removed all low-value null checks and simple conditions. Owl found and protected PII data in places we didn't think to look.

Reconciliation

Often the first and most common data routine is to move data from the source into the target system. This is growing even more popular with cloud migrations. OwlDQ checks that every cell of every record matches between copies, in addition to standard row count, column, and conformity checks. This is most commonly used when loading third-party data files, during cloud migrations, and after moving data to persistent storage.
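
A cell-level comparison between a source and a target copy can be sketched as follows. This is a simplified in-memory example under stated assumptions, not how OwlDQ performs reconciliation: the function `reconcile`, the `id` key column, and the sample rows are hypothetical, and a real job would run against the actual systems rather than Python lists.

```python
def reconcile(source, target, key):
    """Compare two lists of row dicts by key and report every difference.

    Hypothetical helper for illustration only, not part of OwlDQ.
    """
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}

    cell_mismatches = []
    for k in sorted(src.keys() & tgt.keys()):
        # Union of columns so a column dropped on one side is also reported.
        for col in sorted(src[k].keys() | tgt[k].keys()):
            if src[k].get(col) != tgt[k].get(col):
                cell_mismatches.append((k, col, src[k].get(col), tgt[k].get(col)))

    return {
        "row_count_match": len(source) == len(target),
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "extra_in_target": sorted(tgt.keys() - src.keys()),
        "cell_mismatches": cell_mismatches,
    }


# Made-up source and target copies of the same table.
source_rows = [
    {"id": 1, "name": "Ann", "balance": 100},
    {"id": 2, "name": "Bob", "balance": 250},
    {"id": 3, "name": "Cy", "balance": 75},
]
target_rows = [
    {"id": 1, "name": "Ann", "balance": 100},
    {"id": 2, "name": "Bob", "balance": 205},
    {"id": 4, "name": "Dee", "balance": 10},
]
report = reconcile(source_rows, target_rows, key="id")
print(report)
```

Note that the row counts match in this example even though the copies differ, which is why a count check alone is not enough to confirm a clean load.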

Owl's daily snapshot view revealed that our database tables were not in sync with the upstream source. Owl identified replication errors in our homegrown scripts and our commercial CDC tool.

Owl Runs Everywhere

Pipelines

Build rich DQ pipelines in Spark, Zeppelin, Jupyter, or Databricks.

AutoML

UI or command-line interface to detect data issues using unsupervised learning.

Low-Code

Curated analytics with automatic visuals and an easy-to-use wizard.

Cloud

Runs in containers on-prem or in AWS, Azure, GCP or Databricks.