Data Pipeline

Databricks Notebook

If you're looking for a way to add data quality checks to your Spark pipelines, look no further. OwlDQ is built on Spark from the ground up and is designed to deliver fast, high-quality analytics on DataFrames. Simply pass a Spark DataFrame to Owl to automatically add Profiles, Cataloging, Duplicates, Outliers, Relationships, Patterns and more to your DataFrames.

This is a great option for developer-savvy teams that want to control their data pipelines outside of a dashboard tool but still have access to the rich visuals of OwlDQ without having to build them. Data analysts and managers can observe the flow of data in the OwlDQ UI while developers stay in their favorite notebook. This gives teams real-time control when something strange happens in their data.

Finally, a tool that lets me control my code and pipelines but satisfies my Chief Data Officer's concerns about how we are handling data quality.

Provide transparency to analysts, and use the feedback loop in the Owl UI to make your DQ smarter.