Extracting Column-Level Lineage from SQL

Harshal Sheth

Nov 3, 2023


Data people really care about data lineage, particularly lineage derived from SQL.

We looked at a bunch of open-source tools for automated SQL lineage and found that many shared the same problem: they were unaware of the underlying table schemas, and hence couldn't generate accurate column-level lineage.
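To make the schema problem concrete, here is a minimal sketch in plain Python. It is not DataHub's parser or any real tool's API: the `SCHEMAS` dictionary, the `resolve_column` helper, and the example tables are all hypothetical, chosen only to show why an unqualified column in a join cannot be attributed to a single upstream table without schema information.

```python
from typing import Dict, List, Optional

# Hypothetical table schemas (assumption for illustration only);
# in practice these would come from a catalog or the warehouse itself.
SCHEMAS: Dict[str, List[str]] = {
    "orders": ["order_id", "customer_id", "amount"],
    "customers": ["customer_id", "name"],
}


def resolve_column(
    column: str,
    tables: List[str],
    schemas: Optional[Dict[str, List[str]]],
) -> List[str]:
    """Return candidate upstream columns for an unqualified column reference."""
    if schemas is None:
        # Schema-unaware: any table in the FROM clause could own the column.
        return [f"{table}.{column}" for table in tables]
    # Schema-aware: keep only the tables that actually define the column.
    return [
        f"{table}.{column}"
        for table in tables
        if column in schemas.get(table, [])
    ]


# Query being analyzed (as a comment, since this sketch does not parse SQL):
#   SELECT amount, name FROM orders JOIN customers USING (customer_id)
tables = ["orders", "customers"]

unaware = resolve_column("amount", tables, None)
aware = resolve_column("amount", tables, SCHEMAS)

print(unaware)  # two candidates: the upstream table is ambiguous
print(aware)    # one candidate: the schema settles it
```

Without schemas, `amount` could belong to either table, so the lineage edge is a guess. With schemas, only `orders.amount` remains.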

A metadata management platform and data catalog like DataHub already has APIs for retrieving the schema of any table in your data stack. So we built a schema-aware SQL lineage parser that uses DataHub’s APIs to generate accurate column-level lineage from SQL queries across a wide array of dialects.

The full article is posted on DataHubProject.io.
