DataHub Community Updates: March’23 Rundown

Metadata

Data Engineering

Open Source

Data

Project Updates

Maggie Hays

Apr 7, 2023

Hello, hello, DataHub Enthusiasts!

Ready to hear what the amazing DataHub Community has been up to in March? Let’s get straight to it!

The DataHub Community Continues to Grow and Thrive

We added over 350 members in March, and our Slack community grew to more than 6,700 members, with 1,000 active users every week!

DataHub Community Snapshot - March 2023

If you’re new to DataHub, I can’t recommend our Slack channel enough — you can be sure that this amazing community always has your back!

Coming Soon: DataHub Community Council (DCC)

I’ll say it again: the DataHub Community is growing FAST 😅😅 Let’s team up to keep up!

Soon, we’ll be rolling out the DataHub Community Council, a forum for DataHub Enthusiasts to collaborate closely with the Core DataHub Team.

As a member, you can:

  • Participate in private, expert panel discussions with the Core DataHub Team
  • Get early access to future open-source releases
  • Inform the DataHub strategy and influence our quarterly roadmap prioritization via a DCC forum
  • Get direct access to collaborate with other DCC members

…AND

  • Own and show off some Custom DCC Swag! 😎

Formal announcement to come. Watch this space for more updates!

Roadmap Updates and Release Highlights

We had some BIG deliverables to reach in Q1'23 — here’s where we landed by the end of March:

We recently released DataHub v0.10.1, with a continued focus on improving user and developer experience as well as metadata ingestion. Here are a few exciting highlights in v0.10.1:

Snowflake Tag and Term Propagation

We’re introducing a DataHub Action that will allow you to propagate tags and terms between entities that are connected via lineage — all the way into downstream warehouses like Snowflake.

Snowflake has comprehensive tagging capabilities of its own, so this improvement lets you attach tags to Snowflake tables in DataHub and have those tags propagated into Snowflake.

That way, tags managed in DataHub show up automatically in Snowflake, and you get all the benefits of Snowflake’s data management capabilities.
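To make the idea of propagating tags along lineage concrete, here is a minimal, self-contained sketch. It is not the DataHub Action itself and it does not talk to DataHub or Snowflake: the function name `propagate_tags`, the dictionary shapes, and the entity names are all hypothetical, chosen only to illustrate how tags attached upstream could flow to every downstream entity.

```python
from collections import deque


def propagate_tags(lineage, tags):
    """Return a new tag mapping with each entity's tags pushed downstream.

    lineage: dict mapping an entity name to its direct downstream entities.
    tags: dict mapping an entity name to the set of tags attached directly.
    Both shapes are assumptions made for this illustration.
    """
    # Start from a copy so the caller's mapping is left untouched.
    result = {entity: set(entity_tags) for entity, entity_tags in tags.items()}
    for source, source_tags in tags.items():
        # Breadth-first walk over everything reachable downstream of `source`.
        seen = {source}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for downstream in lineage.get(current, []):
                if downstream in seen:
                    continue
                seen.add(downstream)
                result.setdefault(downstream, set()).update(source_tags)
                queue.append(downstream)
    return result


# Hypothetical lineage: a raw table feeds a staging table, which feeds Snowflake.
lineage = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["snowflake.analytics.orders"],
}
tags = {"raw.orders": {"pii"}, "staging.orders": {"finance"}}

propagated = propagate_tags(lineage, tags)
print(sorted(propagated["snowflake.analytics.orders"]))  # ['finance', 'pii']
```

In a real deployment, the final step would be writing the propagated tags to the warehouse, which this sketch deliberately leaves out.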

Improved Search

We’ve done a ton of work to make our search functionality more reliable, detailed, responsive, and intuitive — across a large number of entities.

With this release, DataHub Search is going to feel much snappier while being much more robust at high scale.

Ingestion Improvements

We’ve improved lineage extraction, entity descriptions, memory usage, and more, with a focus on BigQuery and Power BI.

Community Case Study: Jumio’s Journey with DataHub

During the March Town Hall, Ray Suliteanu from the Jumio team (https://jumio.com/) shared how they are using DataHub for easier discovery, compliance, and improved productivity in complex data management. DataHub lets them work with distributed datasets, bringing in data from all over with its push-pull model (as opposed to having a central entity fetch all the data).

Check out this video to learn how DataHub is helping Jumio with improved access control for governance and better search that goes beyond datasets to models, features, jobs, and other metadata.

How Jumio is leveraging DataHub for data discovery, governance, and improved data management and productivity

Product Updates: Improving and Simplifying DataHub

DataHub 201: Data Debugging

We’re working on a few features that will enable not just layer-to-layer debugging on DataHub, but also aid in end-to-end debugging of data issues and flagging data quality issues across the lineage graph.

Here’s John telling you everything you need to know about how data debugging in DataHub just got a LOT easier.

Sneak Peek: Upcoming Improvements to Search

During the March Town Hall, we gave a preview of upcoming changes to the search experience. We’re bringing some exciting improvements to reduce the number of steps required to search for, and ultimately find, the data that matters most.

Check out this video where Brittanie Jakubowich from the Acryl Data Team breaks down what changes are to come:

Getting Started with DataHub’s APIs

We hear time and time again from our Community Members that DataHub’s APIs are game changers for how they manage metadata within their organization. We want to make sure that a wide range of folks understand the power of our APIs and have a clear set of examples of how to go about using them.

Hyejin Yoon (Developer Relations Engineer, DataHub) has been putting in some serious work to ramp up our API guides. Check out this video to learn all about them:

DataHub Integrations: Documentation Support

We now have a dedicated go-to page for all DataHub integrations. This will help you understand — at a glance — all the different systems that DataHub integrates with. You can search across connection types (push-pull), features, and platform types.

One-stop-shop view of DataHub’s supported integrations

Community Contributions and Shoutouts

This is my absolute favorite part of my job: showing well-deserved appreciation to folks in the DataHub Community who are going above & beyond to contribute back to the project.

A MASSIVE shoutout to our March Project contributors — we had 20 first-time contributors, which is simply outstanding!

DataHub Project Contributors - March 2023

In March, we had _a ton_ of Community Members step up to help others out in Slack; let’s show them some love! HUGE shoutout to these folks:

Supporting the DataHub Community

Write for the DataHub Community Blog!

If you have something to share with the community about how you’re using DataHub, challenges you’re solving, data governance, or other exciting data discovery projects, why not contribute to the DataHub Blog Community Program?

For inspiration, check out this month’s entry by community member Ada Draginda, who shares how Notion is using DataHub to automate propagation between DataHub and dbt.

Check it out: Automating Propagations with DataHub and DataHub-Tools.

I continue to be mind-blown (and thrilled!) by the velocity of this amazing community and can’t wait to see what the next quarter holds for us!
