DataHub Community Updates: The Feb’23 Low-Down

Metadata Management

Open Source

Data Engineering

Project Updates

Data

Maggie Hays

Mar 7, 2023

Hello, DataHub Enthusiasts!

It’s March already (how?!), which means it’s time to share what the DataHub Community has been up to in February.

There’s a LOT that the fantastic Community has done to make DataHub better, easier to use, and all-around amazing.

Let’s get right to it.

Growing Strong: DataHub Community Updates

Our Slack Community now has 6400+ outstanding members, with over 400 folks joining us in February. We continue to have nearly 1,000 active users every week, and our emoji game is still going strong (up 200+ from last month! :D)

DataHub Community Snapshot - February 2023

If you’re new on your DataHub journey, remember to join our Slack! The Community is incredibly supportive, kind, and eager to help 😊

Upcoming Events in March

DataHub<>BigQuery Workshop

For those just getting started with DataHub, we’re hosting an interactive session on BigQuery ingestion. We have limited seats, and it’s happening this Thursday, so remember to RSVP here.

Workshop: Getting started with DataHub BigQuery Ingestion

Meet us at Data Council in Austin, March 28–30

The Core DataHub Team is stoked to head to Austin this month for Data Council. We’re packing a ton of content into these three days. Mark your calendar for the following:

Lastly, Acryl Data and Astronomer are excited to invite you to an evening of tacos, beverages, and fun during Data Council. Join us at Lazarus Brewing on Wednesday, March 29th, at 5:30 pm for a night filled with networking opportunities, delicious food, and great company.

RSVP here, and we will send Uber codes to all attendees for transportation to the event.

Pssst: If you still need to sign up for Data Council, use the code DataHub20 to get 20% off your ticket.

Community Case Study: Hurb’s Journey with DataHub

We heard from Patrick Braz & the Hurb Team during the February Town Hall about their DataHub adoption journey. Patrick shared how they use DataHub for easier discovery, security, and better decision-making amid fast-growing assets, new solutions, and increasing complexity.

Check it out here:

DataHub + Herb = Data Hurb?

What’s Happening: Roadmap Updates

We launched v0.10.0 in early February, a massive release for us with many additions and improvements. Here’s a quick snapshot of what we shipped:

Project Updates: v0.10.0 Highlights

I’m most excited about:

  1. Time-aware lineage, which is SUCH a great addition to an already amazing feature.
  2. Schema coverage: We’ve added support for JSON schema in this release, and we’ll continue to improve our coverage of all kinds of schemas in the ecosystem.

On the roadmap front, here’s how we’re doing. Things are in motion, and I can’t wait to see how they land :)

Progress toward our Q1'23 Roadmap as of Feb’23

Product Updates: Improving and Simplifying DataHub

Improvements to DataHub Search

We’ve improved DataHub’s search experience by implementing scrolling search. We moved away from an Elasticsearch API approach that prevented you from searching past 10,000 results. As a bonus, this also fixes our long-standing CSV download issue for large search results.

This update makes our Search feature so much more effective, intuitive, and valuable, especially when searching across lineage — where there’s much more complexity within the metadata graph.

Ryan Holstien shows off DataHub’s new and improved Search experience
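
If you’re curious why a cursor-style scroll lifts the 10,000-result ceiling, here’s a minimal sketch in Python. It does not call DataHub or Elasticsearch; it simulates an index in memory, and it is not how DataHub implements this internally. `MAX_RESULT_WINDOW` mirrors Elasticsearch’s default `index.max_result_window` of 10,000, and the function names (`offset_page`, `scroll_page`, `scroll_all`) are hypothetical, chosen only for illustration.

```python
import bisect
from typing import Iterator, List, Optional, Tuple

# Mirrors Elasticsearch's default index.max_result_window (an assumption
# about the cluster settings; the value is configurable per index).
MAX_RESULT_WINDOW = 10_000


def offset_page(docs: List[int], start: int, size: int) -> List[int]:
    """Offset-style paging: refuses to read past the result window."""
    if start + size > MAX_RESULT_WINDOW:
        raise ValueError("result window exceeded")
    return docs[start:start + size]


def scroll_page(
    docs: List[int], after: Optional[int], size: int
) -> Tuple[List[int], Optional[int]]:
    """Cursor-style paging over docs sorted by a unique key.

    Returns up to `size` docs whose key is greater than `after`,
    plus the cursor to pass in for the next page.
    """
    lo = 0 if after is None else bisect.bisect_right(docs, after)
    page = docs[lo:lo + size]
    next_cursor = page[-1] if page else None
    return page, next_cursor


def scroll_all(docs: List[int], size: int = 1_000) -> Iterator[int]:
    """Walk every result by following the cursor until a page comes back empty."""
    after: Optional[int] = None
    while True:
        page, after = scroll_page(docs, after, size)
        if not page:
            return
        yield from page
```

With 25,000 simulated results, `offset_page(docs, 10_000, 10)` raises, while `scroll_all(docs)` walks the full set because each request only needs to know where the previous page ended. The same idea is what makes a full CSV export of a large result set possible.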

Redesign of the Queries Tab

You asked, we delivered. The Community needed a way to view, document, and share other commonly used SQL queries (say, those not executed by a daily job but run by analysts in an ad hoc fashion). We needed a curated query set — alongside top or recently executed queries.

Our redesigned Queries tab on the dataset page helps you

  • Discover more queries
  • Add an element of human curation, and thus
  • Pack in more context for better discovery.

John shares how DataHub’s redesigned Queries tab can help with better discovery and context

Unified Redshift Ingestion

Previously, our Redshift ingestion source required configuring two connectors: one for metadata & lineage ingestion and another to extract top queries & usage. We have unified the Redshift ingestion — and brought along a few more changes you’ll like.

Here’s Tamás telling you everything you need to know about this.

How DataHub’s unified ingestion for Redshift works

Subscriptions and Notifications

One of our most requested features was the ability to subscribe to entities and get notifications about changes.

Ability to subscribe to an entity to receive notifications when something changes

As we dug into this through community interactions and surveys, we saw the following themes and requirements emerge:

  • Getting ahead of breaking changes (breaking changes may come from unknown upstream entities)
  • Understanding how new governance classifications drive compliance
  • Optimizing for self-service
  • Notifying data consumers on Slack and Email.

Here’s how this exciting project is shaping up:

Brittanie, with a sneak peek into the Subscriptions and Notifications features

Better Together: Community Contributions and Shoutouts

A massive shout out to the wonderful humans working behind the scenes to keep the DataHub project thriving!

DataHub Project Contributors - February 2023

You’re invited to contribute to the DataHub Community blog!

If you have something to share with the Community — about data governance, exciting data discovery projects, or DataHub deployments — contribute to the DataHub Blog Community Program.

Check out this month’s entry by community member Venkata Krishnan: Starting a Data Governance Journey.

Inspired by our conversations with the Community, we’ve compiled some tips on rolling out DataHub within your organization. Here’s Paul sharing 5 Tips for Rolling out a Data Catalog from the DataHub Community.

Paul shares 5 tried and trusted tips for rolling out DataHub to the rest of your organization.

That’s it from me for now, folks!

See you in Slack 🙂

