
DataHub Summer’22 Rundown☀️

Metadata

Data Engineering

Project Updates

Open Source

Looker

Maggie Hays

Sep 7, 2022

👋 Hello, DataHub Enthusiasts!

It’s hard to believe summer is already winding down; I hope each of you found some time to soak up the sun and recharge. Ready to hear what the DataHub Community has been up to these last couple of months?

Let’s get into it!

The DataHub Community continues to THRIVE!

Every time I blink, there’s a new member in the DataHub Community — we’ve welcomed over 700 people to our vibrant Slack Community within the last two months, and there are no signs of things slowing down!

DataHub Community at a Glance

We continue to see more and more engagement in our Monthly Town Halls, and we are always thrilled to welcome new contributors to the project!

Join us on Slack · RSVP to our Next Town Hall · Follow us on Twitter

NEW! Bulk Edit Metadata via the DataHub UI

DataHub is making it easier than ever to keep your metadata up to date. Beginning with v0.8.43, you can add or remove Owners, Glossary Terms, Tags, and Domains for multiple entities, and update their Deprecation Status, all with a few clicks:

Example workflow of changing Deprecate Status and adding Owners to multiple entities at once
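
The UI is the easiest path, but if you ever want to script a similar bulk update, here is a minimal sketch using the DataHub Python emitter. The dataset URNs and the owner below are placeholders, and note that emitting an Ownership aspect this way replaces an entity's existing owners rather than appending to them:

```python
# Minimal sketch: apply the same owner to several datasets with the DataHub Python SDK.
# Assumes `pip install acryl-datahub` and a GMS endpoint at http://localhost:8080.
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import (
    OwnerClass,
    OwnershipClass,
    OwnershipTypeClass,
)

emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

# Placeholder URNs -- replace with the entities you want to update.
dataset_urns = [
    "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.orders,PROD)",
    "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.customers,PROD)",
]

ownership = OwnershipClass(
    owners=[
        OwnerClass(
            owner="urn:li:corpuser:jdoe",  # placeholder owner
            type=OwnershipTypeClass.TECHNICAL_OWNER,
        )
    ]
)

for urn in dataset_urns:
    # Note: this writes the full Ownership aspect, replacing any existing owners.
    emitter.emit(MetadataChangeProposalWrapper(entityUrn=urn, aspect=ownership))
```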

Want to learn more? Watch John Joyce’s walkthrough from the July Town Hall here:


Looker Integration: Improved Search Experience

We’ve heard feedback from the Community that end-users want an easier way to search for Looker Looks and Dashboards that contain a specific measure or dimension.

So we built just that! Starting with v0.8.44, DataHub indexes every measure and dimension referenced in Looks and Dashboards, so searching for a specific field surfaces the Looks and Dashboards that use it near the top of your results.

Hear all about it from Gabe Lyons during the August Town Hall:


New to DataHub? Get started with our Docs Site!

We know there’s a lot to learn when it comes to getting ramped up on DataHub, so we want to ensure that folks have the resources they need to get up and running as quickly as possible.

We recently rolled out some significant improvements to the DataHub Docs Site to make it easier and more intuitive for DataHub Developers and End-Users alike to navigate our resources.

The DataHub Docs Site has a new look!

Keep an eye on that space — we’ll be rolling out more user guides and tutorials in the upcoming months!

BIG Improvements to UI-Based Ingestion

We’re on a mission to make ingesting metadata into DataHub as easy as possible. Starting with v0.8.42, you can configure metadata ingestion for Snowflake, BigQuery, Looker, and Tableau directly in the UI with an easy-to-follow form.
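
Behind the form, ingestion is still driven by a recipe, and the same configuration can be run programmatically. Here is a rough sketch of a Snowflake recipe executed through the Python ingestion Pipeline; the connection values are placeholders, and config field names can differ between connector versions, so check the Snowflake source docs for your release:

```python
# Rough sketch of running a Snowflake ingestion recipe programmatically.
# The UI builds an equivalent recipe behind the form; connection values here
# are placeholders and field names may differ by connector version.
from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create(
    {
        "source": {
            "type": "snowflake",
            "config": {
                "account_id": "my_account",        # placeholder
                "username": "datahub_user",        # placeholder
                "password": "${SNOWFLAKE_PASSWORD}",
                "warehouse": "COMPUTE_WH",
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }
)

pipeline.run()
pipeline.raise_from_status()
```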

We know that many teams use a combination of the DataHub CLI and UI to ingest metadata, so it’s critical to provide a unified view of run history and outcomes. With this in mind, we rolled out some massive improvements to UI-based Ingestion in v0.8.44, including:

  • view live logs during job execution
  • view ingestion run summary (i.e., number of entities ingested)
  • rollback functionality in case something goes astray
  • unified overview of UI- and CLI-based ingestion runs

Want to see it in action? Check out Chris Collins’ demo from the August Town Hall:

Metadata Ingestion Improvements, Galore!

The DataHub Community is hard at work ensuring our existing Ingestion Sources are performant and extract as much valuable metadata as possible. Here are some highlights from v0.8.41 through v0.8.44:

Extracting New Metadata Elements

  • The Chart entity now supports chartUsageStatistics, which feeds into search ranking so the most heavily used charts surface toward the top of results
  • dbt ingestion supports auto-extracting owner from the meta block

Improvements to Existing Sources

  • Stateful Ingestion now supported for the Glue Connector
  • An improved Snowflake Connector is now available; we expect it to reduce ingestion run time and configuration complexity.
  • Configure your BigQuery Connector to profile only a subset of tables (a sketch of the relevant config follows below)
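
As an illustration of profiling only a subset of tables, here is a hedged sketch of the source section of a BigQuery recipe. The project and patterns are placeholders, and the profile_pattern field names follow the BigQuery source docs, so verify them against the connector version you are running:

```python
# Hedged sketch: restrict BigQuery profiling to a subset of tables.
# This fragment could be dropped into the "source" section of a recipe
# like the Snowflake example above; field names are illustrative.
bigquery_source_config = {
    "type": "bigquery",
    "config": {
        "project_id": "my-gcp-project",        # placeholder
        "profiling": {"enabled": True},
        # Only profile tables in the marts dataset; skip staging tables.
        "profile_pattern": {
            "allow": [r"my-gcp-project\.marts\..*"],
            "deny": [r".*_staging"],
        },
    },
}
```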

Miscellaneous Ingestion Updates


265 people have contributed to DataHub to date

Between July and August 2022, we merged 366 pull requests from 50 contributors, 15 of whom contributed for the first time:

@abiwill @aditya-radhakrishnan @aezomz @alexey-kravtsov @amanda-her @Ankit-Keshari-Vituity @anshbansal @chriscollins3456 @daha @de-kwanyoung-son @divyamanohar-stripe @dougpm @gabe-lyons @glinmac @hemanthkotaprolu @hsheth2 @Jiafi @jjoyce0510 @justinas-marozas @koconder @ksrinath @leifker @liyuhui666 @maggiehays @Masterchen09 @mayurinehate @milimetric @mohdsiddique @ms32035 @MugdhaHardikar-GSLab @NavinSharma13 @neojunjie @ngamanda @NoahFournier @pedro93 @remisalmon @rslanka @RyanHolstien @salihcaan @Santhin @sgomezvillamor @shirshanka @skylersinclair @szalai1 @tengis @timcosta @topleft @treff7es @vcs9 @xiphl

We are endlessly grateful for the members of this Community — we wouldn’t be here without you!

One Last Thing —

I caught up with my teammate, John Joyce:

Maggie: Hey, John! It’s been a busy summer for DataHub. What upcoming feature are you most excited about?

John: I’m most excited about Advanced Search, which is currently in the works. I think it will allow end users to begin asking much more interesting questions of their Metadata Graph. A lot of power is going to be unlocked by the flexibility this will offer!

M: Totally agree — Gabe’s demo during August Town Hall was so cool! Next question: what song have you been playing on repeat this summer?

J: “I Need a Forest Fire” by Bon Iver & James Blake!

That’s it for this round; see ya on the Internet :)


Connect with DataHub

Join us on Slack · Sign up for our Newsletter · Follow us on Twitter
