
Humans of DataHub: Mert Tunç

Humans of DataHub

Data Engineering

Open Source

Community

Elizabeth Cohen

Dec 12, 2022


We are excited to share our next installment of Humans of DataHub, featuring Mert Tunç, a Staff Software Engineer at Udemy. Udemy is a global destination for teaching and learning online.

Mert Tunç 👋

Mert shared his favorite things about the DataHub Community, how Column-level Lineage has supercharged his organization, and much more. You don’t want to miss this interview!


How did you first learn about DataHub?

At Udemy, we had some pain points that could be solved with a data catalog tool. While researching open-source data catalog solutions, we discovered DataHub. DataHub stood out as an up-and-coming alternative: its capabilities and vision aligned with what we were looking for, and it has an incredibly engaged community.

What do you enjoy most about the DataHub Community?

The DataHub Community is active: people learn from each other and have the opportunity to contribute directly back to the project. One reason such a healthy community has formed is the guidance given to contributors. The core DataHub team is committed to improving the project and includes Community members through collaboration: they ask for feedback and offer guidance on documentation, code contributions, and answering questions. You can tell they want to encourage participation. It is a unique experience to be part of such a collaborative product development process.

What has DataHub enabled within your organization?

Our current focus is on the lineage side of DataHub. Lineage was a must-have for many critical use cases, and the impact analysis UI makes migrations and refactors much more manageable. At the same time, we use the GraphQL endpoint for more complex queries over the lineage data stored uniformly in the DataHub backend.
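For context, here is a minimal sketch of what such a lineage query can look like against DataHub's GraphQL endpoint, using the searchAcrossLineage query. This is not Udemy's actual setup: the endpoint URL, token, and dataset URN are placeholders you would adjust for your own deployment.

```python
# Minimal sketch: querying DataHub's GraphQL API for upstream lineage.
# The endpoint URL, token, and dataset URN below are placeholders.
import requests

DATAHUB_GRAPHQL = "http://localhost:8080/api/graphql"  # adjust to your deployment
TOKEN = "<personal-access-token>"

# searchAcrossLineage walks the lineage graph from a starting entity,
# here following UPSTREAM edges (the dataset's producers).
query = """
query upstreams($urn: String!) {
  searchAcrossLineage(
    input: { urn: $urn, direction: UPSTREAM, query: "*", start: 0, count: 10 }
  ) {
    searchResults {
      entity {
        urn
        type
      }
    }
  }
}
"""

variables = {"urn": "urn:li:dataset:(urn:li:dataPlatform:hive,db.table,PROD)"}

resp = requests.post(
    DATAHUB_GRAPHQL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"query": query, "variables": variables},
)
resp.raise_for_status()

# Print each upstream entity found in the lineage graph.
for result in resp.json()["data"]["searchAcrossLineage"]["searchResults"]:
    print(result["entity"]["type"], result["entity"]["urn"])
```

Switching the direction to DOWNSTREAM gives the consumer side, which is the kind of query that powers impact analysis before a schema change.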

What are you most excited to see within DataHub?

I am excited about everything! DataHub is on a great path and is taking steps to become a standard for being *the* hub for metadata. That being said, my pick would be the UI/UX improvements, especially the ones making the platform more accessible to those not directly from the data domain, like business users.

What’s your favorite DataHub feature/use case?

Column-level lineage is my favorite feature; it gives my team a uniform view of our data dependencies. Before, reliably finding the producers and consumers of a data entity was a nightmare. Lineage has made essential day-to-day operations, such as schema changes and refactorings (to name a few), much easier and more accurate.

What is your favorite DataHub slack channel, and why?

#announcements. I want to follow news about the tool and updates from DataHub’s core team, and the #announcements channel is how I do that. The team highlights new releases, features, and upcoming metadata-related conferences they’ll attend. They also share the agendas for the monthly DataHub Town Halls.

What advice would you give to someone just joining the DataHub Community?

Don’t hesitate to ask questions or respond to others. Slack is an excellent resource because of the active community; most questions or issues you will face have likely already been discussed. Most importantly, feel free to open some PRs. It is surprisingly easy to create a pull request and get a quick review. My personal favorites are documentation updates and fixes: even a change that feels trivial to you can make a huge difference to the docs!


If you are new to DataHub, just beginning to understand what “metadata” and “modern data stack” mean, or you’ve just read these words for the first time (howdy, friends! 🤠), let us take a moment to introduce ourselves and share a little history:

DataHub is an extensible metadata platform, enabling data discovery, data observability, and federated governance to tame the complexity of increasingly diverse data ecosystems. Originally built at LinkedIn, DataHub was open-sourced under the Apache 2.0 License in 2020. It now has a thriving community with over 5.4k members (and growing!) and 300+ code contributors, and many companies are actively using DataHub in production.

We believe that data-driven organizations need a reimagined developer-friendly data catalog to tackle the diversity and scale of the modern data stack. Our goal is to provide the most reliable and trusted enterprise data graph to empower data teams with best-in-class search and discovery and enable continuous data quality based on DataOps practices. This allows central data teams to scale their effectiveness and companies to maximize the value they derive from data.

Want to learn more about DataHub and how to join our community? Visit https://datahubproject.io and say hello on Slack. 👋

