Acryl DataHub

50+ integrations and counting

Airflow

Airflow is an open-source data orchestration tool used for scheduling, monitoring, and managing complex data pipelines.

Athena

Athena is a serverless interactive query service that enables users to analyze data in Amazon S3 using standard SQL.

Azure AD

Azure AD is a cloud-based identity and access management tool that provides secure authentication and authorization for users and applications.

BigQuery

BigQuery is a cloud-based data warehousing and analytics tool that allows users to store, query, and analyze large datasets quickly and efficiently.

Business Glossary

A source provided by DataHub for ingesting glossary metadata that provides a comprehensive list of business terms and definitions used within an organization.

ClickHouse

ClickHouse is an open-source column-oriented database management system designed for high-performance data processing and analytics.

CSV

An ingestion source provided by DataHub for enriching metadata from CSV files.

Databricks

Databricks is a cloud-based data processing and analytics platform that enables data scientists and engineers to collaborate and build data-driven applications.

dbt

dbt is a data transformation tool that enables analysts and engineers to transform data in their warehouses through a modular, SQL-based approach.

Delta Lake

Delta Lake is an open-source data lake storage layer that provides ACID transactions, schema enforcement, and data versioning for big data workloads.

Demo Data

Demo Data is a data tool that provides sample data sets for demonstration and testing purposes.

Druid

Druid is an open-source data store designed for real-time analytics on large datasets.

Elasticsearch

Elasticsearch is a distributed, open-source search and analytics engine designed for handling large volumes of data.

Feast

Feast is an open-source feature store that enables teams to manage, store, and discover features for machine learning applications.

File

An ingestion source provided by DataHub for single files.

File Based Lineage

A source provided by DataHub for ingesting lineage metadata that is defined in a file.

Glue

Glue is a data integration service that allows users to extract, transform, and load data from various sources into a data warehouse.

Great Expectations

Great Expectations is an open-source data validation and testing tool that helps data teams maintain data quality and integrity.

Hive

Hive is a data warehousing tool that facilitates querying and managing large datasets stored in Hadoop Distributed File System (HDFS).

Iceberg

Iceberg is an open-source table format for managing and querying large-scale analytic data sets across distributed query engines.

JSON Schemas

JSON Schema is a specification for defining the structure, format, and validation rules of JSON data.

Kafka

Kafka is a distributed streaming platform that allows for the processing and storage of large amounts of data in real time.

Kafka Connect

Kafka Connect is an open-source data integration tool that enables the transfer of data between Apache Kafka and other data systems.

LDAP

LDAP (Lightweight Directory Access Protocol) is a protocol for accessing and managing distributed directory information services over an IP network.

Looker

Looker is a business intelligence and data analytics platform that allows users to explore, analyze, and share data insights in real-time.

MariaDB

MariaDB is an open-source relational database management system that is a fork of MySQL.

Metabase

Metabase is an open-source business intelligence and data visualization tool that allows users to easily query and visualize their data.

Microsoft SQL Server

Microsoft SQL Server is a relational database management system designed to store, manage, and retrieve data efficiently and securely.

Mode

Mode is a cloud-based data analysis and visualization platform that enables businesses to explore, analyze, and share data in a collaborative environment.

MongoDB

MongoDB is a NoSQL database that stores data in flexible, JSON-like documents, making it easy to store and retrieve data for modern applications.

MySQL

MySQL is an open-source relational database management system that allows users to store, organize, and retrieve data efficiently.

NiFi

NiFi is a data integration tool that allows users to automate the flow of data between systems and applications.

Okta

Okta is a cloud-based identity and access management tool that enables secure and seamless access to applications and data across multiple devices and platforms.

OpenAPI

OpenAPI is a specification for building and documenting RESTful APIs.

Oracle

Oracle is a relational database management system that provides a comprehensive and integrated platform for managing and analyzing large amounts of data.

Postgres

Postgres is an open-source relational database management system that provides a powerful tool for storing, managing, and analyzing large amounts of data.

PowerBI

PowerBI is a business analytics service by Microsoft that provides interactive visualizations and business intelligence capabilities with an interface simple enough for end users to create their own reports and dashboards.

Presto

Presto is an open-source distributed SQL query engine designed for fast and interactive analytics on large-scale data sets.

Presto on Hive

Presto on Hive allows users to query and analyze large datasets stored in Hive using SQL.

Protobuf Schemas

Protobuf (Protocol Buffers) schemas define structured data so that it can be serialized in a compact and efficient manner.

Pulsar

Pulsar is a real-time data processing and messaging platform that enables high-performance data streaming and processing.

Redash

Redash is a data visualization and collaboration platform that allows users to connect and query multiple data sources and create interactive dashboards and visualizations.

Redshift

Redshift is a cloud-based data warehousing tool that allows users to store and analyze large amounts of data in a scalable and cost-effective manner.

S3 Data Lake

S3 Data Lake is a cloud-based data storage and management tool that allows users to store, manage, and analyze large amounts of data in a scalable and cost-effective manner.

SageMaker

SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.

Salesforce

Salesforce is a cloud-based customer relationship management (CRM) platform that helps businesses manage their sales, marketing, and customer service activities.

SAP HANA

SAP HANA is an in-memory data platform that enables businesses to process large volumes of data in real time.

Slack

Send notifications to Slack channels on updates to entities in DataHub.

Snowflake

Snowflake is a cloud-based data warehousing platform that allows users to store, manage, and analyze large amounts of structured and semi-structured data.

Spark

Spark is a data processing tool that enables fast and efficient processing of large-scale data sets using distributed computing.

SQLAlchemy

SQLAlchemy is a Python SQL toolkit that provides a high-level API for connecting to relational databases and performing SQL operations.

Superset

Superset is an open-source data exploration and visualization platform that allows users to create interactive dashboards and perform ad-hoc analysis on various data sources.

Tableau

Tableau is a data visualization and business intelligence tool that helps users analyze and present data in a visually appealing and interactive way.

Microsoft Teams

Send notifications to Teams channels on updates to entities in DataHub.

Trino

Trino is an open-source distributed SQL query engine designed to query large-scale data processing systems, including Hadoop, Cassandra, and relational databases.

Vertica

Vertica is a high-performance, column-oriented, relational database management system designed for large-scale data warehousing and analytics.

© 2024 Acryl Data