Catalogs

Understanding Catalogs in e6data

Most analytical data is stored in cloud object stores such as Amazon S3, GCS, or Azure Blob Storage. However, structural metadata such as table names, schemas, and partitions is managed by metastores such as Hive, AWS Glue, Dataproc Metastore, Unity Catalog, or Apache Polaris.

In e6data, a Catalog connects to these metastores to provide the metadata needed for querying data stored in object stores efficiently.
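To make the split between data and metadata concrete, here is a minimal conceptual sketch in Python. It is not e6data's API or any metastore's client library: the `METASTORE` dictionary, the `TableMetadata` class, the `files_to_scan` helper, and the bucket path are all hypothetical names invented for illustration. The sketch only shows the general idea that a metadata lookup tells the engine which object-store paths to read, so it does not have to list the whole bucket.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """Structural metadata a metastore typically keeps for one table."""
    schema: dict[str, str]   # column name -> column type
    location: str            # root path of the table in the object store
    partitions: list[str] = field(default_factory=list)


# Hypothetical in-memory stand-in for a metastore (not a real service client).
METASTORE = {
    ("sales", "orders"): TableMetadata(
        schema={"order_id": "bigint", "amount": "double", "order_date": "date"},
        location="s3://example-bucket/warehouse/sales/orders/",
        partitions=["order_date=2024-01-01", "order_date=2024-01-02"],
    ),
}


def files_to_scan(database, table, partition=None):
    """Resolve a table name to the object-store prefixes a query must read."""
    meta = METASTORE[(database, table)]
    # Keep every partition, or only the one the query filters on.
    parts = [p for p in meta.partitions if partition is None or p == partition]
    return [f"{meta.location}{p}/" for p in parts]


paths = files_to_scan("sales", "orders", partition="order_date=2024-01-02")
print(paths)
```

In this sketch, filtering on a single partition narrows the scan to one prefix, which is the kind of saving that metadata from a catalog makes possible.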

| Catalog Service | Description |
| --- | --- |
| Hive Metastore | Traditional metastore widely used with Hadoop and Spark. |
| Glue Metastore | AWS-managed metastore with schema versioning and S3 integration. |
| Unity Catalog | Databricks’ unified metadata and governance layer. |
| Apache Polaris | REST-based Iceberg catalog for scalable metadata management. |

Cross-account support requires specific IAM configuration depending on the cloud provider.

