Catalogs
Understanding Catalogs in e6data
Most analytical data is stored in cloud object stores such as Amazon S3, Google Cloud Storage (GCS), or Azure Blob Storage. However, structural metadata such as table names, schemas, and partitions is managed by metastores such as Hive, AWS Glue, Dataproc Metastore, Unity Catalog, or Apache Polaris.
In e6data, a Catalog connects to one of these metastores and supplies the metadata needed to query data in object stores efficiently.
Catalog Service: Description
Hive Metastore: Traditional metastore widely used with Hadoop and Spark.
AWS Glue: AWS-managed metastore with schema versioning and S3 integration.
Unity Catalog: Databricks' unified metadata and governance layer.
Apache Polaris: REST-based Iceberg catalog for scalable metadata management.
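To make the role of a catalog concrete, here is a minimal conceptual sketch in Python. It is not e6data's API or implementation: `TableMetadata`, `InMemoryCatalog`, `files_to_scan`, the table name, and the `s3://example-bucket/...` paths are all hypothetical and invented for illustration. It only shows the general idea that structural metadata (schema and partitions) lets a query engine decide which object store locations to read, rather than scanning everything.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TableMetadata:
    """Structural metadata a metastore might hold for one table (hypothetical shape)."""
    name: str
    schema: dict[str, str]              # column name -> column type
    partition_column: str
    # partition value -> object store prefix holding that partition's files
    partitions: dict[str, str] = field(default_factory=dict)


class InMemoryCatalog:
    """Toy stand-in for a metastore-backed catalog; not an e6data API."""

    def __init__(self) -> None:
        self._tables: dict[str, TableMetadata] = {}

    def register(self, table: TableMetadata) -> None:
        self._tables[table.name] = table

    def files_to_scan(self, table_name: str, partition_value: Optional[str] = None) -> list[str]:
        """Return the object store prefixes a query would need to read."""
        table = self._tables[table_name]
        if partition_value is None:
            # No partition filter: every partition must be read.
            return sorted(table.partitions.values())
        path = table.partitions.get(partition_value)
        return [path] if path is not None else []


catalog = InMemoryCatalog()
catalog.register(
    TableMetadata(
        name="sales.orders",
        schema={"order_id": "bigint", "amount": "double", "order_date": "date"},
        partition_column="order_date",
        partitions={
            "2024-01-01": "s3://example-bucket/orders/order_date=2024-01-01/",
            "2024-01-02": "s3://example-bucket/orders/order_date=2024-01-02/",
        },
    )
)

# With a partition filter, only one prefix needs to be read.
pruned = catalog.files_to_scan("sales.orders", partition_value="2024-01-02")
# Without a filter, all partitions are read.
full = catalog.files_to_scan("sales.orders")
print(pruned)
print(full)
```

In a real deployment the lookup would go to the configured metastore instead of an in-memory dictionary, but the division of labor is the same: the object store holds the data files, and the catalog tells the engine what tables exist and where their files live.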