Delta Lake

e6data supports querying Delta Lake tables through external catalogs such as AWS Glue, Apache Hive, Microsoft Fabric, and Unity Catalog (Databricks). Delta Lake is an open-source table format that provides ACID transactions, scalable metadata handling, and unified streaming and batch data processing.

Connect to catalog

How It Works

Delta Lake tables are managed and exposed through compatible external catalogs. e6data connects to these catalogs using supported interfaces and accesses metadata and table definitions to query the data.

e6data does not provide a native Delta catalog interface. Instead, Delta tables must be registered in a supported catalog service (like AWS Glue or Hive), and then accessed via e6data.

Once the catalog is connected, Delta Lake tables registered within it become available for querying in e6data.

Key Features of Delta Lake

  • Time Travel (read-only): Access historical snapshots of data, provided the corresponding table versions are still retained

  • Deletion Vectors (Merge-on-Read): Efficiently support updates and deletes without rewriting entire files

  • Append-Only Mode (Copy-on-Write): Optimize write performance by appending new data

  • File-Level Statistics: Enable faster queries by letting the engine skip files whose column statistics (such as min/max values) cannot match a query's filters

  • ACID Compliance: Guarantees atomic and consistent reads/writes

  • Schema Evolution: Supports changes like adding new columns

  • Efficient Metadata Handling: Scales to petabyte-scale datasets with fast query planning
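To make the file-level statistics feature more concrete, the following is a minimal Python sketch of min/max file pruning. It is an illustration only, not e6data's implementation: the `FileStats` class, the `prune_files` function, and the file names are all hypothetical, and real Delta statistics cover multiple columns and data types.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file statistics for one column (not a real Delta or e6data type)."""
    path: str
    min_value: int
    max_value: int


def prune_files(files, lower, upper):
    """Keep only files whose [min_value, max_value] range overlaps [lower, upper]."""
    return [f for f in files if f.max_value >= lower and f.min_value <= upper]


# Illustrative data files with made-up value ranges for an `id` column.
files = [
    FileStats("part-0000.parquet", min_value=1, max_value=100),
    FileStats("part-0001.parquet", min_value=101, max_value=200),
    FileStats("part-0002.parquet", min_value=201, max_value=300),
]

# A filter such as `WHERE id BETWEEN 150 AND 220` can only match rows in
# files whose range overlaps [150, 220], so part-0000 never needs to be read.
selected = prune_files(files, lower=150, upper=220)
print([f.path for f in selected])
```

In this sketch only two of the three files survive pruning; the more selective the filter and the better the data is clustered, the more files can be skipped.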

What Is Supported

  • Reading Delta Lake tables via:

    • AWS Glue

    • Apache Hive

    • Microsoft Fabric

    • Unity Catalog (Databricks)

  • Support for partitioned and non-partitioned Delta tables (based on catalog configuration).

  • Access to Delta metadata: schemas, partitions, and snapshots.

  • Catalog integration using interfaces that support Delta tables.
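As a rough illustration of what a snapshot means in the list above: Delta Lake's transaction log records files being added to and removed from a table, and a snapshot at a given version is the set of files still live after replaying the log up to that version. The Python sketch below models this with an in-memory list. The `snapshot_at` function and the commit history are hypothetical simplifications, not e6data's reader and not the full Delta log format.

```python
def snapshot_at(commits, version):
    """Return the set of live data files after replaying commits 0..version."""
    live = set()
    for actions in commits[: version + 1]:
        for action in actions:
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live


# Hypothetical commit history: one list of actions per table version.
commits = [
    # version 0: initial write
    [{"add": {"path": "part-0000.parquet"}}],
    # version 1: append
    [{"add": {"path": "part-0001.parquet"}}],
    # version 2: rewrite that replaces part-0000 with part-0002
    [
        {"remove": {"path": "part-0000.parquet"}},
        {"add": {"path": "part-0002.parquet"}},
    ],
]

print(sorted(snapshot_at(commits, 1)))  # files visible at version 1
print(sorted(snapshot_at(commits, 2)))  # files visible at version 2
```

Reading an older version is then a matter of stopping the replay earlier, which is why time travel only works while the older log entries and data files are still available.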

What Is Not Supported (Planned for Future)

  • Writing to Delta Lake tables from e6data

  • Creating Delta tables via the e6data UI

  • DML operations (e.g., MERGE, UPDATE, DELETE)

  • Full metadata synchronization when Delta tables are written from multiple engines.

Example queries (the catalog, schema, and table names below are placeholders):

-- List all tables in a namespace
SHOW TABLES FROM delta_catalog.analytics;

-- Query from a Delta table
SELECT *
FROM delta_catalog.analytics.user_activity
WHERE event_type = 'purchase';
