Cross-Catalog & Cross-Schema Querying

  • This feature enables you to combine and analyze data from different schemas and catalogs, providing a comprehensive view of your data landscape.

  • To perform cross-schema queries, you need to fully qualify the path by providing the catalog name, schema name, and table name.

Usage Examples:

  1. Cross-Schema Query: This query retrieves columns column1 and column2 from the table table_name within the specified schema_name and catalog_name.

SELECT column1, column2
FROM catalog_name.schema_name.table_name;
  1. Cross-Catalog Query: This query retrieves columns column1 and column2 from the table table_name within the specified schema_name and catalog_name.

SELECT column1, column2
FROM catalog_name.schema_name.table_name;

Limitations:

  • Slight Increase in Parsing Time: Due to the additional complexity introduced by cross-schema and cross-catalog querying, there may be a slight increase in parsing time for initial queries. However, subsequent queries benefit from query plan caching and performance optimizations.

  • Fully Qualified Path Requirement: When performing cross-schema or cross-catalog queries, it is essential to provide the entire path, including the catalog name, schema name, and table name.

    • Examples:

      • Hive Catalog: <catalog_name>.<db_name>.<table_name> catalog_name - name of catalog specified during catalog creation in e6data

      • Glue Catalog: <catalog_name>.<db_name>.<table_name> catalog_name - name of catalog specified during catalog creation in e6data

      • Unity Metastore: <unity_catalog_name>.<db_name>.<table_name> unity_catalog_name - name of catalog which is created under Unity in Databricks

  • Same Account and Region for Cross-Catalog Queries: To query tables across catalogs, the catalogs must reside within the same account and region. This limitation ensures data security and optimal performance.

Last updated