LogoLogo
  • Welcome to e6data
  • Introduction to e6data
    • Concepts
    • Architecture
      • e6data in VPC Deployment Model
      • Connect to e6data serverless compute
  • Get Started
  • Sign Up
  • Setup
    • AWS Setup
      • In VPC Deployment (AWS)
        • Prerequisite Infrastructure
        • Infrastructure & Permissions for e6data
        • Setup Kubernetes Components
        • Setup using Terraform in AWS
          • Update a AWS Terraform for your Workspace
        • AWS PrivateLink and e6data
        • VPC Peering | e6data on AWS
      • Connect to e6data serverless compute (AWS)
        • Workspace Creation
        • Catalog Creation
          • Glue Metastore
          • Hive Metastore
          • Unity Catalog
        • Cluster Creation
    • GCP Setup
      • In VPC Deployment (GCP)
        • Prerequisite Infrastructure
        • Infrastructure & Permissions for e6data
        • Setup Kubernetes Components
        • Setup using Terraform in GCP
        • Update a GCP Terraform for your Workspace
      • Connect to e6data serverless compute (GCP)
    • Azure Setup
      • Prerequisite Infrastructure
      • Infrastructure & Permissions for e6data
      • Setup Kubernetes Components
      • Setup using Terraform in AZURE
        • Update a AZURE Terraform for your Workspace
  • Workspaces
    • Create Workspaces
    • Enable/Disable Workspaces
    • Update a Workspace
    • Delete a Workspace
  • Catalogs
    • Create Catalogs
      • Hive Metastore
        • Connect to a Hive Metastore
        • Edit a Hive Metastore Connection
        • Delete a Hive Metastore Connection
      • Glue Metastore
        • Connect to a Glue Metastore
        • Edit a Glue Metastore Connection
        • Delete a Glue Metastore Connection
      • Unity Catalog
        • Connect to Unity Catalog
        • Edit Unity Catalog
        • Delete Unity Catalog
      • Cross-account Catalog Access
        • Configure Cross-account Catalog to Access AWS Hive Metastore
        • Configure Cross-account Catalog to Access Unity Catalog
        • Configure Cross-account Catalog to Access AWS Glue
        • Configure Cross-account Catalog to Access GCP Hive Metastore
    • Manage Catalogs
    • Privileges
      • Access Control
      • Column Masking
      • Row Filter
  • Clusters
    • Edit & Delete Clusters
    • Suspend & Resume Clusters
    • Cluster Size
    • Load Based Sizing
    • Auto Suspension
    • Query Timeout
    • Monitoring
    • Connection Info
  • Pools
    • Delete Pools
  • Query Editor
    • Editor Pane
    • Results Pane
    • Schema Explorer
    • Data Preview
  • Notebook
    • Editor Pane
    • Results Pane
    • Schema Explorer
    • Data Preview
  • Query History
    • Query Count API
  • Connectivity
    • IP Sets
    • Endpoints
    • Cloud Resources
    • Network Firewall
  • Access Control
    • Users
    • Groups
    • Roles
      • Permissions
      • Policies
    • Single Sign-On (SSO)
      • AWS SSO
      • Okta
      • Microsoft My Apps-SSO
      • Icons for IdP
    • Service Accounts
    • Multi-Factor Authentication (Beta)
  • Usage and Cost Management
  • Audit Log
  • User Settings
    • Profile
    • Personal Access Tokens (PAT)
  • Advanced Features
    • Cross-Catalog & Cross-Schema Querying
  • Supported Data Types
  • SQL Command Reference
    • Query Syntax
      • General functions
    • Aggregate Functions
    • Mathematical Functions & Operators
      • Arithematic Operators
      • Rounding and Truncation Functions
      • Exponential and Root Functions
      • Trigonometric Functions
      • Logarithmic Functions
    • String Functions
    • Date-Time Functions
      • Constant Functions
      • Conversion Functions
      • Date Truncate Function
      • Addition and Subtraction Functions
      • Extraction Functions
      • Format Functions
      • Timezone Functions
    • Conditional Expressions
    • Conversion Functions
    • Window Functions
    • Comparison Operators & Functions
    • Logical Operators
    • Statistical Functions
    • Bitwise Functions
    • Array Functions
    • Regular Expression Functions
    • Generate Functions
    • Cardinality Estimation Functions
    • JSON Functions
    • Checksum Functions
    • Unload Function (Copy into)
    • Struct Functions
  • Equivalent Functions & Operators
  • Connectors & Drivers
    • DBeaver
    • DbVisualiser
    • Apache Superset
    • Jupyter Notebook
    • Tableau Cloud
    • Tableau Desktop
    • Power BI
    • Metabase
    • Zeppelin
    • Python Connector
      • Code Samples
    • JDBC Driver
      • Code Samples
      • API Support
    • Configure Cluster Ingress
      • ALB Ingress in Kubernetes
      • GCE Ingress in Kubernetes
      • Ingress-Nginx in Kubernetes
  • Security & Trust
    • Best Practices
      • AWS Best Practices
    • Features & Responsibilities Matrix
    • Data Protection Addendum(DPA)
  • Tutorials and Best Practices
    • How to configure HIVE metastore if you don't have one?
    • How-To Videos
  • Known Limitations
    • SQL Limitations
    • Other Limitations
    • Restart Triggers
    • Cloud Provider Limitations
  • Error Codes
    • General Errors
    • User Account Errors
    • Workspace Errors
    • Catalog Errors
    • Cluster Errors
    • Data Governance Errors
    • Query History Errors
    • Query Editor Errors
    • Pool Errors
    • Connectivity Errors
  • Terms & Condition
  • Privacy Policy
    • Cookie Policy
  • FAQs
    • Workspace Setup
    • Security
    • Catalog Privileges
  • Services Utilised for e6data Deployment
    • AWS supported regions
    • GCP supported regions
    • AZURE supported regions
  • Release Notes & Updates
    • 6th Sept 2024
    • 6th June 2024
    • 18th April 2024
    • 9th April 2024
    • 30th March 2024
    • 16th March 2024
    • 14th March 2024
    • 12th March 2024
    • 2nd March 2024
    • 10th February 2024
    • 3rd February 2024
    • 17th January 2024
    • 9th January 2024
    • 3rd January 2024
    • 18th December 2023
    • 12th December 2023
    • 9th December 2023
    • 4th December 2023
    • 27th November 2023
    • 8th September 2023
    • 4th September 2023
    • 26th August 2023
    • 21st August 2023
    • 19th July 2023
    • 23rd May 2023
    • 5th May 2023
    • 28th April 2023
    • 19th April 2023
    • 15th April 2023
    • 10th April 2023
    • 30th March 2023
Powered by GitBook
On this page
  • Summary
  • Usage Notes
  • Platform Release Notes
  • WorkspaceAdmin Role
  • Multi-Catalog Support
  • Engine Release Notes
  • SQL Optimizations
  • Bug Fixes & Improvements
  • Know Limitations
  1. Release Notes & Updates

23rd May 2023

Covers the introduction of the Workspace Admin role, multi-catalog support, SQL optimizations, bug fixes, and known limitations.

Previous19th July 2023Next5th May 2023

Last updated 1 month ago

Summary

The 23rd May 2023 release of e6data includes the following features & enhancements:

Usage Notes

To get access to all features from this release:

  • Suspend and resume active Clusters

Platform Release Notes

WorkspaceAdmin Role

A new default role named WorkspaceAdmin has been added. This role allows easier creation of users who should only have access to manage workspaces and, view the catalogs & clusters associated with a workspace.

Multi-Catalog Support

Users can now select multiple catalogs to be connected to a cluster when creating/editing a cluster, instead of adding them individually.

Engine Release Notes

SQL Optimizations

Support for the collect_list Function

e6data now supports the collect_list function. This powerful function enhances your data analysis capabilities, allowing more effective aggregation and windowing of data.

  • The collect_list function enables you to create an ordered list of values from a specified column within a group or window.

  • It is particularly useful when you need to gather and process multiple values associated with a particular grouping key or window frame.

Cross-Schema & Cross-Catalog Querying

We now support cross-schema and cross-catalog querying! This new capability allows you to access and analyze data across different schemas and catalogs within your data lakehouse environment.

  • This feature lets you combine and analyze data from different schemas and catalogs, providing a comprehensive view of your data landscape.

  • To perform cross-schema queries, you need to fully qualify the path by providing the catalog name, schema name, and table name.

    • Hive Catalog: <catalog_name>.<db_name>.<table_name> catalog_name - name of catalog specified during catalog creation in e6data

    • Glue Catalog: <catalog_name>.<db_name>.<table_name> catalog_name - name of catalog specified during catalog creation in e6data

    • Unity Metastore: <unity_catalog_name>.<db_name>.<table_name> unity_catalog_name - name of catalog which is created under Unity in Databricks

Bug Fixes & Improvements

We have added support for querying multi-level partitioned columns in Glue catalogs, which was previously unsupported. This unlocks the full potential of the e6data query planning and optimization when using Glue catalogs.

  • You can now seamlessly query and analyze data stored in multi-level partitioned columns.

  • We utilize the partitioning information to optimize query execution plans. This optimization ensures that queries involving multi-level partitioned columns are processed efficiently, resulting in improved performance.

Know Limitations

  1. Cross-schema & cross-catalog querying have a few limitations, as listed below:

    1. Slight Increase in Parsing Time: Due to the additional complexity introduced by cross-schema and cross-catalog querying, there may be a slight increase in parsing time for initial queries. However, subsequent queries benefit from query plan caching and performance optimizations

    2. Fully Qualified Path Requirement: When performing cross-schema or cross-catalog queries, it is essential to provide the entire path, including the catalog name, schema name, and table name. For example: <catalog-name>.<schema-name>.<table-name>

    3. Same Account and Region for Cross-Catalog Queries: To query tables across catalogs, the catalogs must reside within the same account and region. This limitation ensures data security and optimal performance.

  2. COLLECT_LIST() function has a few limitations, as listed below:

    1. Unsupported Complex Data Types: The COLLECT_LIST function currently does not support complex data type columns as input. It is designed to work with simple data types such as integers, strings, or dates.

    2. Absence of Sorting Capability: The COLLECT_LIST function does not allow for sorting operations, such as using the ORDER BY clause. The resulting list is generated based on the order of appearance of the values in the dataset, without the ability to explicitly control the ordering.

    3. Incompatibility with Distinct Operation: The COLLECT_LIST function cannot be used in conjunction with the DISTINCT operation. It is intended to gather all values within a specified grouping or windowing context, without eliminating duplicates or applying distinctness.

    4. Unsupported Union Operation with Window Functionality: The COLLECT_LIST function cannot be directly combined with the UNION operation when used within a window function. Unioning the results of COLLECT_LIST across different window partitions is currently not supported.

    5. Null Values Omitted from the List: When using the COLLECT_LIST function, null values are automatically excluded from the resulting list. The function gathers and returns non-null values only.

  3. Please take note of the following restrictions when working with complex data types:

    1. Struct Limitations:

      1. Nested Structs: We do not support nested struct types, such as an array of structs. Querying the entire struct or individual leaf fields is supported, but querying intermediate levels within a struct is not.

      2. Array Inside a Struct and Struct inside a struct: Arrays and structs can be used within a struct and are supported.

      3. Standard Operations and Functions: Standard operations and functions can be applied to primitive struct types, allowing you to manipulate and analyze the data. However, operations, like joins, group by, order by, and union all involving structs, are only supported for leaf nodes.

    2. Array Limitations:

      1. Non-Primitive Arrays: Non-primitive arrays, including arrays of arrays, arrays of structs, and nested complex types within arrays, are not supported.

  4. Queries that have the “Nested With clause having a limit” pattern will face parsing issues. Example of a type of that query:

WITH cte2 AS( WITH cte AS
(
           SELECT     d_year,
                      Sum(ss_net_profit) AS ss_net_profit
           FROM       store_sales ss
           INNER JOIN date_dim dd
           ON         ss.ss_sold_date_sk=dd.d_date_sk
           GROUP BY   1
           ORDER BY   1,
                      2)
SELECT   *
FROM     cte
ORDER BY 1 limit 2)
SELECT   d_year,
         avg(ss_net_profit)
FROM     cte2
GROUP BY 1
ORDER BY 1

More information:

More information:

More Information:

Roles
Cross-Catalog & Cross-Schema Querying
Platform Release Notes
WorkspaceAdmin Role
Multi-Catalog Support
Engine Release Notes
SQL Optimizations
Bug Fixes
Known Limitations
COLLECT_LIST()