Introduction to e6data

First, what is e6data?

e6data is a high-performance, cost-effective analytics engine for data analysts, data scientists, operational teams, and executives. It achieves this efficiency through the following principles:

  1. Eliminate the brute-force approach of existing SQL engines

    1. On the fly, e6data creates multiple layers of indirection that enable it to reduce the volume of data operated on by orders of magnitude.

    2. This reduces the network shuffle involved in large queries by more than 95%, thereby eliminating the cause of multiplicative delays.

  2. Foresight-based, vectorized query evaluation & execution

    1. e6data's engine evaluates a query in its entirety and intelligently combines the execution of multiple heavy operations into a single stage.

    2. Independent execution branches are kicked off concurrently for highly parallelized execution.

  3. Performance gains amplify with scale and complexity!

    1. Unlike Spark and Presto/Trino, e6data's consensus-based distribution framework has no coordinator or driver acting as a single point of failure, nor does it trade off fault tolerance against low latency.

    2. Because e6data is Kubernetes-native, it runs in multiple environments and auto-scales efficiently to meet demanding SLAs.

    3. Combined with our obsessive focus on shuffle elimination, this means performance gains grow as data volumes and query complexity increase!
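
This page does not describe how e6data builds its layers of indirection, so the following is only a generic illustration of the idea behind principle 1, not e6data's implementation. It is a minimal Python sketch in which per-chunk min/max statistics let a scan skip most of the data instead of reading every row. The names `Chunk`, `build_chunks`, and `scan_equal` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A block of rows plus lightweight statistics describing it (hypothetical)."""
    min_value: int
    max_value: int
    rows: list


def build_chunks(values, chunk_size):
    """Split values into fixed-size chunks and record each chunk's min and max."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        rows = values[start:start + chunk_size]
        chunks.append(Chunk(min(rows), max(rows), rows))
    return chunks


def scan_equal(chunks, target):
    """Return matching rows and the number of chunks that actually had to be read."""
    matches = []
    chunks_read = 0
    for chunk in chunks:
        # The statistics act as a layer of indirection: a chunk whose range
        # cannot contain the target is skipped without touching its rows.
        if target < chunk.min_value or target > chunk.max_value:
            continue
        chunks_read += 1
        matches.extend(v for v in chunk.rows if v == target)
    return matches, chunks_read


values = list(range(1000))           # illustrative, sorted sample data
chunks = build_chunks(values, 100)   # 10 chunks of 100 rows each
matches, chunks_read = scan_equal(chunks, 742)
print(f"read {chunks_read} of {len(chunks)} chunks, matches={matches}")
```

In this toy setup a brute-force scan would read all ten chunks, while the statistics-guided scan reads only the one whose range can contain the target. Real engines apply the same principle with far richer metadata.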
