Unload Function (COPY INTO)
The "Unload Large Result Set to Object Storage" feature empowers users of the Open Lakehouse Query Engine to efficiently export and store query results in object storage. It is exposed through the COPY INTO command in the e6data Query Editor, enabling users to export large query result sets directly to object storage solutions.
Key Benefits
Efficient Export: Users can export large result sets efficiently, overcoming limitations on the number of records that can be exported.
Seamless Integration: Integration of the COPY INTO command within the query editor streamlines the export process, ensuring a seamless user experience.
Scalability: Facilitates the export of millions of records, catering to real-world scenarios where large datasets must be managed. The COPY INTO command removes the previous export limit of 10,000 records in the e6data Query Editor, so users can handle and export large result sets efficiently.
Object Store Support: Enables users to export query results into object stores like S3, providing flexibility and compatibility with popular storage solutions.
Pre-requisites
Before executing the COPY INTO command, the user must have:
The DataExport role, to execute the query through the e6data Query Editor/Notebook.
The required permissions in the workspace.
COPY INTO Command
Unloads data from a table (or the results of a query) into a single file at a specified storage location.
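A rough sketch of the command's general shape, inferred from the three parameters described below; the exact keyword spelling and clause order may differ in your e6data release, so treat this as illustrative only:

```sql
COPY INTO '<Cloud storage path>'
FROM (<SQL Query>)
MAX_FILE_SIZE <File size in bytes>;
```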
Where:
<Cloud storage path> - S3 file path where the output will be written.
<SQL Query> - SQL query whose results will be stored in cloud storage.
<File size in bytes> - Output file size in bytes.
Here's a sample COPY INTO command:
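For illustration, a hypothetical example that unloads a query result to the S3 file path used elsewhere in this page; the table name, query, and file-size value are placeholders, and the keyword spelling should be verified against your e6data release:

```sql
COPY INTO 's3://sample/unloads/call_center_files.csv'
FROM (SELECT * FROM call_center)
MAX_FILE_SIZE 5368709120;
```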
Exporting Large Result Sets
Execute the desired query using the e6data Query Editor/Notebook.
Upon receiving the query results, identify the large result set that needs to be exported.
Utilize the COPY INTO command to export the result set into the desired object storage solution, such as S3.
Specify the destination within the object storage where the result set should be stored.
Initiate the export process.
Verify the exported result set in the specified object storage location upon successful completion.
For phase 1, output files are generated with the default values mentioned below -
Limitations
Cloud storage paths must be specified as file paths; folder paths are not supported at this time. For instance, a path like s3://sample/unloads/call_center_files.csv is accepted, but a path such as s3://sample/unloads/ is not.
The only supported output file format is CSV.
AWS is the only supported cloud provider.
Currently, only single-file exports are supported; exporting multiple files is not available.
e6data does not apply any compression codec for output CSV files.
Null values are currently replaced with "" in all columns, regardless of column type. (By comparison, Snowflake does not apply this substitution to Integer and Decimal columns.)
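A hypothetical illustration of this behavior (the table, column names, and values are invented for the example):

```sql
-- Suppose a row holds (1, NULL, NULL) for columns (id INT, name VARCHAR, score DECIMAL).
-- The unloaded CSV line would be:
--   1,"",""
-- i.e. the empty string "" is written for NULL in every column,
-- including the numeric columns id and score.
```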
Standard date and timestamp formats supported are:
Date format:
YYYY-MM-DD
Timestamp format:
YYYY-MM-DD HH24:MI:SS