Configure Cross-account Catalog to Access AWS Glue

Last updated 10 months ago

To connect your e6data Workspace to an AWS Glue Metastore and S3 data source in a different AWS account, follow the steps below.

This guide assumes:

  • The e6data Workspace (clusters/compute) is installed in an AWS account referred to as Account A.

  • The AWS Glue metastore & S3 data stores are located in a different AWS account referred to as Account B.

  • Both Account A & Account B are in the same AWS region.

Step 1: Create policies to access Glue & S3 data sources in Account B

  1. Sign in to the Account B AWS Console.

  2. Search for IAM.

  3. Click Policies.

  4. Choose Create policy.

  5. In the Policy editor section, choose the JSON option.

  6. Edit the S3 & Glue Access Policy provided below:

    1. Replace <DATASTORE_BUCKET_ARN> with the ARN of the S3 bucket(s) containing the data.

    2. Replace <GLUE_REGION> with the region that the Glue metastore is located in.

    3. Replace <ACCOUNT_B_ID> with the Account ID of the account containing the S3 bucket & Glue metastore.

S3 & Glue Access Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket",
                "s3:GetObjectVersion",
                "s3:GetObjectTagging"
            ],
            "Resource": [
                "<DATASTORE_BUCKET_ARN>/*",
                "<DATASTORE_BUCKET_ARN>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabase*",
                "glue:GetTable*",
                "glue:GetPartitions"
            ],
            "Resource": [
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:catalog",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:database/*",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:table/*"
            ]
        }
    ]
}
  7. Copy & paste the edited policy to the JSON editor.

  8. Choose Next.

  9. On the Review and create page, type a Policy Name and a Description (optional) for the policy.

  10. Review the Permissions granted by this policy.

  11. Choose Create policy.

    • Make note of the policy name as it will be required further along the process.

  12. Return to IAM Management.

  13. In the navigation pane, choose Roles.

  14. Click Create role.

  15. Under Trusted entity type, choose Custom trust policy.

Custom Trust Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<ENGINE_ROLE_ARN>"
            },
            "Action": "sts:AssumeRole",
            "Condition": {}
        }
    ]
}
  16. Copy & paste the edited policy into the Custom trust policy editor.

  17. Click Next.

  18. Search for the name of the policy created in Steps 4 - 11 and attach this policy to the role.

  19. Click Next: Add tags.

  20. Optional: Add tags to the role, or leave these fields blank. Then click Next: Review.

  21. Enter a Role name that follows your organization's naming convention.

  22. Click Create role.

  23. Copy the ARN of the newly created role.

    • Make note of the ARN as it will be required further along the process.
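
The two JSON documents used in Step 1 can also be generated instead of edited by hand. The following is a minimal Python sketch, assuming the placeholder values are available as plain strings. The function names (`build_access_policy`, `build_trust_policy`) and all sample values are hypothetical and are not part of e6data or AWS tooling; the printed JSON is meant to be pasted into the editors described above.

```python
import json


def build_access_policy(bucket_arn: str, glue_region: str, account_b_id: str) -> dict:
    """Fill in the S3 & Glue Access Policy from Step 1."""
    glue_prefix = f"arn:aws:glue:{glue_region}:{account_b_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:GetObjectVersion",
                    "s3:GetObjectTagging",
                ],
                # Object-level actions need "<bucket>/*"; ListBucket needs the bucket itself.
                "Resource": [f"{bucket_arn}/*", bucket_arn],
            },
            {
                "Effect": "Allow",
                "Action": ["glue:GetDatabase*", "glue:GetTable*", "glue:GetPartitions"],
                "Resource": [
                    f"{glue_prefix}:catalog",
                    f"{glue_prefix}:database/*",
                    f"{glue_prefix}:table/*",
                ],
            },
        ],
    }


def build_trust_policy(engine_role_arn: str) -> dict:
    """Fill in the Custom Trust Policy from Step 1."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": engine_role_arn},
                "Action": "sts:AssumeRole",
                "Condition": {},
            }
        ],
    }


# Hypothetical sample values; substitute your own bucket, region, and account IDs.
access_policy = build_access_policy(
    "arn:aws:s3:::example-datastore-bucket", "us-east-1", "111122223333"
)
trust_policy = build_trust_policy(
    "arn:aws:iam::444455556666:role/e6data-workspace-demo-engine-role"
)
print(json.dumps(access_policy, indent=4))
print(json.dumps(trust_policy, indent=4))
```

Building the documents as dictionaries and serializing them with `json.dumps` guarantees syntactically valid JSON, which hand-editing a template does not.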

Step 2: Add access policy to AWS Glue in Account B

  1. In the AWS Console, navigate to AWS Glue > Data Catalog > Catalog settings.

  2. Edit the Glue Access Policy provided below:

    1. Replace <ENGINE_ROLE_ARN> with the ARN of the role created for the e6data engine in Account A. The ARN can be found in the IAM management dashboard in Account A; the role name follows this format: e6data-workspace-<WORKSPACE_NAME>-engine-role.

    2. Replace <GLUE_REGION> with the region that the Glue metastore is located in.

    3. Replace <ACCOUNT_B_ID> with the Account ID of the account containing the S3 bucket & Glue metastore.

  3. Copy & paste the edited policy to the Catalog settings in Glue.

Glue Access Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<ENGINE_ROLE_ARN>"
            },
            "Action": [
                "glue:GetDatabase*",
                "glue:GetTable*",
                "glue:GetPartitions"
            ],
            "Resource": [
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:catalog",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:database/*",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:table/*"
            ]
        }
    ]
}
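
Step 2 involves three separate placeholder substitutions, so it is easy to paste a policy that still contains an unreplaced `<...>` token. The sketch below, written in Python under the assumption that the policy text is kept as a template string, fills in every placeholder and fails loudly if a value is missing or the result is not valid JSON. The helper name `fill_template` and the sample values are hypothetical.

```python
import json
import re

# The Glue Access Policy from Step 2, with its placeholders left in place.
GLUE_POLICY_TEMPLATE = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "<ENGINE_ROLE_ARN>"},
            "Action": ["glue:GetDatabase*", "glue:GetTable*", "glue:GetPartitions"],
            "Resource": [
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:catalog",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:database/*",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:table/*"
            ]
        }
    ]
}
"""

# Matches tokens such as <GLUE_REGION>.
PLACEHOLDER = re.compile(r"<([A-Z_]+)>")


def fill_template(template: str, values: dict) -> str:
    """Replace every <NAME> placeholder; raise KeyError if a value is missing."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder <{name}>")
        return values[name]

    filled = PLACEHOLDER.sub(substitute, template)
    json.loads(filled)  # raises ValueError if the result is not valid JSON
    return filled


# Hypothetical sample values.
print(
    fill_template(
        GLUE_POLICY_TEMPLATE,
        {
            "ENGINE_ROLE_ARN": "arn:aws:iam::444455556666:role/e6data-workspace-demo-engine-role",
            "GLUE_REGION": "us-east-1",
            "ACCOUNT_B_ID": "111122223333",
        },
    )
)
```

The same helper can be reused for the other policies in this guide by swapping in their template text.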

Step 3: Configure Glue & S3 Access in Account A

  1. Sign in to the Account A AWS Console.

  2. Search for IAM and click Policies.

  3. Choose Create policy.

  4. In the Policy editor section, choose the JSON option.

  5. In the Cross-account STS Policy for S3 & Glue provided below, replace <ACCOUNT_B_ID> with the Account B ID and <GLUE_REGION> with the region that the Glue metastore is located in.

  6. Copy and paste the edited policy into the JSON editor.

  7. Choose Next.

  8. On the Review and create page, type a Policy Name and a Description (optional) for the policy.

    • Make note of the policy name as it will be required further along the process.

  9. Review the Permissions granted by this policy, then choose Create policy.

  10. Return to IAM Management.

  11. In the navigation pane, choose Roles.

  12. Search for the e6data Engine Role (e6data-workspace-<WORKSPACE_NAME>-engine-role).

    • This role would have been created during the e6data Workspace deployment.

  13. Click Add permissions > Attach policies.

  14. Search for the policy created in Steps 3 - 9.

  15. Click Add permissions.

Cross-account STS Policy for S3 & Glue

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::<ACCOUNT_B_ID>:role/<ROLENAME>"
        },
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabase*",
                "glue:GetTable*",
                "glue:GetPartitions"
            ],
            "Resource": [
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:catalog",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:database/*",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:table/*"
            ]
        }
    ]
}
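
The `Resource` of the first statement above has the shape of an IAM role ARN. As a minimal sketch, assuming it should point at the Account B role whose ARN was noted at the end of Step 1, the policy can be derived directly from that ARN so the account ID never has to be typed twice. The function name `build_cross_account_policy` and the sample ARN are hypothetical, and the `aws` partition is assumed, as in the policies in this guide.

```python
import json


def build_cross_account_policy(account_b_role_arn: str, glue_region: str) -> dict:
    """Build the Cross-account STS Policy for S3 & Glue from a role ARN."""
    # An IAM role ARN looks like arn:aws:iam::<ACCOUNT_ID>:role/<ROLENAME>.
    parts = account_b_role_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "iam" or not parts[5].startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {account_b_role_arn!r}")
    account_b_id = parts[4]
    # AWS account IDs are 12-digit numbers.
    if len(account_b_id) != 12 or not account_b_id.isdigit():
        raise ValueError(f"unexpected account ID: {account_b_id!r}")

    glue_prefix = f"arn:aws:glue:{glue_region}:{account_b_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": account_b_role_arn,
            },
            {
                "Effect": "Allow",
                "Action": ["glue:GetDatabase*", "glue:GetTable*", "glue:GetPartitions"],
                "Resource": [
                    f"{glue_prefix}:catalog",
                    f"{glue_prefix}:database/*",
                    f"{glue_prefix}:table/*",
                ],
            },
        ],
    }


# Hypothetical sample values.
policy = build_cross_account_policy(
    "arn:aws:iam::111122223333:role/example-cross-account-glue-role", "us-east-1"
)
print(json.dumps(policy, indent=4))
```

Rejecting anything that is not a role ARN guards against pasting a policy ARN (`...:policy/...`) into the `sts:AssumeRole` statement by mistake.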

Step 4: Add cross-account catalog in e6data Console

  1. Log in to the e6data Console.

  2. Navigate to the e6data Workspace that should be connected to the cross-account catalog.

  3. Go to Catalogs.

  4. Refer to the instructions provided in Connect to a Glue Metastore.

The cross-account catalog will now be available to be attached to all current & future clusters in the e6data Workspace.

Notes on the placeholders in the policies above:

  • Custom Trust Policy (Step 1): Replace <ENGINE_ROLE_ARN> with the ARN of the role created for the e6data engine in Account A. The role name can be found in the IAM management dashboard in Account A, and will follow this format: e6data-workspace-<WORKSPACE_NAME>-engine-role.

  • Cross-account STS Policy for S3 & Glue (Step 3): Replace arn:aws:iam::<ACCOUNT_B_ID>:role/<ROLENAME> with the ARN of the role created in Step 1: Create policies to access Glue & S3 data sources in Account B.