# Configure Cross-account Catalog to Access AWS Glue

To connect your e6data Workspace to an AWS Glue Metastore and S3 data source in a different cloud account, please follow the steps below:

{% hint style="info" %}
This guide assumes:

* the e6data Workspace (clusters/compute) is installed in a cloud account named <mark style="color:purple;">**Account A.**</mark>
* the AWS Glue metastore & S3 data stores are located in a different cloud account named <mark style="color:green;">**Account B.**</mark>
* Both <mark style="color:purple;">**Account A**</mark> & <mark style="color:green;">**Account B**</mark> are in the same AWS region.
  {% endhint %}

### Step 1: Create policies to access Glue & S3 data sources in <mark style="color:green;">**Account B**</mark>

1. Sign in to the <mark style="color:green;">**Account B**</mark> AWS Console.
2. Search for **IAM**.
3. Click **Policies**.
4. Choose **Create policy**.
5. In the **Policy editor** section, choose the **JSON** option.
6. Edit the policy [provided below](#s3-and-glue-access-policy):
   1. Replace `<DATASTORE_BUCKET_ARN>` with the ARN of the S3 bucket containing the data. If the data spans multiple buckets, repeat both `Resource` entries for each bucket.
   2. Replace `<GLUE_REGION>` with the region that the Glue metastore is located in.
   3. Replace `<ACCOUNT_B_ID>` with the Account ID of the account containing the S3 bucket & Glue metastore.

#### S3 & Glue Access Policy

```json
{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Action":[
            "s3:GetObject",
            "s3:ListBucket",
            "s3:GetObjectVersion",
            "s3:GetObjectTagging"
         ],
         "Resource":[
            "<DATASTORE_BUCKET_ARN>/*",
            "<DATASTORE_BUCKET_ARN>"
         ],
         "Effect":"Allow"
      },
      {
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabase*",
                "glue:GetTable*",
                "glue:GetPartitions"
            ],
            "Resource": [
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:catalog",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:database/*",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:table/*"
            ]
        }
   ]
}
```
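
Filling in the placeholders by hand is error-prone when several buckets are involved. The following is a minimal Python sketch, using only the standard library, that builds the policy above from a list of bucket ARNs, a region, and an account ID. The helper name `build_access_policy` and the example values are hypothetical and are not part of e6data or AWS tooling.

```python
import json


def build_access_policy(bucket_arns, glue_region, account_b_id):
    """Return the S3 & Glue access policy as a dict (illustrative helper)."""
    s3_resources = []
    for arn in bucket_arns:
        # Object-level actions apply to "<bucket>/*"; s3:ListBucket applies to the bucket itself.
        s3_resources.extend([f"{arn}/*", arn])
    glue_prefix = f"arn:aws:glue:{glue_region}:{account_b_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:GetObjectVersion",
                    "s3:GetObjectTagging",
                ],
                "Resource": s3_resources,
            },
            {
                "Effect": "Allow",
                "Action": ["glue:GetDatabase*", "glue:GetTable*", "glue:GetPartitions"],
                "Resource": [
                    f"{glue_prefix}:catalog",
                    f"{glue_prefix}:database/*",
                    f"{glue_prefix}:table/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    policy = build_access_policy(
        ["arn:aws:s3:::example-datastore-bucket"], "us-east-1", "111122223333"
    )
    print(json.dumps(policy, indent=3))
```

The printed JSON can then be pasted into the policy editor in place of the hand-edited template.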

7. Copy and paste the edited policy into the **JSON editor**.
8. Choose **Next**.
9. On the **Review and create** page, type a **Policy Name** and a **Description** (optional) for the policy.
10. Review the permissions that the policy grants.
11. Choose **Create policy**.
    * Make note of the policy name as it will be required further along the process.
12. Return to **IAM Management**.
13. In the navigation pane, choose **Roles**.
14. Click **Create role**.
15. Under **Trusted entity type**, choose **Custom trust policy**.
16. In the [policy provided below](#custom-trust-policy), replace `<ENGINE_ROLE_ARN>` with the ARN of the e6data engine role in <mark style="color:purple;">**Account A**</mark>. The role can be found in the IAM management dashboard in <mark style="color:purple;">**Account A**</mark>, and its name follows this format: `e6data-workspace-<WORKSPACE_NAME>-engine-role`

#### Custom Trust Policy

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<ENGINE_ROLE_ARN>"
            },
            "Action": "sts:AssumeRole",
            "Condition": {}
        }
    ]
}
```

17. Copy and paste the edited trust policy into the **Custom trust policy** editor.
18. Click **Next**.
19. Search for the name of the policy created in steps 4-11 and attach it to the role.
20. Click **Next: Add tags**.
21. Optionally, add tags to the role, or leave these fields blank, then click **Next: Review**.
22. Enter a **Role name** that follows your organization's naming convention.
23. Click **Create role**.
24. Copy the ARN of the newly created role.
    * Make note of the ARN as it will be required further along the process.
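
The trust policy in step 16 needs only the engine role ARN from Account A. As a rough sketch, that ARN can be assembled from the Account A ID and the workspace name, assuming the role was created without an IAM path (a role with a path carries it between `role/` and the role name, so confirm the actual ARN in the Account A console). The helper names `engine_role_arn` and `build_trust_policy` are hypothetical.

```python
import json


def engine_role_arn(account_a_id, workspace_name):
    """Assemble the engine role ARN from the naming format given in this guide.

    Assumption: the role has no IAM path and lives in the standard "aws" partition.
    """
    role_name = f"e6data-workspace-{workspace_name}-engine-role"
    return f"arn:aws:iam::{account_a_id}:role/{role_name}"


def build_trust_policy(role_arn):
    """Return the custom trust policy that lets role_arn assume the new role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "sts:AssumeRole",
                "Condition": {},
            }
        ],
    }


if __name__ == "__main__":
    # Example account ID and workspace name are placeholders.
    arn = engine_role_arn("444455556666", "demo")
    print(json.dumps(build_trust_policy(arn), indent=4))
```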

### Step 2: Add access policy to AWS Glue in <mark style="color:green;">**Account B**</mark>

1. In the AWS Console, navigate to **AWS Glue > Data Catalog > Catalog settings**.
2. Edit the [policy below](#glue-access-policy):
   1. Replace `<ENGINE_ROLE_ARN>` with the ARN of the role created for the e6data engine in <mark style="color:purple;">**Account A**</mark>. The ARN can be found in the IAM management dashboard in <mark style="color:purple;">**Account A**</mark>; the role name follows this format: `e6data-workspace-<WORKSPACE_NAME>-engine-role`
   2. Replace `<GLUE_REGION>` with the region that the Glue metastore is located in.
   3. Replace `<ACCOUNT_B_ID>` with the Account ID of the account containing the S3 bucket & Glue metastore.
3. Copy and paste the edited policy into the **Catalog settings** in Glue.

#### Glue Access Policy

```json
{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Principal":{
            "AWS":
               "<ENGINE_ROLE_ARN>"
         },
         "Action":[
            "glue:GetDatabase*",
            "glue:GetTable*",
            "glue:GetPartitions"
         ],
         "Resource":[
            "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:catalog",
            "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:database/*",
            "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:table/*"
         ]
      }
   ]
}
```
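
Before pasting an edited policy into Catalog settings, it can help to confirm that the text is still valid JSON and that no `<PLACEHOLDER>` tokens remain. The check below is a small sketch using only the Python standard library; the function name `check_policy_text` is hypothetical, and the pattern assumes placeholders are written in upper case between angle brackets, as they are in this guide.

```python
import json
import re

# Matches tokens such as <GLUE_REGION> or <ACCOUNT_B_ID>.
PLACEHOLDER = re.compile(r"<[A-Z][A-Z0-9_]*>")


def check_policy_text(policy_text):
    """Validate JSON syntax and return any placeholders left unreplaced."""
    json.loads(policy_text)  # raises json.JSONDecodeError on malformed JSON
    return sorted(set(PLACEHOLDER.findall(policy_text)))


if __name__ == "__main__":
    edited = '{"Resource": "arn:aws:glue:us-east-1:<ACCOUNT_B_ID>:catalog"}'
    leftovers = check_policy_text(edited)
    if leftovers:
        print("Still to replace:", ", ".join(leftovers))
```

An empty list means every placeholder was replaced; it does not confirm that the substituted values are correct.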

### Step 3: Configure Glue & S3 Access in <mark style="color:purple;">**Account A**</mark>

1. Sign in to the <mark style="color:purple;">**Account A**</mark> AWS Console.
2. Search for **IAM** and click **Policies**.
3. Choose **Create policy**.
4. In the **Policy editor** section, choose the **JSON** option.
5. In the [policy provided below](#cross-account-sts-policy-for-s3-and-glue), replace `arn:aws:iam::<ACCOUNT_B_ID>:role/<ROLENAME>` with the ARN of the role created in [*Step 1: Create policies to access Glue & S3 data sources in Account B*](#step-1-create-policies-to-access-glue-and-s3-data-sources-in-account-b).
6. Replace `<GLUE_REGION>` with the region that the Glue metastore is located in, and `<ACCOUNT_B_ID>` with the <mark style="color:green;">**Account B**</mark> ID.
7. Copy and paste the edited policy into the **JSON editor**.
8. Choose **Next**.
9. On the **Review and create** page, type a **Policy Name** and a **Description** (optional) for the policy.
   * Make note of the policy name as it will be required further along the process.
10. Review the permissions that the policy grants, then choose **Create policy**.
11. Return to **IAM Management**.
12. In the navigation pane, choose **Roles**.
13. Search for the **e6data Engine Role** (`e6data-workspace-<WORKSPACE_NAME>-engine-role`).
    * This role was created during the e6data Workspace deployment.
14. Click **Add permissions** > **Attach policies**.
15. Search for the policy created in steps 3-10.
16. Click **Add permissions**.

#### Cross-account STS Policy for S3 & Glue

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::<ACCOUNT_B_ID>:role/<ROLENAME>"
        },
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabase*",
                "glue:GetTable*",
                "glue:GetPartitions"
            ],
            "Resource": [
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:catalog",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:database/*",
                "arn:aws:glue:<GLUE_REGION>:<ACCOUNT_B_ID>:table/*"
            ]
        }
    ]
}
```
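
A common slip in this step is pasting a role ARN that does not belong to Account B. The sketch below builds the policy above and rejects a role ARN whose account ID differs from the Account B ID. It uses only the Python standard library; `build_cross_account_policy` and the example values are hypothetical, and the ARN check assumes the usual `arn:<partition>:iam::<account-id>:role/<name>` layout.

```python
import json


def build_cross_account_policy(role_b_arn, glue_region, account_b_id):
    """Return the Account A policy that lets the engine role reach Account B."""
    parts = role_b_arn.split(":")
    # An IAM role ARN has six colon-separated fields; the fifth is the account ID.
    if len(parts) != 6 or parts[2] != "iam" or not parts[5].startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {role_b_arn}")
    if parts[4] != account_b_id:
        raise ValueError("role ARN does not belong to Account B")
    glue_prefix = f"arn:aws:glue:{glue_region}:{account_b_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "sts:AssumeRole", "Resource": role_b_arn},
            {
                "Effect": "Allow",
                "Action": ["glue:GetDatabase*", "glue:GetTable*", "glue:GetPartitions"],
                "Resource": [
                    f"{glue_prefix}:catalog",
                    f"{glue_prefix}:database/*",
                    f"{glue_prefix}:table/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Example role ARN, region, and account ID are placeholders.
    policy = build_cross_account_policy(
        "arn:aws:iam::111122223333:role/example-cross-account-role",
        "us-east-1",
        "111122223333",
    )
    print(json.dumps(policy, indent=4))
```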

### Step 4: Add cross-account catalog in e6data Console

1. Log in to the e6data Console.
2. Navigate to the e6data Workspace that should be connected to the cross-account catalog.
3. Go to **Catalogs**.
4. Follow the instructions in [Connect to a Glue Metastore](https://docs.e6data.com/product-documentation/catalogs/create-catalogs/glue-metastore/connect-to-a-glue-metastore) to create the catalog.

{% hint style="success" %}
The cross-account catalog will now be available to be attached to all current & future clusters in the e6data Workspace.
{% endhint %}
