Infrastructure & Permissions for e6data

This page lists the infrastructure and permissions that e6data requires and explains how to create them:

Required Infrastructure

The following infrastructure required to run e6data must be created before setup:

  1. S3 Bucket

    • To store e6data operational logs, cache & usage data.

Required Permissions

The following permissions required to run e6data must be created before setup:

  • AWS IAM Roles & Policies for:

    • EKS Cluster

    • Karpenter

    • AWS ALB Ingress Controller

Create an S3 Bucket for e6data

An S3 bucket is required to store the data that the e6data workspace needs to operate, e.g., service logs, query results, and state information.

When creating an S3 bucket it is advisable to follow the best practices below.

S3 Bucket Best Practices
  • Enable versioning for the newly created bucket:

    1. In the AWS Management Console, go to the bucket's properties.

    2. Navigate to "Versioning" and enable it.

  • Enable server-side encryption for the bucket:

    1. In the AWS Management Console, go to the bucket's properties

    2. Navigate to "Default encryption"

    3. Select the encryption algorithm "AES-256"

    4. Make sure the “Enable the bucket key” option is enabled.

  • Block public access to the bucket:

    1. In the AWS Management Console, go to the bucket's properties

    2. Navigate to "Permissions"

    3. Click on "Block public access".

    4. Turn on "Block all public access" so that every form of public access is blocked, then save the changes.

  • Enable logging for the bucket:

    1. In the AWS Management Console, go to the bucket's properties

    2. Navigate to "Logging"

    3. Choose a target bucket to store the logs and a prefix for the logs.

  • Configure the bucket's ACL to be private:

    1. In the AWS Management Console, go to the bucket's properties.

    2. Navigate to "Permissions"

    3. Select "Access Control List (ACL)"

    4. Set the bucket's ACL to private.

  • Configure ownership controls for the bucket.

    1. In the AWS Management Console, go to the bucket's properties

    2. Navigate to "Ownership"

    3. Enable "Bucket owner preferred" as the object ownership.

Make a note of the S3 bucket name; it will be required when creating the Workspace in the e6data Console.
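The console steps above can also be scripted. The following is a minimal sketch using AWS CLI `s3api` commands, assuming the bucket and the log target bucket already exist and the CLI is configured with sufficient permissions. The function name, bucket names, and log prefix are hypothetical placeholders, not values prescribed by e6data.

```shell
# Sketch: apply the best practices listed above to an existing bucket.
# harden_e6data_bucket is a hypothetical helper name; arguments are placeholders.
harden_e6data_bucket() {
  bucket="$1"      # bucket that stores e6data operational data
  log_bucket="$2"  # existing bucket that receives the access logs
  log_prefix="$3"  # key prefix for the access logs, e.g. "e6data-logs/"

  # Enable versioning.
  aws s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled

  # Default encryption: AES-256 with the bucket key option, as described above.
  aws s3api put-bucket-encryption --bucket "$bucket" \
    --server-side-encryption-configuration \
    '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"},"BucketKeyEnabled":true}]}'

  # Block all public access.
  aws s3api put-public-access-block --bucket "$bucket" \
    --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

  # Server access logging to the target bucket and prefix.
  aws s3api put-bucket-logging --bucket "$bucket" \
    --bucket-logging-status \
    "{\"LoggingEnabled\":{\"TargetBucket\":\"$log_bucket\",\"TargetPrefix\":\"$log_prefix\"}}"

  # Object ownership must allow ACLs before a bucket ACL can be set.
  aws s3api put-bucket-ownership-controls --bucket "$bucket" \
    --ownership-controls 'Rules=[{ObjectOwnership=BucketOwnerPreferred}]'

  # Keep the bucket ACL private.
  aws s3api put-bucket-acl --bucket "$bucket" --acl private
}

# Example (placeholder names):
# harden_e6data_bucket my-e6data-bucket my-log-bucket e6data-logs/
```

The ownership setting is applied before the ACL because a bucket ACL can only be set when ACLs are enabled for the bucket. The log target bucket must separately permit S3 server access log delivery.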

Create an OIDC IAM Role for e6data Query Engine

The e6data Query Engine requires access to the S3 buckets containing the target data for querying. To provision the required access we need to create an IAM Role and associate it with a Kubernetes service account.

This configuration allows us to establish a secure connection between the Kubernetes environment and AWS. Once this IAM Role is associated with the service account, any Pods within the e6data clusters that are configured to use this service account will inherit the permissions defined in the IAM Role.

Retrieve the OIDC Provider Suffix

First retrieve the OIDC Provider Suffix, which is required to create the IAM Role:

  1. Open a Terminal

    • Open a terminal or command prompt where you can run AWS CLI commands.

  2. Run the Command

    • Execute the following command to retrieve the OIDC provider suffix for your EKS cluster. Replace <EKS_CLUSTER_NAME> with the actual name of your EKS cluster, and <AWS_REGION> with the AWS region where your cluster is located.

aws eks describe-cluster --name <EKS_CLUSTER_NAME> --region <AWS_REGION> --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///"
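For scripting, it can be convenient to keep the suffix in a shell variable. The sketch below performs the same `https://` stripping as the `sed` expression above, using shell parameter expansion instead. The function and variable names are illustrative assumptions.

```shell
# Strip the leading "https://" from an OIDC issuer URL.
# oidc_suffix_from_issuer is a hypothetical helper name.
oidc_suffix_from_issuer() {
  issuer="$1"
  # ${var#pattern} removes the shortest matching prefix, if present.
  printf '%s\n' "${issuer#https://}"
}

# Example with real cluster values (placeholders must be replaced):
# ISSUER="$(aws eks describe-cluster --name <EKS_CLUSTER_NAME> --region <AWS_REGION> \
#   --query "cluster.identity.oidc.issuer" --output text)"
# OIDC_PROVIDER_SUFFIX="$(oidc_suffix_from_issuer "$ISSUER")"
```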

Create an IAM Role for the e6data Query Engine

Create the AssumeRole policy for the e6data Query Engine using the template provided below. Replace <OIDC_PROVIDER_SUFFIX> with the value retrieved in the previous step, and fill in the remaining placeholders (AWS account ID, Kubernetes namespace, and workspace name):

AssumeRole policy for e6data Query Engine
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "<OIDC_PROVIDER_SUFFIX>:sub": "system:serviceaccount:<KUBERNETES_NAMESPACE_TO_DEPLOY_E6DATA>:<E6DATA_WORKSPACE_NAME>"
        }
      },
      "Principal": {
        "Federated": "arn:aws:iam::<YOUR_AWS_ACCOUNT_ID>:oidc-provider/<OIDC_PROVIDER_SUFFIX>"
      }
    }
  ]
}
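As a sketch of how the template could be filled in, the helper below prints the trust policy with the placeholders substituted; the output can then be passed to `aws iam create-role`. The function name, file name, and role name are hypothetical and not prescribed by e6data.

```shell
# Print the AssumeRole (trust) policy with the placeholders filled in.
# render_engine_trust_policy is a hypothetical helper name.
render_engine_trust_policy() {
  account_id="$1"   # <YOUR_AWS_ACCOUNT_ID>
  oidc_suffix="$2"  # <OIDC_PROVIDER_SUFFIX>
  namespace="$3"    # <KUBERNETES_NAMESPACE_TO_DEPLOY_E6DATA>
  workspace="$4"    # <E6DATA_WORKSPACE_NAME>
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${oidc_suffix}:sub": "system:serviceaccount:${namespace}:${workspace}"
        }
      },
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${oidc_suffix}"
      }
    }
  ]
}
EOF
}

# Example (placeholder values; the role name is an assumption):
# render_engine_trust_policy 111122223333 "$OIDC_PROVIDER_SUFFIX" e6data my-workspace > trust.json
# aws iam create-role --role-name e6data-engine-role \
#   --assume-role-policy-document file://trust.json
```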

Attach the following policies to the role:

  1. IAM S3 Read-Write Access Policy (defined later on this page, under "Create a Cross-Account IAM Role").

  2. S3 bucket read-access (to query data)

  3. Glue read-access (optional, to access AWS Glue metastores/catalog)

S3 bucket read-access
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::<QUERY_DATA_BUCKET_1>",
        "arn:aws:s3:::<QUERY_DATA_BUCKET_2>",
        ...
      ]
    },
    {
      "Sid": "ReadE6dataBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectTagging",
        "s3:GetObjectVersion",
        "s3:ListObjects"
      ],
      "Resource": [
        "arn:aws:s3:::<QUERY_DATA_BUCKET_1>/*",
        "arn:aws:s3:::<QUERY_DATA_BUCKET_2>/*",
        ...
      ]
    }
  ]
}
Glue read-access (optional)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "glueReadOnlyAccess",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase*",
        "glue:GetTable*",
        "glue:GetPartitions"
      ],
      "Resource": ["*"]
    }
  ]
}

For more information, refer to the official AWS documentation: Configuring a Kubernetes service account to assume an IAM role - Amazon EKS
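One possible way to attach the policies from the command line is sketched below: each JSON document is saved to a file, turned into a customer managed policy, and attached to the role. The function name, role name, policy names, and file names are all hypothetical.

```shell
# Create a customer managed policy from a JSON file and attach it to a role.
# attach_policy_file is a hypothetical helper name.
attach_policy_file() {
  role_name="$1"    # name of the IAM role created for the query engine
  policy_name="$2"  # name to give the new managed policy
  policy_file="$3"  # path to a JSON policy document saved from this page

  # create-policy returns the new policy; --query extracts its ARN.
  policy_arn="$(aws iam create-policy --policy-name "$policy_name" \
    --policy-document "file://$policy_file" \
    --query 'Policy.Arn' --output text)" || return 1

  aws iam attach-role-policy --role-name "$role_name" --policy-arn "$policy_arn"
}

# Example (all names are placeholders):
# attach_policy_file e6data-engine-role e6data-s3-read s3-read-access.json
# attach_policy_file e6data-engine-role e6data-glue-read glue-read-access.json
```

The `...` entries in the S3 read-access template stand for additional buckets and must be replaced or removed before the file is valid JSON.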

Create a Cross-Account IAM Role

Create an IAM Role with the following AssumeRole policy:

AssumeRole Policy for establishing trust relationship with e6data Console
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Principal": {
        "AWS": "arn:aws:iam::<e6data_account_id>:root"
      },
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<e6data_cross_account_external_id>"
        }
      }
    }
  ]
}

Attach the following policies to the role:

Cross-Account IAM EKS Access Policy
{
    "Statement": [
        {
            "Action": [
                "eks:ListNodegroups",
                "eks:DescribeCluster"
            ],
            "Effect": "Allow",
            "Resource": "<ARN_OF_YOUR_EKS_CLUSTER>",
            "Sid": "describeEKSCluster"
        },
        {
            "Action": "eks:DescribeNodegroup",
            "Effect": "Allow",
            "Resource": [
                "arn:aws:eks:<AWS_REGION>:<AWS_ACCOUNT_ID>:nodegroup/<E6DATA_EKS_NODEGROUP_NAME>/*/*",
                "<ARN_OF_YOUR_EKS_CLUSTER>"
            ],
            "Sid": "descriEKSNodegroup"
        },
        {
            "Action": [
                "wafv2:GetWebACLForResource",
                "wafv2:GetWebACL",
                "servicequotas:GetServiceQuota",
                "elasticloadbalancing:DescribeLoadBalancers",
                "ec2:DescribeRouteTables",
                "ec2:DescribeInstances",
                "cloudwatch:GetMetricStatistics",
                "acm:DescribeCertificate"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "PermissionsForAllResources"
        },
        {
            "Action": [
                "wafv2:DisassociateWebACL",
                "elasticloadbalancing:SetWebACL"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:loadbalancer/app/e6data-*/*",
            "Sid": "AllowSetWebACLforELB"
        },
        {
            "Action": "elasticloadbalancing:CreateLoadBalancer",
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/app": "e6data"
                }
            },
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "AllowCreateELB"
        },
        {
            "Action": "elasticloadbalancing:DeleteLoadBalancer",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/app": "e6data"
                }
            },
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "AllowDeleteELB"
        },
        {
            "Action": [
                "elasticloadbalancing:RemoveTags",
                "elasticloadbalancing:AddTags"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:listener/net/e6data-*/*",
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:listener/app/e6data-*/*",
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:listener-rule/net/e6data-*/*",
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:listener-rule/app/e6data-*/*"
            ],
            "Sid": "ManageELBTags"
        },
        {
            "Action": [
                "elasticloadbalancing:RemoveTags",
                "elasticloadbalancing:AddTags"
            ],
            "Condition": {
                "Null": {
                    "aws:RequestTag/elbv2.k8s.aws/cluster": "true",
                    "aws:ResourceTag/elbv2.k8s.aws/cluster": "false"
                }
            },
            "Effect": "Allow",
            "Resource": [
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:targetgroup/e6data-*/*",
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:loadbalancer/net/e6data-*/*",
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:loadbalancer/app/e6data-*/*"
            ],
            "Sid": "manageELBTargetGroupTags"
        },
        {
            "Action": "elasticloadbalancing:AddTags",
            "Condition": {
                "Null": {
                    "aws:RequestTag/elbv2.k8s.aws/cluster": "false"
                }
            },
            "Effect": "Allow",
            "Resource": [
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:targetgroup/e6data-*/*",
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:loadbalancer/net/e6data-*/*",
                "arn:aws:elasticloadbalancing:<AWS_REGION>:<AWS_ACCOUNT_ID>:loadbalancer/app/e6data-*/*"
            ],
            "Sid": "ALBIngressControllerTags"
        },
        {
            "Action": [
                "wafv2:CreateWebACL",
                "wafv2:CreateIPSet"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/app": "e6data"
                }
            },
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "AllowWAFv2WebACLAndIPSetCreation"
        },
        {
            "Action": [
                "wafv2:UpdateWebACL",
                "wafv2:UpdateIPSet",
                "wafv2:UntagResource",
                "wafv2:TagResource",
                "wafv2:GetIPSet",
                "wafv2:DeleteWebACL",
                "wafv2:DeleteIPSet",
                "wafv2:CreateIPSet",
                "wafv2:AssociateWebACL"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/app": "e6data"
                }
            },
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "AllowWAFv2WebACLManagement"
        }
    ],
    "Version": "2012-10-17"
}
IAM S3 Read-Write Access Policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<E6DATA_BUCKET_CREATED_IN_STEP1>"
    },
    {
      "Sid": "ReadWriteE6dataBucket",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:GetObjectTagging",
        "s3:GetObjectVersion",
        "s3:PutObjectTagging",
        "s3:DeleteObjectVersion",
        "s3:DeleteObject",
        "s3:DeleteObjectTagging",
        "s3:ListObjects"
      ],
      "Resource": "arn:aws:s3:::<E6DATA_BUCKET_CREATED_IN_STEP1>/*"
    }
  ]
}

Make a note of the ARN of the cross-account role you created; it will be required later.
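If the role was created outside the console, its ARN can be looked up with the AWS CLI. A minimal sketch, assuming a hypothetical role name:

```shell
# Print the ARN of an existing IAM role.
# role_arn is a hypothetical helper name.
role_arn() {
  aws iam get-role --role-name "$1" --query 'Role.Arn' --output text
}

# Example (the role name is a placeholder):
# CROSS_ACCOUNT_ROLE_ARN="$(role_arn e6data-cross-account-role)"
```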

Cross-Account IAM Role to use Unload Operator

To grant the e6data engine access to the S3 bucket where query results are stored using the unload operator, specific permissions must be configured. The following IAM policy must be attached to the engine role, which was created while adding prerequisites:

{
    "Statement": [
        {
            "Action": "s3:ListBucket",
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::<UNLOAD_BUCKET>",
            "Sid": "ListBucket"
        },
        {
            "Action": [
                "s3:PutObjectTagging",
                "s3:PutObject",
                "s3:GetObjectVersion",
                "s3:GetObjectTagging",
                "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::<UNLOAD_BUCKET>/*",
            "Sid": "ReadWriteE6dataBucket"
        }
    ],
    "Version": "2012-10-17"
}

This policy grants the e6data engine role permission to list the contents of the S3 bucket (s3:ListBucket) and to read and write objects within the bucket (s3:PutObject, s3:PutObjectTagging, s3:GetObject, s3:GetObjectTagging, s3:GetObjectVersion).

Be sure to replace <UNLOAD_BUCKET> with the actual name of your S3 bucket.
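In S3 policies, s3:ListBucket applies to the bucket ARN itself (no trailing slash or key), while object actions apply to the keys under `/*`. The sketch below derives both Resource strings from a plain bucket name and rejects input that is already an ARN or a path. The function name is a hypothetical helper, not part of any e6data tooling.

```shell
# Print the two Resource ARNs for a given bucket name.
# unload_bucket_resources is a hypothetical helper name.
unload_bucket_resources() {
  bucket="$1"
  case "$bucket" in
    arn:*|*/*)
      # Expect a plain bucket name, not an ARN or a path.
      echo "expected a bucket name, got: $bucket" >&2
      return 1
      ;;
  esac
  printf 'arn:aws:s3:::%s\n' "$bucket"    # bucket-level (s3:ListBucket)
  printf 'arn:aws:s3:::%s/*\n' "$bucket"  # object-level (Get/Put actions)
}

# Example (placeholder bucket name):
# unload_bucket_resources my-unload-bucket
```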

Update ConfigMap in the EKS Cluster

  1. Open a terminal or command prompt and connect to your EKS cluster by updating the context.

  2. Use the kubectl command-line tool to view the current ConfigMap "aws-auth" in the "kube-system" namespace by running the following command:

kubectl get configmap aws-auth -n kube-system -o yaml

This will display the current configuration of the "aws-auth" ConfigMap, including its YAML representation.

  3. Modify the ConfigMap and add mapRoles entries similar to the YAML below:

    • RoleARN of the e6data cross-account role that was previously created, with the username e6data-<WORKSPACE_NAME>-user.

    • RoleARN of the Karpenter node role that was previously created, with the username "system:node:{{EC2PrivateDNSName}}" and groups ["system:bootstrappers", "system:nodes"].
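The first steps can be sketched as a small script that points kubectl at the cluster and saves a copy of the current ConfigMap before any edit. The function name and backup file name are hypothetical; the cluster name and region are the same placeholders used earlier on this page.

```shell
# Point kubectl at the cluster and save a backup of aws-auth before editing.
# backup_aws_auth is a hypothetical helper name.
backup_aws_auth() {
  cluster="$1"      # <EKS_CLUSTER_NAME>
  region="$2"       # <AWS_REGION>
  backup_file="$3"  # where to save the current ConfigMap

  # Add or update the kubeconfig context for the cluster.
  aws eks update-kubeconfig --name "$cluster" --region "$region" || return 1

  # Save the current ConfigMap so an edit can be compared against it later.
  kubectl get configmap aws-auth -n kube-system -o yaml > "$backup_file"
}

# Example (placeholder values):
# backup_aws_auth my-cluster us-east-1 aws-auth.backup.yaml
# kubectl edit configmap aws-auth -n kube-system
```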

Update ConfigMap

The existing ConfigMap will look similar to this:

apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:masters
      rolearn: arn:aws:iam::1234567890:role/<NODE_GROUP_ROLE>
      username: system:node:{{EC2PrivateDNSName}}
kind: ConfigMap
metadata:
  creationTimestamp: "2023-01-01T11:11:11Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "126345"
  uid: 234r5g4-3442-48r6-9fd8-320dee97ce8b

Add the following entries under mapRoles, replacing <CROSS_ACCOUNT_ROLE>, <WORKSPACE_NAME>, and <karpenter_NODE_ROLE> with the appropriate values.

    - rolearn: <CROSS_ACCOUNT_ROLE>
      username: e6data-<WORKSPACE_NAME>-user
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::1234567890:role/<karpenter_NODE_ROLE>
      username: system:node:{{EC2PrivateDNSName}}

The updated ConfigMap will look similar to this:

apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::1234567890:role/<karpenter_NODE_ROLE>
      username: system:node:{{EC2PrivateDNSName}}
    - rolearn: <CROSS_ACCOUNT_ROLE>
      username: e6data-<WORKSPACE_NAME>-user
kind: ConfigMap
metadata:
  creationTimestamp: "2023-01-01T11:11:11Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "126345"
  uid: 234r5g4-3442-48r6-9fd8-320dee97ce8b

Be cautious when modifying the "aws-auth" ConfigMap, as it controls the authentication and authorization of your Amazon EKS worker nodes. Incorrect changes can lead to issues with the cluster's functionality. Always verify your changes before applying them to the cluster and ensure you have the necessary permissions to make updates.
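One simple check after applying the change is to confirm that the cross-account user mapping is present in mapRoles. The sketch below reads the mapRoles text from standard input; the function name and workspace name are hypothetical.

```shell
# Check that a mapRoles document (read from stdin) contains the e6data user entry.
# has_e6data_user is a hypothetical helper name.
has_e6data_user() {
  workspace="$1"
  # -F treats the pattern as a fixed string rather than a regular expression.
  grep -qF "username: e6data-${workspace}-user"
}

# Example against a live cluster (the workspace name is a placeholder):
# kubectl get configmap aws-auth -n kube-system -o jsonpath='{.data.mapRoles}' \
#   | has_e6data_user my-workspace && echo "mapping present"
```

This only confirms that the entry exists; it does not validate the YAML indentation or the role ARN.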
