Workspace Setup
Frequently Asked Questions about Workspace
If your organization mandates the use of a Customer Managed KMS Key for EBS volume encryption, you might face permission issues while using Karpenter. Follow these steps to ensure deployment success:
Step 1: Update KMS Key Policy
Go to Key Management Service: Open the AWS Management Console and navigate to the Key Management Service (KMS).
Select your KMS key: Choose the relevant Customer Managed Key.
Update the Key Policy:
Add the following policy block to the key policy.
Replace <WORKSPACE> with the name of your workspace.
Replace <ACCOUNT_ID> with the ID of your AWS account.
{
  "Sid": "Allow use of the key",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::<ACCOUNT_ID>:role/e6data-<WORKSPACE>-karpenter-oidc-role"
  },
  "Action": [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey",
    "kms:CreateGrant"
  ],
  "Resource": "*"
}
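If you manage the key with Terraform, the same statement can be attached there instead of through the console. The sketch below is illustrative only: the variables (var.account_id, var.workspace, var.kms_key_id) are placeholders, and aws_kms_key_policy replaces the entire key policy, so any existing statements must also be included in the document.

```hcl
# Hypothetical sketch: granting the Karpenter OIDC role use of the key via Terraform.
# var.account_id, var.workspace, and var.kms_key_id are assumed placeholder variables.
data "aws_iam_policy_document" "kms_key_policy" {
  # Only the Karpenter statement is shown; merge in the key's existing
  # statements, since aws_kms_key_policy overwrites the whole policy.
  statement {
    sid    = "Allow use of the key"
    effect = "Allow"

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${var.account_id}:role/e6data-${var.workspace}-karpenter-oidc-role"]
    }

    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey",
      "kms:CreateGrant",
    ]

    resources = ["*"]
  }
}

resource "aws_kms_key_policy" "ebs" {
  key_id = var.kms_key_id
  policy = data.aws_iam_policy_document.kms_key_policy.json
}
```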
Following these steps should resolve any permission issues and ensure that Karpenter can successfully create nodes with encrypted EBS volumes using your Customer Managed KMS Key.
The ARN format (arn:aws:iam::<ACCOUNT_ID>:role/e6data-<WORKSPACE>-karpenter-oidc-role)
matches the OIDC role created through Terraform for Karpenter.
If you created the setup manually, replace karpenter-oidc-role
in the ARN with the name of the role you are using for Karpenter.
Q: Our organization has an SCP (Service Control Policy) that restricts the use of Graviton instances. Should we reconsider this policy?
A: Yes, it is advisable to review and potentially revise the SCP policy that restricts Graviton instances. These instances, powered by AWS's ARM-based processors, provide notable advantages in both cost and performance, which can be beneficial for your Kubernetes workloads.
Key Benefits of Graviton Instances:
Cost Savings: Graviton instances typically offer lower pricing compared to x86-based instances, making them a more economical choice for running large-scale workloads.
Enhanced Performance: Graviton instances often deliver superior performance, particularly in compute-intensive applications, due to their efficient processing power and memory bandwidth.
Energy Efficiency: Graviton processors are designed to consume less power, leading to reduced operational costs and a smaller environmental impact.
Broad Compatibility: Most modern software, including containerized applications, is compatible with the ARM architecture used by Graviton instances, making it easy to adopt and integrate into your current infrastructure.
Recommendation:
We suggest revisiting the SCP policy to permit the use of Graviton instances, at least for testing. Evaluating their performance and cost-effectiveness in your specific workloads and regions could reveal substantial benefits. Many organizations have experienced significant improvements in cost and performance by adopting Graviton instances, and your organization could benefit similarly.
Q: How can we connect our e6data VPC to a private Hive Metastore located in a different VPC?
A: To connect your e6data VPC with a private Hive Metastore in a different VPC, you can establish VPC peering between the two VPCs. VPC peering allows secure and direct communication between instances in these VPCs, functioning as though they were within the same network.
Steps to Establish VPC Peering:
Set Up VPC Peering: Use the Terraform scripts provided to establish a VPC peering connection between the e6data VPC and the VPC where the Hive Metastore resides. This will enable seamless communication between the resources in both VPCs.
For AWS: Use the Terraform scripts available to create the VPC peering connection in AWS.
For GCP: Use the Terraform scripts available for setting up network peering in GCP.
Security Group and NACL Adjustments: Ensure that security groups and network access control lists (NACLs) are configured to allow traffic between the e6data VPC and the Hive Metastore VPC. This step is crucial for enabling access between services in the peered VPCs.
Additional Notes:
Network Configuration: Verify that the CIDR blocks of the two VPCs do not overlap, as this can lead to routing issues.
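The peering setup described above can be sketched in Terraform. This is an illustrative outline only, not e6data's packaged scripts: the resource names and variables (var.e6data_vpc_id, var.hive_metastore_vpc_id, var.e6data_route_table_id, var.hive_metastore_vpc_cidr) are assumed placeholders.

```hcl
# Hypothetical sketch of peering the e6data VPC with the Hive Metastore VPC.
resource "aws_vpc_peering_connection" "e6data_to_hms" {
  vpc_id      = var.e6data_vpc_id         # requester: e6data VPC
  peer_vpc_id = var.hive_metastore_vpc_id # accepter: Hive Metastore VPC
  auto_accept = true                      # valid only when both VPCs are in the same account and region
}

# Route traffic destined for the metastore VPC through the peering connection.
# A mirror-image route is also needed in the metastore VPC's route table.
resource "aws_route" "to_hms" {
  route_table_id            = var.e6data_route_table_id
  destination_cidr_block    = var.hive_metastore_vpc_cidr
  vpc_peering_connection_id = aws_vpc_peering_connection.e6data_to_hms.id
}
```

Security group and NACL rules still need to permit the metastore port (commonly 9083 for Hive's Thrift service) between the two CIDR ranges.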
Scenario
e6data is deployed in one project/account.
A private Hive metastore is hosted in a second project/account.
Data is stored in a third project/account.
Solution
To securely access the Hive metastore and the data across these different projects/accounts, follow these steps:
Establish a Secure Connection to the Hive Metastore:
Use VPC Peering: Set up VPC peering between the VPC where e6data is deployed and the VPC hosting the Hive metastore in the second project/account. This will ensure a secure and private connection to the Hive metastore.
For guidance on setting up VPC peering, refer to the relevant Terraform scripts:
For AWS: Use the Terraform scripts available to create the VPC peering connection in AWS.
For GCP: Use the Terraform scripts available for setting up network peering in GCP.
Additional Notes:
Network Configuration: Verify that the CIDR blocks of the two VPCs do not overlap, as this can lead to routing issues.
Grant Access to Data in the Third Project:
Cross-Project Catalog Configuration: To access data stored in the third project/account, follow the steps outlined in the to configure roles and permissions.
Your organization uses Amazon S3 buckets encrypted with a customer-managed KMS Key to store data. You need to ensure that e6data can securely access this encrypted S3 bucket. Follow these steps to configure the necessary access.
Step 1: Update KMS Key Policy
Go to Key Management Service: Open the AWS Management Console and navigate to the Key Management Service (KMS).
Select your KMS key: Choose the relevant Customer Managed Key.
Update the Key Policy:
Add the following policy block to the key policy.
{
  "Sid": "Allow use of the key",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::<ACCOUNT_ID>:role/<WORKSPACE_NAME>-engine-role-<RANDOM_STRING>"
  },
  "Action": [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey",
    "kms:CreateGrant"
  ],
  "Resource": "*"
}
Note:
The ARN format (arn:aws:iam::<ACCOUNT_ID>:role/<WORKSPACE_NAME>-engine-role-<RANDOM_STRING>) matches the OIDC role created through Terraform for the e6data engine.
If you have created the setup manually, replace the role name
in the ARN with the name of the role you created earlier for the engine.
Q: Why aren't our cost allocation tags, specified through Terraform or manual infrastructure setup, visible in the AWS Cost Management portal?
A: Even if you specify cost tags through Terraform or during manual resource creation, they may not appear in the AWS Cost Management portal if they are not activated. To make these tags visible in your cost and usage reports, you need to activate them in the AWS Billing and Cost Management console.
For more details on how to activate cost allocation tags, please refer to the .
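Applying tags through Terraform can be done at the provider level so every resource inherits them; note that this only creates the tags, and they still must be activated as cost allocation tags in the Billing and Cost Management console before they appear in cost reports. The tag keys and values below are illustrative placeholders.

```hcl
# Hypothetical sketch: tagging all resources managed by this provider.
# Tag keys/values are examples only; activation in the Billing console
# is a separate, manual step.
provider "aws" {
  region = var.region # assumed placeholder variable

  default_tags {
    tags = {
      CostCenter = "analytics"
      App        = "e6data"
    }
  }
}
```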
An Internet Gateway (IGW) is crucial for enabling communication between an AWS Virtual Private Cloud (VPC) and the public internet. It allows resources in public subnets to send and receive traffic, provided they have public IP addresses. If the ability to create or attach an IGW is denied, it will restrict connectivity, preventing instances from accessing external services or being reachable from the internet. This would also hinder e6data control plane connectivity. Therefore, maintaining the IGW is essential for ensuring operational flexibility and connectivity.
To ensure that e6data infrastructure resources across AWS, Azure, and Google Cloud Platform (GCP) operate without restrictions, it is essential to verify that no relevant policies hinder their functionality.
For AWS, confirm that no Service Control Policies (SCPs) affect key services such as EKS, EC2, S3, IAM, Subnet, VPC, NAT Gateway, Internet Gateway, VPC Endpoint, Security Group, SQS, CloudWatch, WAF, and ELB. SCPs act as permission guardrails within AWS Organizations, controlling the maximum permissions for IAM users and roles in member accounts. Since SCPs do not grant permissions themselves, review existing SCPs to ensure they do not impose limitations on these essential services that could impede operations or resource utilization.
For Azure, it is crucial to check for any Azure policies that might restrict services like AKS (Azure Kubernetes Service), Key Vault, Storage Account, Managed Identities, NAT Gateway, Virtual Network, Public IP Addresses, and Load Balancing.
For GCP, ensure that no Organization Policies limit the operations of services such as Kubernetes Engine (GKE), Key Management, Cloud Storage, IAM & Admin, Service Accounts, Cloud NAT, VPC Network, IP Addresses, and Load Balancing.
Finally, review any third-party restriction policies, such as custodian policies, to ensure they do not impose additional constraints that could hinder operational needs or resource utilization.
The specified inbound rules are essential for e6data to function smoothly and without interruption. The rules include allowing HTTPS traffic on TCP protocol through port 443 from any source (0.0.0.0/0), ensuring secure communication for data transmissions. Additionally, HTTP traffic is permitted on TCP protocol through port 80, also from any source, facilitating standard web access. A custom TCP rule allows traffic across a wide range of ports, from 1001 to 65535, enabling various application-specific requests critical for dynamic data operations. Furthermore, all traffic is allowed from the internal network range of the existing VPC, while all other external traffic (0.0.0.0/0) is denied.
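As a reference, the rules described above could be expressed in Terraform roughly as follows. This is a sketch under assumptions: the security group name and the variables var.vpc_id and var.vpc_cidr are placeholders, and your actual rule set should follow the official e6data setup scripts.

```hcl
# Hypothetical sketch of the inbound rules described above.
resource "aws_security_group" "e6data_inbound" {
  name   = "e6data-inbound" # placeholder name
  vpc_id = var.vpc_id       # assumed placeholder variable

  ingress {
    description = "HTTPS from any source"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP from any source"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "Custom TCP range for application-specific traffic"
    from_port   = 1001
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "All traffic from within the VPC"
    from_port   = 0
    to_port     = 0
    protocol    = "-1" # all protocols
    cidr_blocks = [var.vpc_cidr] # assumed placeholder for the VPC's CIDR range
  }
}
```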
You should check if you have reserved instance types that belong to the families we've requested. If you don’t have those reserved, please use on-demand instances instead. Additionally, if you do have reserved instance types for the requested families, the system will automatically apply those reservations, enabling cost savings associated with reserved capacity. This approach ensures flexibility while maximizing the benefits of existing reservations.
To install the AWS Load Balancer Controller (ALB Controller) in an existing Amazon EKS setup, follow the steps outlined in the . Ensure that the necessary IAM policy with the required permissions is added as specified. This will facilitate the successful configuration and deployment of the ALB Controller in your EKS cluster.
To install Terraform on Windows, you can follow the detailed instructions in the documentation available at . This guide will walk you through the necessary steps to get Terraform up and running on your local machine. When executing a Terraform script from Windows, it is important to update the interpreter to PowerShell. Here’s an example of how you can specify the interpreter in your Terraform configuration:
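A minimal illustration follows; the resource and command are placeholders, but the interpreter argument shows the documented way to make a local-exec provisioner run under PowerShell instead of the default cmd.exe on Windows.

```hcl
# Hypothetical example: running a local-exec provisioner under PowerShell on Windows.
resource "null_resource" "powershell_example" {
  provisioner "local-exec" {
    # Override the default Windows interpreter (cmd.exe) with PowerShell.
    interpreter = ["PowerShell", "-Command"]
    command     = "Write-Host 'Terraform provisioner running under PowerShell'"
  }
}
```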