Setup using Terraform in Azure
Deploying e6data Workspace in Microsoft Azure using Terraform
This guide walks you through deploying e6data on Azure using Terraform.
Deploying e6data in Azure using Terraform
Terraform is an open-source infrastructure-as-code tool developed by HashiCorp. It allows you to define and manage your infrastructure in a declarative way, making it easier to provision and manage resources across various cloud providers, including Azure.
Prerequisites
Before you begin, ensure that you have the following prerequisites in place:
An Azure account with appropriate permissions to create and manage resources.
A local development environment with Terraform installed. The installation steps are outlined in the next section.
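As a quick sanity check of the prerequisites above, the following sketch reports whether a command-line tool is on your PATH. The helper name check_tool is hypothetical, and checking for the Azure CLI (az) is an assumption: this guide only requires valid Azure credentials, and the Azure CLI is just one common way to provide them.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Terraform is required by this guide; az is an assumed, optional convenience.
check_tool terraform || echo "Install Terraform before continuing."
check_tool az || echo "Azure CLI not found; install it or use another authentication method."
```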
Create the e6data Workspace
Login to the e6data Console
Navigate to Workspaces in the left-side navigation bar or click Create Workspace
Select AZURE as the Cloud Provider.
In the e6data UI, after selecting Azure, copy the Cognito Identity ID and Cognito Pool ID.
Proceed to the next step to deploy the prerequisite resources using Terraform.
Setup e6data
The Terraform script deploys the e6data Workspace inside an Azure Kubernetes Service (AKS) cluster. The following sections explain how to edit the two Terraform files required for the deployment:
provider.tf
terraform.tfvars
If an Azure AKS cluster is not available, please follow these instructions.
If Terraform is not installed, please follow these instructions.
Download e6data Terraform Scripts
Please download/clone the e6x-labs/terraform repo from GitHub.
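A minimal sketch of the clone step, assuming git is installed. The function name clone_e6data_scripts and the default target directory are hypothetical, and the URL is inferred from the repository name e6x-labs/terraform.

```shell
# Hypothetical helper: clone the e6data Terraform scripts into a target directory.
# The URL is inferred from the repository name e6x-labs/terraform.
clone_e6data_scripts() {
  git clone "https://github.com/e6x-labs/terraform.git" "${1:-e6data-terraform}"
}

# Usage (requires git and network access):
# clone_e6data_scripts ./e6data-terraform
```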
Configure provider.tf
The Azure provider in Terraform lets you manage Azure resources. Before using the provider, configure it with the appropriate credentials.
Extract the scripts downloaded in the previous step and navigate to the _workspace folder.
Edit the provider.tf file according to your requirements. Please refer to the official Terraform documentation for instructions on the authentication method most appropriate to your environment.
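One option among several: the azurerm provider can read service principal credentials from ARM_* environment variables, which keeps secrets out of provider.tf. The sketch below assumes you authenticate with a service principal and client secret; every value is a placeholder to replace, and other methods (such as the Azure CLI or a managed identity) may suit your environment better.

```shell
# Assumed setup: service principal with a client secret.
# Replace every placeholder with your own value before running Terraform.
export ARM_SUBSCRIPTION_ID="<subscription_id>"
export ARM_TENANT_ID="<tenant_id>"
export ARM_CLIENT_ID="<client_id>"
export ARM_CLIENT_SECRET="<client_secret>"
```

Set these in the same shell session that later runs the Terraform commands, and avoid committing real values to version control.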
Specifying Azure Storage account for Terraform State file
Storing the Terraform state in Azure Blob Storage provides a reliable and scalable way to manage the infrastructure state on Azure.
The configuration uses the azurerm provider and sets the backend to store the Terraform state in an Azure Storage Account. Replace <resource_group_name> and <storage_account_name> with the names of the resource group and Azure Storage Account you want to use, and <container_name> with the name of the container within the storage account where you want to store the state file.
The key parameter specifies the name of the state file within the container. It is set to "terraform.tfstate", but you can adjust it according to your needs.
Ensure that the Azure credentials used for authenticating Terraform have the appropriate permissions to read from and write to the specified Azure Storage Account and container.
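The backend settings described above can be sketched as follows. The block is printed from a shell heredoc so it can be reviewed before being copied into provider.tf; the function name print_backend_block is hypothetical, and the angle-bracket placeholders must be replaced with your own values. Compare it with the backend block already present in the downloaded scripts rather than adding a second one.

```shell
# Hypothetical helper: print a sketch of an azurerm backend block.
# The quoted heredoc ('EOF') keeps the text literal, so nothing is expanded.
print_backend_block() {
cat <<'EOF'
terraform {
  backend "azurerm" {
    resource_group_name  = "<resource_group_name>"
    storage_account_name = "<storage_account_name>"
    container_name       = "<container_name>"
    key                  = "terraform.tfstate"
  }
}
EOF
}

print_backend_block
```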
Note:
Before configuring the backend, make sure you have already created the Azure Storage Account and container in the desired Azure subscription and resource group.
For more information and to explore additional backend options, you can refer to the Terraform Backend Configuration documentation.
Configuration Variables in terraform.tfvars File
The terraform.tfvars file contains the following variables. Before running Terraform, update their values to match the configuration details of your environment:
prefix: Prefix for resources
region: Azure region
workspace_name: Name of the e6data workspace to be created
subscription_id: Subscription ID of Azure subscription
aks_resource_group_name: Resource group name for AKS cluster
aks_cluster_name: AKS cluster name
kube_version: Kubernetes version
kubernetes_namespace: Namespace to deploy e6data workspace
private_cluster_enabled: Private cluster enabled (true/false)
cidr_block: CIDR block for the VNet
nodepool_instance_family: Instance families for node pools
nodepool_instance_arch: Instance architecture for node pools
priority: VM priority (Regular or Spot)
data_storage_account_name: Storage account name
data_resource_group_name: Resource group for storage account
list_of_containers: Containers to access in storage account
helm_chart_version: Helm chart version for e6data workspace
cost_tags: Tags used for cost allocation and management. Helps in tracking and optimizing resource costs. Here, the tag "App" is set to "e6data".
default_node_pool_vm_size: VM size for the default node pool
default_node_pool_node_count: Number of nodes in the default node pool
default_node_pool_name: Name of the default node pool
identity_pool_id: The identity pool ID available in the e6data console after clicking on the "Create Workspace" button and selecting AZURE
identity_id: The identity ID available in the e6data console, used for authentication and authorization in the workspace
karpenter_namespace: Namespace for Karpenter deployment
karpenter_service_account_name: Service account name for Karpenter
karpenter_release_version: Karpenter release version
key_vault_name: Please provide the Key Vault name in which the certificate for the domain is present. If left blank, a new Key Vault will be created in the AKS resource group.
key_vault_rg_name: The resource group for the specified Key Vault. If left blank, it will default to the AKS resource group.
nginx_ingress_controller_namespace: Namespace where the Nginx Ingress Controller will be deployed
nginx_ingress_controller_version: Version of the Nginx Ingress Controller to be installed
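To illustrate the file format, here is a partial sketch covering a few of the variables above. All values are hypothetical, and the value types (plain strings, and a map for cost_tags) are assumptions; check the variable definitions in the downloaded scripts for the authoritative types and for the variables omitted here. The function name print_tfvars_example is also hypothetical.

```shell
# Hypothetical helper: print a partial terraform.tfvars sketch.
# Every value is a placeholder or an assumed example, not a recommended setting.
print_tfvars_example() {
cat <<'EOF'
prefix                  = "e6data"
region                  = "eastus"
workspace_name          = "myworkspace"
subscription_id         = "<subscription_id>"
aks_resource_group_name = "<aks_resource_group_name>"
aks_cluster_name        = "<aks_cluster_name>"
kubernetes_namespace    = "e6data"

cost_tags = {
  App = "e6data"
}

identity_pool_id = "<identity_pool_id>"
identity_id      = "<identity_id>"
EOF
}

print_tfvars_example
```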
Execution Commands
After configuring the variables in the terraform.tfvars file, run the Terraform script to deploy the e6data workspace:
Navigate to the directory containing the Terraform files. Terraform commands must be run from this directory.
Initialize Terraform by running terraform init.
Generate a Terraform plan and save it to a file (e.g., e6.plan) by running terraform plan -var-file="terraform.tfvars" -out="e6.plan".
The -var-file flag specifies the input variable file (terraform.tfvars) that contains the necessary configuration values for the deployment.
Review the generated plan.
Apply the changes using the generated plan file by running terraform apply "e6.plan".
This command applies the changes recorded in the plan file (e6.plan) to deploy the e6data workspace in your environment.
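The steps above can be sketched as three small shell helpers. The function names (e6_plan, e6_review, e6_apply) are hypothetical, and each takes the directory containing the Terraform files as its only argument. Reviewing with terraform show is one option; reading the plan output printed by terraform plan works as well.

```shell
# Hypothetical helpers wrapping the commands described above.
# Each body runs in a subshell, so the caller's working directory is unchanged.

# Initialize the working directory and save the plan to e6.plan.
e6_plan() (
  cd "$1" || exit 1
  terraform init || exit 1
  terraform plan -var-file="terraform.tfvars" -out="e6.plan"
)

# Print the saved plan in human-readable form for review.
e6_review() (
  cd "$1" || exit 1
  terraform show "e6.plan"
)

# Apply exactly the changes recorded in e6.plan.
e6_apply() (
  cd "$1" || exit 1
  terraform apply "e6.plan"
)
```

Run e6_plan first, inspect the result with e6_review, and call e6_apply only once the plan looks correct.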
After the apply completes and the e6data workspace is deployed, retrieve the values of secret, application_id, and tenant_id by running terraform output for each one (for example, terraform output application_id).
These commands display the output values defined in your Terraform configuration. Enter these values in the e6data console.
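A minimal sketch for reading the three values in one pass. The function name e6_outputs is hypothetical, and the output names are assumed to match the names used in this guide. The -raw flag prints each value without quotes; note that it also prints the secret in plain text, so avoid running this where the terminal is logged or shared.

```shell
# Hypothetical helper: print the three values to enter in the e6data console.
# Output names are assumed to be secret, application_id, and tenant_id.
e6_outputs() (
  cd "$1" || exit 1
  for name in secret application_id tenant_id; do
    printf '%s: ' "$name"
    terraform output -raw "$name" || exit 1
    echo
  done
)

# Usage: pass the directory containing the Terraform files.
# e6_outputs .
```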
Deployment Overview and Resource Provisioning
This section summarizes the resources that the Terraform script creates for the e6data workspace deployment.
AKS Node Pool: Creates a dedicated node pool in Azure Kubernetes Service (AKS) to host the e6data workspace. The node pool is configured with autoscaling to dynamically adjust the number of nodes based on workload demand, ensuring scalability.
Blob Storage Container: Creates a dedicated storage account and blob container in Azure to store the query results for the e6data workspace. The storage account and container provide a reliable and scalable solution for storing and accessing the query output data.
App Registration and client secret: Creates an app registration and associated service principal with read and write access to the Azure Blob Storage container created above. The service principal is also assigned the minimum permissions needed to connect securely to the AKS cluster, and a client secret is generated so that the e6data platform can authenticate as this application.
Managed Identity with federated credentials: Creates a managed identity and associates it with federated credentials using the provided OIDC issuer URL. The identity is granted "Storage Blob Data Contributor" access to the Blob Storage container created above, enabling read and write operations. It is also assigned "Storage Blob Data Reader" permission on the containers specified in the tfvars file, which hold the data used by the e6data engine. This permission lets the AKS cluster read data from those containers without being able to modify or write to them.
Helm chart with Service Account: Deploys a Helm chart to your AKS cluster that configures the federated credentials of the user-assigned managed identity. As part of this process, a service account is created within the AKS cluster and associated with those federated credentials, enabling authorized access to the storage resources through the managed identity.