Azure Setup

Deploying e6data Workspace in Microsoft Azure using Terraform

Prerequisites

Before you begin, ensure that you have the following prerequisites in place:

  1. An Azure account with appropriate permissions to create and manage resources.

  2. A local development environment with Terraform installed. The installation steps are outlined in the next section.

Installing Terraform

To install Terraform on your local machine, you can follow the steps adapted from the official HashiCorp Terraform documentation:

  1. Visit the official Terraform website at https://www.terraform.io.

  2. Navigate to the "Downloads" page.

  3. Download the appropriate package for your operating system (e.g., Windows, macOS, Linux).

  4. Extract the downloaded package to a directory of your choice.

  5. Add the Terraform executable to your system's PATH environment variable.

  • For Windows:

    • Open the Start menu and search for "Environment Variables."

    • Select "Edit the system environment variables."

    • Click the "Environment Variables" button.

    • Under "System variables," find the "Path" variable and click "Edit."

    • Add the path to the directory where you extracted the Terraform executable (e.g., C:\terraform) to the list of paths.

    • Click "OK" to save the changes.

  • For macOS and Linux:

    • Open a terminal.

    • Run the following command, replacing <path_to_extracted_binary> with the path to the directory where you extracted the Terraform executable: export PATH=$PATH:<path_to_extracted_binary>

    • Optionally, you can add this command to your shell's profile file (e.g., ~/.bash_profile, ~/.bashrc, ~/.zshrc) to make it persistent across terminal sessions.

  6. Verify the installation by opening a new terminal window and running the following command: terraform version. If Terraform is installed correctly, the version number is displayed.

Azure Terraform Provider for Authentication

The Azure Provider can be used to configure infrastructure in Microsoft Azure using the Azure Resource Manager APIs.

Provider Block

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=3.0.0"
    }
  }
}

# Configure the Microsoft Azure Provider
provider "azurerm" {
  features {}
  subscription_id = "<your_subscription_id>"
}

Authentication and Configuration

Terraform supports a number of different methods for authenticating to Azure:

  • Authenticating to Azure using the Azure CLI: The Azure CLI (Command-Line Interface) provides a convenient way to authenticate Terraform to Azure. By running the az login command and following the authentication flow, Terraform can use the credentials provided by the Azure CLI to access Azure resources.

  • Authenticating to Azure using Managed Service Identity: Managed Service Identity (MSI) allows applications or services running on Azure to authenticate without needing explicit credentials. Terraform can leverage MSI to authenticate itself and access Azure resources without additional authentication configuration.

  • Authenticating to Azure using a Service Principal and a Client Certificate: A Service Principal is an identity that can be used by applications, services, or automation tools like Terraform to access Azure resources. This method involves creating a Service Principal and associating a client certificate with it. Terraform can then use the Service Principal and client certificate for authentication.

  • Authenticating to Azure using a Service Principal and a Client Secret: Similar to the previous method, this authentication approach involves creating a Service Principal, but instead of a client certificate, a client secret is used. The client secret is essentially a password associated with the Service Principal. Terraform can utilize the Service Principal and client secret for authentication to Azure.

  • Authenticating to Azure using a Service Principal and OpenID Connect: OpenID Connect (OIDC) is an authentication protocol that allows clients, such as Terraform, to verify the identity of users or services. This method involves creating a Service Principal and configuring an OIDC identity provider. Terraform can then authenticate using the Service Principal and OIDC to access Azure resources.
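As an illustration of the Service Principal and Client Secret method, the credentials can be passed to the provider as arguments. The following is a minimal sketch, not the configuration shipped with the e6data script, and it would take the place of the provider block shown earlier. The variable names tenant_id, client_id, and client_secret are assumptions introduced for this example; the argument names are those of the azurerm provider.

```hcl
# Hypothetical input variables; the names are illustrative only.
variable "subscription_id" {
  type = string
}

variable "tenant_id" {
  type = string
}

variable "client_id" {
  type = string
}

variable "client_secret" {
  type      = string
  sensitive = true # keeps the value out of plan and apply output
}

# Provider authenticated as a Service Principal with a client secret.
provider "azurerm" {
  features {}
  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
  client_id       = var.client_id
  client_secret   = var.client_secret
}
```

The same values can instead be supplied through the ARM_SUBSCRIPTION_ID, ARM_TENANT_ID, ARM_CLIENT_ID, and ARM_CLIENT_SECRET environment variables, which keeps the secret out of the configuration files altogether.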

Specifying an Azure Blob Container for Terraform State file

Storing the Terraform state in an Azure Blob Storage container provides a reliable and scalable way to manage the infrastructure state on Azure.

To specify an Azure Storage Account for storing the Terraform state when using Azure as the provider, you can add the following configuration to the Terraform script:

terraform {
  backend "azurerm" {
    resource_group_name  = "<resource_group_of_the_storage_account>"
    storage_account_name = "<storage_account_name>"
    container_name       = "<container_name>"
    key                  = "terraform.tfstate"
  }
}

This configuration uses the azurerm backend to store the Terraform state in an Azure Storage Account. Replace <resource_group_of_the_storage_account> with the resource group that contains the storage account, <storage_account_name> with the name of the Azure Storage Account you want to use, and <container_name> with the name of the container within the storage account where you want to store the state file.

The key parameter specifies the name of the state file within the container. It is set to terraform.tfstate, but you can adjust it according to your needs.

Ensure that the Azure credentials used for authenticating Terraform have the appropriate permissions to read from and write to the specified Azure Storage Account and container.

Before configuring the backend, make sure that the Azure Storage Account and container have been created in the desired Azure subscription and resource group.
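If the storage account and container do not exist yet, one option is to create them with a small, separate Terraform configuration that keeps its own state locally. The sketch below assumes that approach; the resource labels (tfstate) and the <azure_region> placeholder are introduced here for illustration and are not part of the e6data script. The Azure CLI or the Azure portal would work equally well.

```hcl
# Bootstrap configuration, applied once from its own directory with local state.
# All names in angle brackets are placeholders to replace.
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "tfstate" {
  name     = "<resource_group_of_the_storage_account>"
  location = "<azure_region>"
}

resource "azurerm_storage_account" "tfstate" {
  # Storage account names must be globally unique,
  # 3 to 24 characters, lowercase letters and digits only.
  name                     = "<storage_account_name>"
  resource_group_name      = azurerm_resource_group.tfstate.name
  location                 = azurerm_resource_group.tfstate.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

resource "azurerm_storage_container" "tfstate" {
  name                  = "<container_name>"
  storage_account_name  = azurerm_storage_account.tfstate.name
  container_access_type = "private" # state files should not be publicly readable
}
```

Keeping this bootstrap configuration apart from the workspace deployment avoids the circular dependency of a backend that stores state in resources it manages itself.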

Deployment Overview and Resource Provisioning

This section provides a comprehensive overview of the resources deployed using the Terraform script for the e6data workspace deployment.

  1. AKS Node Pool: Creates a dedicated node pool in Azure Kubernetes Service (AKS) to host the e6data workspace. The node pool is configured with autoscaling so that the number of nodes adjusts to workload demand.

  2. Blob Storage Container: Creates a dedicated storage account and blob container in Azure to store the query results of the e6data workspace.

  3. App Registration and Client Secret: Creates an app registration and an associated service principal with read and write access to the Blob Storage container created above. The service principal is also assigned the minimum permissions required to connect securely to the AKS cluster. A client secret is generated so that the e6data platform can authenticate as this service principal.

  4. Managed Identity with Federated Credentials: Creates a managed identity and associates it with federated credentials using the provided OIDC issuer URL, for authentication and access control within the AKS cluster. The managed identity is granted the "Storage Blob Data Contributor" role on the Blob Storage container created above, which allows read and write operations. It is also assigned the "Storage Blob Data Reader" role on the containers listed in the tfvars file, which hold the data queried by the e6data engine. This role allows workloads in the AKS cluster to read data from those containers without being able to modify or write to them.

  5. Helm Chart with Service Account: Deploys a Helm chart to your AKS cluster that creates a service account associated with the federated credentials of the user-assigned managed identity. Through this service account, workloads in the cluster can access the storage resources securely using the managed identity.
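To make items 1 and 4 more concrete, the sketch below shows roughly how an autoscaling node pool and a read-only role assignment could be expressed with the azurerm 3.x provider. It is not the actual e6data script: the resource labels, the node pool name, and the identity name are assumptions, and the variables are assumed to be declared elsewhere with the names used in terraform.tfvars. The federated credential, the app registration, and the Helm release are omitted for brevity.

```hcl
# Look up the existing AKS cluster and the storage account that holds the data.
data "azurerm_kubernetes_cluster" "aks" {
  name                = var.aks_cluster_name
  resource_group_name = var.aks_resource_group_name
}

data "azurerm_storage_account" "data" {
  name                = var.data_storage_account_name
  resource_group_name = var.data_resource_group_name
}

# Item 1: a dedicated node pool that scales between the configured bounds.
resource "azurerm_kubernetes_cluster_node_pool" "e6data" {
  name                  = "e6data" # hypothetical; lowercase alphanumeric, at most 12 characters
  kubernetes_cluster_id = data.azurerm_kubernetes_cluster.aks.id
  vm_size               = var.vm_size
  enable_auto_scaling   = true # argument name used by the azurerm 3.x provider
  min_count             = var.min_number_of_nodes
  max_count             = var.max_number_of_nodes
}

# Item 4: a user-assigned managed identity for the workspace.
resource "azurerm_user_assigned_identity" "e6data" {
  name                = "${var.workspace_name}-identity" # hypothetical naming scheme
  resource_group_name = var.aks_resource_group_name
  location            = data.azurerm_kubernetes_cluster.aks.location
}

# Read-only access to each data container listed in terraform.tfvars.
resource "azurerm_role_assignment" "data_reader" {
  for_each             = toset(var.list_of_containers)
  scope                = "${data.azurerm_storage_account.data.id}/blobServices/default/containers/${each.value}"
  role_definition_name = "Storage Blob Data Reader"
  principal_id         = azurerm_user_assigned_identity.e6data.principal_id
}
```

Scoping the role assignment to individual containers, rather than to the whole storage account, keeps the identity's read access limited to the data the engine is meant to query.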

Configuration Variables in terraform.tfvars File

The terraform.tfvars file contains the following variables, which must be configured before running Terraform:

terraform.tfvars
subscription_id                   = "<your_subscription_id>"
workspace_name                    = "<e6data_workspace_name>"
e6data_app_secret_expiration_time = "<expiration_time_for_the_e6data_app_secret>"
aks_cluster_name                  = "<name_of_your_aks_cluster>"
aks_resource_group_name           = "<resource_group_of_the_aks_cluster>"
vm_size                           = "<vm_size_for_the_node_pool>"
min_number_of_nodes               = "<minimum_number_of_nodes_in_the_nodepool>"
max_number_of_nodes               = "<maximum_number_of_nodes_in_the_nodepool>"
aks_namespace                     = "<namespace_in_your_aks>"
data_resource_group_name          = "<resource group containing data to be queried>"
data_storage_account_name         = "<storage account containing data to be queried>"
list_of_containers                = ["<container_name_1>", "<container_name_2>"]

subscription_id

The subscription ID of the Azure subscription in which the e6data resources will be deployed.

workspace_name

Name of the e6data workspace to be created.

e6data_app_secret_expiration_time

The relative duration after which the e6data app secret expires, for example 240h (10 days) or 2400h30m.

aks_cluster_name

The name of your Azure Kubernetes Service (AKS) cluster in which to deploy the e6data workspace.

aks_resource_group_name

The name of the resource group where the AKS cluster is deployed.

vm_size

The VM size for the AKS node pool (for example, Standard_DS2_v2).

min_number_of_nodes

The minimum number of nodes in the AKS node pool.

max_number_of_nodes

The maximum number of nodes in the AKS node pool.

aks_namespace

The namespace in the AKS cluster to deploy the e6data workspace.

data_resource_group_name

The name of the resource group containing data to be queried.

data_storage_account_name

The name of the storage account containing data to be queried.

list_of_containers

List of the names of the containers inside the data storage account that the e6data engine queries and therefore requires read access to.
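For reference, a filled-in terraform.tfvars might look like the following. Every value is made up for illustration, and the sketch assumes that the node counts are declared as numbers and list_of_containers as a list of strings; check the variable declarations in the script before copying this shape.

```hcl
# Example values only; replace each one with values from your own environment.
subscription_id                   = "00000000-0000-0000-0000-000000000000"
workspace_name                    = "analytics"
e6data_app_secret_expiration_time = "2400h" # 100 days
aks_cluster_name                  = "my-aks-cluster"
aks_resource_group_name           = "my-aks-rg"
vm_size                           = "Standard_DS2_v2"
min_number_of_nodes               = 1
max_number_of_nodes               = 5
aks_namespace                     = "e6data"
data_resource_group_name          = "my-data-rg"
data_storage_account_name         = "mydatastorage"
list_of_containers                = ["sales", "events"]
```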

Execution Commands

Once you have configured the necessary variables in the terraform.tfvars file, you can proceed with the execution of the Terraform script to deploy the e6data workspace. Follow the steps below to initiate the deployment:

  1. Navigate to the directory containing the Terraform files. It is essential to be in the correct directory for the Terraform commands to execute successfully.

  2. Initialize Terraform: terraform init

  3. Generate a Terraform plan and save it to a file (e.g., e6.plan): terraform plan -var-file="terraform.tfvars" -out="e6.plan". The -var-file flag specifies the input variable file (terraform.tfvars) that contains the configuration values for the deployment, and the -out flag specifies the file in which the plan is saved.

  4. Review the generated plan.

  5. Apply the changes using the generated plan file: terraform apply "e6.plan". This command applies the changes specified in the plan file (e6.plan) to deploy the e6data workspace in your environment.

  6. After the Terraform changes have been applied and the e6data workspace is deployed, retrieve the values of secret, application_id, and tenant_id by running the commands below.

terraform output secret
terraform output application_id
terraform output tenant_id

These commands will display the output values defined in your Terraform configuration. These are the values you need to update in the e6data console.
