# Code Samples

## Best Practices

### Environment Variables

Setting environment variables is encouraged over hardcoding authentication variables and cluster IPs. The following environment variables can be set:

* HOST\_NAME: IP address or hostname of the cluster to be used.
* E6\_USER: Email address of the user account running the queries.
* E6\_TOKEN: Access token generated by the user in the e6data Console.

```python
HOST_NAME=os.getenv("HOST_NAME")
E6_USER=os.getenv("E6_USER")
E6_TOKEN = os.getenv("E6_TOKEN")
```

Usage:

```python
from e6data_python_connector import Connection
```

## Create Connection

Sample code for establishing a connection with an e6data cluster.

```python
from e6data_python_connector import Connection

username = '<username>'  # Your e6data Email ID.
password = '<password>'  # Access Token generated in the e6data console.

host = '<host>'  # IP address or hostname of the cluster to be used.
database = '<database>'  # # Database to perform the query on.
port = 443  # Port of the e6data engine.
catalog_name = '<catalog_name>'
cluster_uuid = '<cluster_uuid>'  # Specify cluster UUID on when secure=True
secure = True # Flag to use a secure channel for data transfer, default to False

conn = Connection(
    host=host,
    port=port,
    username=username,
    database=database,
    password=password,
    cluster_uuid=cluster_uuid,
    secure=secure
)
```

### Switch Database in an Existing Connection

```python
database = '<new_database_name>'  # Replace with the new database name
cursor = conn.cursor(database)
```

## Perform Query & Get Results

This code executes a query and fetches all the results.&#x20;

*The `cursor` supports `fetchmany` and `fetchone` methods.*

```python
query = 'SELECT * FROM <TABLE_NAME>'  # Replace with the query.

cursor = conn.cursor(catalog_name=catalog_name)
query_id = cursor.execute(query)  # The execute function returns a unique query ID, which can be use to abort the query.
all_records = cursor.fetchall()
for row in all_records:
   print(row)
```

To get all the results:

```python
records = cursor.fetchall()
```

To get only one result (first row):

```python
record = cursor.fetchone()
```

To limit the number of results:

```python
limit = 500
records = cursor.fetchmany(limit)
```

### Code Hygiene

It is recommended to clear the cursor, close the cursor and close the connection after running a function as a best practice. This enhances performance by clearing old data from memory.

```python
cursor.clear() # Not needed when aborting a query
cursor.close()
connection.close()
```

## Get Row Count

This code executes a query and returns the number of rows in the result set.

*Fetching `rowcount` from `queryplanner` is preferred over using `fetchall` method.*

```python
def get_result_row_count(query,database,catalog_name):
   connection = Connection(host= HOST_NAME,
               port = 80,
               username = E6_USER,
               database = database,
               password = E6_TOKEN)


   cursor = connection.cursor(catalog_name=catalog_name, db_name=database)
   query_id = cursor.execute(query)
   records = cursor.fetchmany(10000)
   query_planner = json.loads(cursor.explain_analyse())
   row_count = cursor.rowcount
   cursor.clear()
   cursor.close()
   connection.close()
   return row_count

if __name__=="__main__":
   get_result_row_count(<QUERY>,<DATABASE>,<CATALOG_NAME>)
```

## Get Query Execution Plan

The following code runs a query and returns the execution plan generated for the query by the e6data engine.

The execution plan data will be returned as a JSON string and should be parsed as JSON data.

```python
def get_query_planner(query,database,catalog_name):
   connection = Connection(host= HOST_NAME,
               port = 80,
               username = E6_USER,
               database = database,
               password = E6_TOKEN)


   cursor = connection.cursor(catalog_name=catalog_name, db_name=database)
   query_id = cursor.execute(query)
   records = cursor.fetchmany(10000)
   query_planner=json.loads(cursor.explain_analyse())
   cursor.clear()
   cursor.close()
   connection.close()
   return query_planner

if __name__=="__main__":
   get_query_planner(<QUERY>,<DATABASE>,<CATALOG_NAME>)
```

## Abort a Query

The following code aborts a query, referenced by the input Query ID. Refer [Get Query ID](#get-query-id) for information on obtaining the Query ID.

```python
def cancel_query(query_id,database,catalog_name):
   connection = Connection(host= HOST_NAME,
               port = 80,
               username = E6_USER,
               database = database,
               password = E6_TOKEN)


   cursor = connection.cursor(catalog_name=catalog_name, db_name=database)
   cursor.cancel(query_id)
   cursor.close()
   connection.close()
   print("Query Cancelled")

if __name__=="__main__":
   get_query_planner(<QUERY_ID>,<DATABASE>,<CATALOG_NAME>)

```

## Get Query Time Metrics

The following code runs a query and returns:

* amount of time taken to execute the query in seconds
* amount of time the query spent in the queue awaiting execution in seconds
* amount of time taken to parse the query in seconds

```python
def get_e6_time_outputs(query,database,catalog_name):
   connection = Connection(host= HOST_NAME,
               port = 80,
               username = E6_USER,
               database = database,
               password = E6_TOKEN)


cursor = conn.cursor(catalog_name)
query_id = cursor.execute(query)  # execute function returns query id, can be use for aborting the query.
all_records = cursor.fetchall()
explain_response = cursor.explain_analyse()
query_planner = json.loads(explain_response.get('planner'))

execution_time = query_planner.get("total_query_time")  # In milliseconds
queue_time = query_planner.get("executionQueueingTime")  # In milliseconds
parsing_time = query_planner.get("parsingTime")  # In milliseconds
row_count = cursor.rowcount

if __name__=="__main__":
   get_e6_time_outputs(<QUERY>,<DATABASE>,<CATALOG_NAME>) 
```

## Get Query ID

This code executes a query and returns the query ID.

```python
def get_query_id(query_id,database,catalog_name):
   connection = Connection(host= HOST_NAME,
               port = 80,
               username = E6_USER,
               database = database,
               password = E6_TOKEN)


   cursor = connection.cursor(catalog_name=catalog_name, db_name=database)
   query_id = cursor.execute(query)
   all_records = cursor.fetchall()
   cursor.clear()
   cursor.close()
   connection.close()
   return query_id
   
if __name__=="__main__":
   e6x_query(<QUERY>,<DATABASE>)
```

## Combine Multiple Functions

The following code is an example using multiple functions defined above to return:

* Number of rows
* The time taken to execute the query
* Query results

```python
from e6data_python_connector import Connection
import json

username = '<username>'  # Your e6data Email ID.
password = '<password>'  # Access Token generated in the e6data console.

host = '<host>'  # IP address or hostname of the cluster to be used.
database = '<database>'  # # Database to perform the query on.
port = 80  # Port of the e6data engine.

sql_query = 'SELECT * FROM <TABLE_NAME>'  # Replace with the actual query.

catalog_name = '<catalog_name>'  # Replace with the actual catalog name.

conn = Connection(
    host=host,
    port=port,
    username=username,
    database=database,
    password=password
)

cursor = conn.cursor(db_name=database, catalog_name=catalog_name)
query_id = cursor.execute(sql_query)
all_records = cursor.fetchall()
explain_response = cursor.explain_analyse()
planner_result = json.loads(explain_response.get('planner'))
execution_time = planner_result.get("total_query_time") / 1000  # Converting into seconds.
row_count = cursor.rowcount
columns = [col[0] for col in cursor.description]  # Get the column names and merge them with the results.
results = []
for row in all_records:
   row = dict(zip(columns, row))
   results.append(row)
   print(row)
print('Total row count {}, Execution Time (seconds): {}'.format(row_count, execution_time))
cursor.clear()
cursor.close()
conn.close()
```

## Auto Resume

To enable Auto Resume while establishing a connection using the Python Connector, set auto\_resume=True in the connection configuration. This ensures that the cluster resumes automatically when a query is executed, eliminating the need for manual intervention.

{% hint style="info" %}
**Note:** Auto Resume must also be enabled at the cluster level to take effect.
{% endhint %}

```
conn = Connection(
    host=host,
    port=port,
    username=username,
    database=database,
    password=password,
    auto_resume=True
)
```
