# Code Samples

## Best Practices

### Environment Variables

Storing authentication credentials and cluster details in environment variables is encouraged over hardcoding them.\
The following environment variables can be configured before initializing the connection:

* **HOST\_NAME** — Hostname or IP address of the e6data cluster.
* **E6\_USER** — Username (email address) associated with your e6data account.
* **E6\_TOKEN** — Access token generated from the e6data Console under your user profile.

These environment variables are fetched dynamically using `os.getenv()` during connection setup.

```python
import os
from e6data_python_connector import Connection

HOST_NAME = os.getenv("HOST_NAME")
E6_USER = os.getenv("E6_USER")
E6_TOKEN = os.getenv("E6_TOKEN")
```
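Optionally, you can fail fast with a clear error when a required variable is unset. The helper below is a small sketch, not part of the connector:

```python
import os

def require_env(name):
    """Return the value of an environment variable, raising if unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Required environment variable {name} is not set")
    return value
```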

## Create Connection

Sample code for establishing a connection with an e6data cluster.

```python
host = HOST_NAME or "<host>"
username = E6_USER or "<username>"
password = E6_TOKEN or "<access_token>"
database = "<database>"
catalog_name = "<catalog_name>"

conn = Connection(
    host=host,
    port=443,
    username=username,
    password=password,
    database=database,
    cluster_name='production-cluster',  # or your cluster
    secure=True,
    auto_resume=True  # enable auto-resume if supported
)

```

### Switch Database in an Existing Connection

```python
database = '<new_database_name>'  # Replace with the new database
cursor = conn.cursor(catalog_name=catalog_name, database=database)
```

## Perform Query & Get Results

Setting up the cursor allows you to execute queries and fetch results.\
The cursor supports:

* `fetchall()` — fetch all results.
* `fetchmany(limit)` — fetch a limited number of rows.
* `fetchone()` — fetch a single row.
* `fetchall_buffer()` — memory-efficient streaming for large result sets.

```python
cursor = conn.cursor(catalog_name=catalog_name)

query = "SELECT * FROM <TABLE_NAME>"
query_id = cursor.execute(query)

# Fetch all results
all_records = cursor.fetchall()
for row in all_records:
    print(row)

# OR fetch limited number of rows
records = cursor.fetchmany(500)

# OR memory-efficient streaming for large result sets
for item in cursor.fetchall_buffer():
    print(item)

```

To get all the results:

```python
records = cursor.fetchall()
```

To get only one result (first row):

```python
record = cursor.fetchone()
```

To limit the number of results:

```python
limit = 500
records = cursor.fetchmany(limit)
```
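`fetchmany()` can also be called in a loop to drain a large result set in fixed-size batches. The generator below is a sketch that assumes the cursor follows the DB-API convention of returning an empty sequence once the result set is exhausted:

```python
def fetch_in_batches(cursor, batch_size=500):
    """Yield rows from a cursor in fixed-size batches.

    Assumes fetchmany() returns an empty sequence when no rows remain,
    as specified by the Python DB-API (PEP 249).
    """
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row
```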

To fetch all results in a memory-efficient way:

```python
for item in cursor.fetchall_buffer():
    print(item)
```

**Code Hygiene**

As a best practice, clear the cursor, close the cursor, and close the connection once a function has finished running. This frees old result data from memory and improves performance.

```python
cursor.clear()  # not needed when aborting a query
cursor.close()
conn.close()
```
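To guarantee this cleanup even when a query raises an exception, the calls can be wrapped in `try`/`finally`. The helper below is a sketch; `clear()` is connector-specific and assumed safe to call after a successful fetch:

```python
def run_query(conn, sql):
    """Execute a query and return all rows, releasing resources even on error."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        cursor.clear()  # connector-specific cleanup of old result data
        cursor.close()
```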

## Get Row Count

This code executes a query and returns the number of rows in the result set.\
Fetching `rowcount` directly from the cursor is preferred over using `fetchall()` for performance and memory efficiency.

```python
def get_result_row_count(query, database, catalog_name):
    conn = Connection(
        host=HOST_NAME or "<host>",
        port=443,
        username=E6_USER or "<username>",
        password=E6_TOKEN or "<access_token>",
        database=database,
        catalog_name=catalog_name,
        cluster_name='production-cluster',  # or your cluster
        secure=True,
        auto_resume=True
    )

    cursor = conn.cursor(catalog_name=catalog_name, database=database)
    query_id = cursor.execute(query)
    # Fetch a limited number of records if needed
    records = cursor.fetchmany(10000)
    row_count = cursor.rowcount

    cursor.clear()
    cursor.close()
    conn.close()
    return row_count

if __name__ == "__main__":
    get_result_row_count("<QUERY>", "<DATABASE>", "<CATALOG_NAME>")

```

## Get Query Execution Plan

The following code runs a query and returns the execution plan generated for the query by the e6data engine.

The execution plan data will be returned as a JSON string and should be parsed as JSON data.

```python
import json

def get_query_planner(query, database, catalog_name):
    conn = Connection(
        host=HOST_NAME or "<host>",
        port=443,
        username=E6_USER or "<username>",
        password=E6_TOKEN or "<access_token>",
        database=database,
        catalog_name=catalog_name,
        cluster_name='production-cluster',  # or your cluster
        secure=True,
        auto_resume=True
    )

    cursor = conn.cursor(catalog_name=catalog_name, database=database)
    query_id = cursor.execute(query)
    # Fetch a limited number of records if needed
    records = cursor.fetchmany(10000)
    explain_response = cursor.explain_analyse()
    query_planner = json.loads(explain_response.get('planner'))

    cursor.clear()
    cursor.close()
    conn.close()
    return query_planner

if __name__ == "__main__":
    get_query_planner("<QUERY>", "<DATABASE>", "<CATALOG_NAME>")
```

## Abort a Query

The following code aborts a running query, referenced by the input Query ID. Refer to [Get Query ID](#get-query-id) for information on obtaining the Query ID.

```python
def abort_query(query_id, database, catalog_name):
    conn = Connection(
        host=HOST_NAME or "<host>",
        port=443,
        username=E6_USER or "<username>",
        password=E6_TOKEN or "<access_token>",
        database=database,
        catalog_name=catalog_name,
        cluster_name='production-cluster',  # or your cluster
        secure=True,
        auto_resume=True
    )

    cursor = conn.cursor(catalog_name=catalog_name, database=database)
    cursor.cancel(query_id)  # cancel the running query by its query ID

    cursor.close()
    conn.close()

if __name__ == "__main__":
    abort_query("<QUERY_ID>", "<DATABASE>", "<CATALOG_NAME>")
```

## Get Query Time Metrics

The following code runs a query and returns:

* the time taken to execute the query, in seconds
* the time the query spent in the queue awaiting execution, in seconds
* the time taken to parse the query, in seconds

```python
# `query_planner` is the parsed plan from "Get Query Execution Plan"
execution_time = query_planner.get("total_query_time") / 1000  # convert to seconds
queue_time = query_planner.get("executionQueueingTime") / 1000
parsing_time = query_planner.get("parsingTime") / 1000
row_count = cursor.rowcount
```
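The snippet above assumes `query_planner` has already been parsed from `explain_analyse()`. A small helper sketch that pulls the three timings out of such a planner dictionary, converting milliseconds to seconds and defaulting to zero when a key is absent:

```python
def extract_time_metrics(planner):
    """Convert the planner's millisecond timings to seconds, defaulting to 0."""
    def to_seconds(ms):
        return (ms or 0) / 1000
    return {
        "execution_time_s": to_seconds(planner.get("total_query_time")),
        "queue_time_s": to_seconds(planner.get("executionQueueingTime")),
        "parsing_time_s": to_seconds(planner.get("parsingTime")),
    }
```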

## Get Query ID

This code executes a query and returns the query ID.

```python
from e6data_python_connector import Connection
import os

HOST_NAME = os.getenv("HOST_NAME")
E6_USER = os.getenv("E6_USER")
E6_TOKEN = os.getenv("E6_TOKEN")

def get_query_id(query, database, catalog_name):
    conn = Connection(
        host=HOST_NAME or "<host>",
        port=443,
        username=E6_USER or "<username>",
        password=E6_TOKEN or "<access_token>",
        database=database,
        catalog_name=catalog_name,
        secure=True,
        auto_resume=True
    )

    cursor = conn.cursor(catalog_name=catalog_name, database=database)
    query_id = cursor.execute(query)
    all_records = cursor.fetchall()
    
    cursor.clear()
    cursor.close()
    conn.close()
    
    return query_id

if __name__ == "__main__":
    get_query_id("<QUERY>", "<DATABASE>", "<CATALOG_NAME>")

```

## Combine Multiple Functions

The following example combines the operations shown above to return:

* Number of rows
* The time taken to execute the query
* Query results

```python
# Assumes `conn` was created as in "Create Connection" and `json` is imported
cursor = conn.cursor(catalog_name=catalog_name)
query_id = cursor.execute(sql_query)
all_records = cursor.fetchall()
explain_response = cursor.explain_analyse()
planner_result = json.loads(explain_response.get('planner'))
execution_time = planner_result.get("total_query_time") / 1000
row_count = cursor.rowcount

columns = [col[0] for col in cursor.description]
results = [dict(zip(columns, row)) for row in all_records]

for row in results:
    print(row)

print(f"Total row count {row_count}, Execution Time (seconds): {execution_time}")

cursor.clear()
cursor.close()
conn.close()

```

## Auto Resume

To enable Auto Resume while establishing a connection using the Python Connector, set `auto_resume=True` in the connection configuration. This ensures that the cluster resumes automatically when a query is executed, eliminating the need for manual intervention.

{% hint style="info" %}
**Note:** Auto Resume must also be enabled at the cluster level to take effect.
{% endhint %}

```python
conn = Connection(
    host=host,
    port=443,
    username=username,
    password=password,
    database=database,
    cluster_name='production-cluster',
    secure=True,
    auto_resume=True
)

```
