# Code Samples

## Best Practices

### Environment Variables

Setting environment variables is encouraged over hardcoding authentication credentials or cluster details.\
The following environment variables can be configured before initializing the connection:

* **HOST\_NAME** — Hostname or IP address of the e6data cluster.
* **E6\_USER** — Username (email address) associated with your e6data account.
* **E6\_TOKEN** — Access token generated from the e6data Console under your user profile.

These environment variables are fetched dynamically using `os.getenv()` during connection setup.

```python
import os
from e6data_python_connector import Connection

HOST_NAME = os.getenv("HOST_NAME")
E6_USER = os.getenv("E6_USER")
E6_TOKEN = os.getenv("E6_TOKEN")
```

Usage:

```python
from e6data_python_connector import Connection
```

## Create Connection

Sample code for establishing a connection with an e6data cluster.

```python
host = HOST_NAME or "<host>"
username = E6_USER or "<username>"
password = E6_TOKEN or "<access_token>"
database = "<database>"
catalog_name = "<catalog_name>"

conn = Connection(
    host=host,
    port=443,
    username=username,
    password=password,
    database=database,
    cluster_name='production-cluster',  # or your cluster
    secure=True,
    auto_resume=True  # enable auto-resume if supported
)

```

### Switch Database in an Existing Connection

```python
database = '<new_database_name>'  # Replace with the new database
cursor = conn.cursor(catalog_name=catalog_name, database=database)
```

## Perform Query & Get Results

Setting up the cursor allows you to execute queries and fetch results.\
The cursor supports:

* `fetchall()` — fetch all results.
* `fetchmany(limit)` — fetch a limited number of rows.
* `fetchone()` — fetch a single row.
* `fetchall_buffer()` — memory-efficient streaming for large result sets.

```python
cursor = conn.cursor(catalog_name=catalog_name)

query = "SELECT * FROM <TABLE_NAME>"
query_id = cursor.execute(query)

# Fetch all results
all_records = cursor.fetchall()
for row in all_records:
    print(row)

# OR fetch limited number of rows
records = cursor.fetchmany(500)

# OR memory-efficient streaming for large result sets
for item in cursor.fetchall_buffer():
    print(item)

```

To get all the results:

```python
records = cursor.fetchall()
```

To get only one result (first row):

```python
record = cursor.fetchone()
```

To limit the number of results:

```python
limit = 500
records = cursor.fetchmany(limit)
```

To fetch all results in a memory-efficient way

```
for item in cursor.fetchall_buffer():
    print(item)
```

**Code Hygiene**

It is recommended to clear the cursor, close the cursor and close the connection after running a function as a best practice. This enhances performance by clearing old data from memory.

```python
cursor.clear() # Not needed when aborting a query
cursor.close()
connection.close()
```

## Get Row Count

This code executes a query and returns the number of rows in the result set.\
Fetching `rowcount` directly from the cursor is preferred over using `fetchall()` for performance and memory efficiency.

```python
def get_result_row_count(query, database, catalog_name):
    conn = Connection(
        host=HOST_NAME or "<host>",
        port=443,
        username=E6_USER or "<username>",
        password=E6_TOKEN or "<access_token>",
        database=database,
        catalog_name=catalog_name,
        cluster_name='production-cluster',  # or your cluster
        secure=True,
        auto_resume=True
    )

    cursor = conn.cursor(catalog_name=catalog_name, database=database)
    query_id = cursor.execute(query)
    # Fetch a limited number of records if needed
    records = cursor.fetchmany(10000)
    row_count = cursor.rowcount

    cursor.clear()
    cursor.close()
    conn.close()
    return row_count

if __name__ == "__main__":
    get_result_row_count(<QUERY>, <DATABASE>, <CATALOG_NAME>)

```

## Get Query Execution Plan

The following code runs a query and returns the execution plan generated for the query by the e6data engine.

The execution plan data will be returned as a JSON string and should be parsed as JSON data.

```python
def get_query_planner(query,database,catalog_name):
   connection = Connection(host= HOST_NAME,
               port = 80,
               username = E6_USER,
               database = database,
               password = E6_TOKEN)


   cursor = connection.cursor(catalog_name=catalog_name, db_name=database)
   query_id = cursor.execute(query)
   records = cursor.fetchmany(10000)
   query_planner=json.loads(cursor.explain_analyse())
   cursor.clear()
   cursor.close()
   connection.close()
   return query_planner

if __name__=="__main__":
   get_query_planner(<QUERY>,<DATABASE>,<CATALOG_NAME>)
```

## Abort a Query

The following code aborts a query, referenced by the input Query ID. Refer [Get Query ID](#get-query-id) for information on obtaining the Query ID.

```python
import json

def get_query_planner(query, database, catalog_name):
    conn = Connection(
        host=HOST_NAME or "<host>",
        port=443,
        username=E6_USER or "<username>",
        password=E6_TOKEN or "<access_token>",
        database=database,
        catalog_name=catalog_name,
        cluster_name='production-cluster',  # or your cluster
        secure=True,
        auto_resume=True
    )

    cursor = conn.cursor(catalog_name=catalog_name, database=database)
    query_id = cursor.execute(query)
    # Fetch a limited number of records if needed
    records = cursor.fetchmany(10000)
    explain_response = cursor.explain_analyse()
    query_planner = json.loads(explain_response.get('planner'))

    cursor.clear()
    cursor.close()
    conn.close()
    return query_planner

if __name__ == "__main__":
    get_query_planner(<QUERY>, <DATABASE>, <CATALOG_NAME>)

```

## Get Query Time Metrics

The following code runs a query and returns:

* amount of time taken to execute the query in seconds
* amount of time the query spent in the queue awaiting execution in seconds
* amount of time taken to parse the query in seconds

```python
execution_time = query_planner.get("total_query_time") / 1000  # convert to seconds
queue_time = query_planner.get("executionQueueingTime") / 1000
parsing_time = query_planner.get("parsingTime") / 1000
row_count = cursor.rowcount
```

## Get Query ID

This code executes a query and returns the query ID.

```python
from e6data_python_connector import Connection
import os

HOST_NAME = os.getenv("HOST_NAME")
E6_USER = os.getenv("E6_USER")
E6_TOKEN = os.getenv("E6_TOKEN")

def get_query_id(query, database, catalog_name):
    conn = Connection(
        host=HOST_NAME or "<host>",
        port=443,
        username=E6_USER or "<username>",
        password=E6_TOKEN or "<access_token>",
        database=database,
        catalog=catalog_name,
        secure=True,
        auto_resume=True
    )

    cursor = conn.cursor(catalog_name=catalog_name, database=database)
    query_id = cursor.execute(query)
    all_records = cursor.fetchall()
    
    cursor.clear()
    cursor.close()
    conn.close()
    
    return query_id

if __name__ == "__main__":
    get_query_id("<QUERY>", "<DATABASE>", "<CATALOG_NAME>")

```

## Combine Multiple Functions

The following code is an example using multiple functions defined above to return:

* Number of rows
* The time taken to execute the query
* Query results

```python
cursor = conn.cursor(catalog_name=catalog_name)
query_id = cursor.execute(sql_query)
all_records = cursor.fetchall()
explain_response = cursor.explain_analyse()
planner_result = json.loads(explain_response.get('planner'))
execution_time = planner_result.get("total_query_time") / 1000
row_count = cursor.rowcount

columns = [col[0] for col in cursor.description]
results = [dict(zip(columns, row)) for row in all_records]

for row in results:
    print(row)

print(f"Total row count {row_count}, Execution Time (seconds): {execution_time}")

cursor.clear()
cursor.close()
conn.close()

```

## Auto Resume

To enable Auto Resume while establishing a connection using the Python Connector, set auto\_resume=True in the connection configuration. This ensures that the cluster resumes automatically when a query is executed, eliminating the need for manual intervention.

{% hint style="info" %}
**Note:** Auto Resume must also be enabled at the cluster level to take effect.
{% endhint %}

```
conn = Connection(
    host=host,
    port=443,
    username=username,
    password=password,
    database=database,
    cluster_name='production-cluster',
    secure=True,
    auto_resume=True
)

```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.e6data.com/product-documentation/connectors-and-drivers/python-connector/code-samples.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
